NeilBrown [Sat, 3 Oct 2020 10:49:53 +0000 (20:49 +1000)]
New release: v1.3
- performance improvements when comparing large files
- libwiggle.a can be built for other tools to use
- Makefile is quieter and can put binaries elsewhere
- fix a bug that corrupted the format of output in some cases.
NeilBrown [Sat, 3 Oct 2020 10:05:57 +0000 (20:05 +1000)]
Split non-static code out of wiggle.c
wiggle.c contains some code that other files reference.
Now that those other files can live in a library (libwiggle),
move that code into a new file - util.c - so it is
available in the library.
NeilBrown [Sat, 3 Oct 2020 09:30:34 +0000 (19:30 +1000)]
merge: fix a problem with unmatchable hunks.
If the patch contains hunks that cannot be matched, wiggle cannot do
anything really useful, but sometimes it does something bad and produces
a badly formatted result.
In these cases, the hunk must get placed somewhere, and it is possible
that two hunks could line up with different places in the same line of
the original.
As wiggle normally expands conflicts to cover whole lines, to display
this coherently, you would need something like
<<<
chosen line
|||
hunk1- before
hunk2- before
===
hunk1- after
hunk2- after
>>>
But wiggle assumes that a hunk header marks the end of a conflict, so
bad things happen here.
I don't want to display the hunk header, so I think the best option is
<<<
chosen line
|||
hunk1- before
===
hunk1- after
>>>
<<<
|||
hunk2- before
===
hunk2- after
>>>
To achieve this, we don't stop collecting a conflict when we hit a
hunk-header moving forward, but when we print the conflict out, we
detect those hunk headers, and do the above.
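The output layout described above can be sketched as follows. This is only an illustration in Python, not wiggle's C code: the function name format_conflict and the (before, after) tuple layout are hypothetical, and the markers are abbreviated as in the message.

```python
def format_conflict(chosen, hunks):
    """Render one collected conflict as one block per hunk.

    chosen: lines from the original file covered by the conflict.
    hunks:  list of (before_lines, after_lines), one entry per patch hunk
            that landed in this conflict (hypothetical layout).
    """
    out = []
    for index, (before, after) in enumerate(hunks):
        out.append("<<<")
        if index == 0:
            # Only the first block carries the chosen original text.
            out.extend(chosen)
        out.append("|||")
        out.extend(before)
        out.append("===")
        out.extend(after)
        out.append(">>>")
    return out


lines = format_conflict(
    ["chosen line"],
    [(["hunk1- before"], ["hunk1- after"]),
     (["hunk2- before"], ["hunk2- after"])],
)
print("\n".join(lines))
```

Each hunk header inside the collected conflict closes one block and opens the next, so the header itself is never printed.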
If the final hunk doesn't actually contain any differences, don't
bother printing it.
If find_common takes more than 20msec to find the minimum edit distance,
take the next snake and assume it is good enough - and find sub-paths
before and after it. This gets a result quickly, but maybe not the
best result.
When we find a long snake, we significantly reduce the worst-case
result (which is "no more snakes ever"). This allows us to trim the
end-points of the search-front. If the endpoints have a best-case that
is worse than the worst-case when using the new snake, there is no point
pursuing them.
Previously we only trimmed the ends one step for each step forward.
This is unnecessarily cautious. It is better to keep trimming until
the best-case at the end reaches the worst-case.
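A rough sketch of the trimming rule, assuming each point on the search front carries a best-case cost (lower is better) and that the new snake gives a worst-case cost for the whole search. The name trim_front and the list-of-costs representation are hypothetical, not taken from wiggle.

```python
def trim_front(best_case_costs, worst_case_cost):
    """Return (lo, hi) so that best_case_costs[lo:hi] is the trimmed front.

    An endpoint is dropped when even its best case is worse (higher cost)
    than the worst case already guaranteed by the new snake. Trimming
    continues until the endpoint's best case reaches the worst case,
    rather than stopping after a single step.
    """
    lo, hi = 0, len(best_case_costs)
    while lo < hi and best_case_costs[lo] > worst_case_cost:
        lo += 1
    while hi > lo and best_case_costs[hi - 1] > worst_case_cost:
        hi -= 1
    return lo, hi


lo, hi = trim_front([9, 7, 3, 2, 4, 8, 10], 5)
print(lo, hi)
```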
The csl buff is currently allocated after the first pass, and filled in
during the second recursively-subdivided pass.
This means the two passes need to produce exactly the same result,
which makes it hard to introduce heuristics to cut corners on
big searches.
So change to allocating incrementally (in powers of 2) as needed.
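As an illustration of allocating in powers of 2, here is a minimal Python sketch of a buffer that doubles its capacity whenever it fills up. The class name GrowBuf and its fields are hypothetical; the real csl buffer is a C array.

```python
class GrowBuf:
    """Append-only buffer whose capacity grows 1, 2, 4, 8, ... as needed."""

    def __init__(self):
        self.cap = 0      # slots currently allocated
        self.len = 0      # slots currently in use
        self.items = []

    def append(self, item):
        if self.len == self.cap:
            # Out of room: double the capacity (starting from 1).
            new_cap = 1 if self.cap == 0 else self.cap * 2
            self.items.extend([None] * (new_cap - self.cap))
            self.cap = new_cap
        self.items[self.len] = item
        self.len += 1


buf = GrowBuf()
for snake in range(5):
    buf.append(snake)
print(buf.len, buf.cap)
```

Because entries are appended as they are found, the result no longer depends on a first pass having counted them exactly.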
NeilBrown [Sat, 29 Aug 2020 08:15:52 +0000 (18:15 +1000)]
Introduce --non-space option
This can significantly reduce the number of words by treating
punctuation as part of the surrounding word, rather than as single-char
words.
Fewer words can mean much faster comparisons.
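To show why the word count drops, here is a small Python sketch. It assumes the default splitting treats runs of alphanumerics, runs of blanks, and each other single character as separate words, as the message implies; the two regular expressions are illustrative and not wiggle's actual splitter.

```python
import re

# Assumed default: alphanumeric runs, blank runs, or any other single char.
DEFAULT_WORD = re.compile(r"[A-Za-z0-9_]+|[ \t]+|.", re.DOTALL)
# Assumed --non-space behaviour: every run of non-space chars is one word.
NON_SPACE_WORD = re.compile(r"\S+|\s+")


def split_words(text, non_space=False):
    pattern = NON_SPACE_WORD if non_space else DEFAULT_WORD
    return pattern.findall(text)


sample = "foo(bar, baz);"
default_words = split_words(sample)
non_space_words = split_words(sample, non_space=True)
print(len(default_words), len(non_space_words))
```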
NeilBrown [Sat, 29 Aug 2020 07:06:27 +0000 (17:06 +1000)]
diff: filter out unmatchable lines before comparing.
If either file has a run of 2 or more consecutive lines, none of which
appear in the other file, then reduce the run down to a single line.
This does not materially change the set of common-sub-lists that are
calculated, but it can make the calculation much faster when there are a
lot of unmatchable lines.
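The filtering step can be sketched like this in Python. The function name collapse_unmatchable is hypothetical, and keeping the first line of each run is an assumption; the message only says the run is reduced to a single line.

```python
def collapse_unmatchable(lines, other):
    """Reduce each run of lines that never occur in `other` to one line."""
    other_set = set(other)
    out = []
    in_run = False  # True while inside a run of unmatchable lines
    for line in lines:
        if line in other_set:
            out.append(line)
            in_run = False
        elif not in_run:
            # First unmatchable line of a run: keep it as a placeholder.
            out.append(line)
            in_run = True
        # Later unmatchable lines in the same run are dropped.
    return out


reduced = collapse_unmatchable(
    ["a", "x", "y", "z", "b", "w", "c"],
    ["a", "b", "c"],
)
print(reduced)
```

The surviving placeholder keeps the matchable lines on either side from becoming adjacent, which is why the set of common sub-lists is not materially changed.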
NeilBrown [Fri, 27 Dec 2019 04:58:27 +0000 (15:58 +1100)]
merge2: guard against over-flowing elcnt
It should be impossible to overflow elcnt in these
cases, but I have seen it happen.
There must be a bug somewhere else, but for now, just prevent the
crash.
NeilBrown [Fri, 27 Dec 2019 04:31:48 +0000 (15:31 +1100)]
vpatch: call sort_patches() before main_window().
sort_patches() can reallocate the patch list array.
So after main_window is called (which calls sort_patches())
the patches array might have changed. We currently call
plist_free() on the old patch list, which can crash.
So instead, call sort_patches() before calling main_window(),
then call plist_free() afterwards, on the patch list
that sort_patches() returned.
This avoids the crash.
NeilBrown [Sat, 3 Aug 2019 02:11:56 +0000 (12:11 +1000)]
extract: allow blank lines in unified diffs.
When a unified diff reports that both files have a blank line,
it shows this as a line containing just a space.
Sometimes that space can go missing (spaces at the end of a line
are like that).
So if we find a completely empty line, treat it like a line
containing just a space.
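A minimal sketch of that rule in Python, assuming hunk body lines arrive with the trailing newline already removed. The function name classify_hunk_line is hypothetical.

```python
def classify_hunk_line(line):
    """Return (kind, text) for one unified-diff hunk body line.

    kind is ' ' (context), '+' (added) or '-' (removed).
    """
    if line == "":
        # The lone space of a blank context line was stripped somewhere;
        # treat the empty line as if the space were still there.
        line = " "
    kind = line[0]
    if kind not in " +-":
        raise ValueError("not a hunk body line: %r" % line)
    return kind, line[1:]


print(classify_hunk_line(""))
print(classify_hunk_line("+added text"))
```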
Some compilers complain that this might be used
uninitialised. They are wrong as it is only used when 'found' is
non-zero, and it is always set before found is set, but
as I like -Werror, I need to handle bad warnings too.
NeilBrown [Wed, 16 Oct 2013 01:58:51 +0000 (12:58 +1100)]
Makefile: make it easy to suppress the "-D" flag to "install".
Some versions of "install" do not support '-D', and it isn't needed
when installing to default location.
So allow
make INSTALL=install
to suppress the -D
Reported-by: Christian Sonne
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Tue, 27 Aug 2013 23:35:14 +0000 (09:35 +1000)]
Makefile: suppress error messages from 'git'.
If you have a clone of the 'wiggle' git tree but with no
tags, then "git describe HEAD" will complain
fatal: No names found, cannot describe anything.
As it produces no output, the compiled in default will be used
so this is just an unnecessary message. So send it to /dev/null.
Reported-by: Stephen Cameron @ G+
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Fri, 23 Aug 2013 04:24:14 +0000 (14:24 +1000)]
Hack to improve view of ignore-blank conflicts.
cd tests/contrib/abstract
../../../wiggle -Bb orig new new2
select the "central" and type 'x' so it disappears and note
that the only remaining difference is that "computational"
has been deleted.
Without this patch you only see one '-' line and no '+' line.
However the result isn't perfect as
./wiggle -Bp demo.patch
visit the README file
page down to where "You can use 'o' ..." was added.
And note that there is a '-' blank line and a '+' blank line.
These are unwanted and added by this patch.
NeilBrown [Wed, 21 Aug 2013 01:40:59 +0000 (11:40 +1000)]
Preserve per-hunk "comment" field.
Each diff/patch hunk can have a comment, often a function
name extracted with "diff -p".
Preserve that comment and display it when we print hunk headers,
particularly in the browser.
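For reference, the comment is the text that follows the second "@@" in a unified-diff hunk header. A small Python sketch of pulling it out; the regular expression and the returned dictionary layout are illustrative, not wiggle's parser.

```python
import re

# "@@ -start[,len] +start[,len] @@ optional comment"
HUNK_RE = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@ ?(.*)$")


def parse_hunk_header(line):
    """Return the ranges and trailing comment of a hunk header, or None."""
    match = HUNK_RE.match(line)
    if match is None:
        return None
    old_start, old_len, new_start, new_len, comment = match.groups()
    return {
        "old_start": int(old_start),
        # A missing length means the range covers exactly one line.
        "old_len": int(old_len) if old_len is not None else 1,
        "new_start": int(new_start),
        "new_len": int(new_len) if new_len is not None else 1,
        "comment": comment,
    }


header = parse_hunk_header("@@ -10,7 +10,8 @@ static int foo(void)")
print(header["comment"])
```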