Description
Since version 4.10 of the library, diffs are recognizing less similarity between texts.
In the attached program, under version 4.9, the library correctly recognizes that there is only a 5-character difference between the texts (3 letters + 2 whitespace characters). Under version 4.10, the library reports that a large block of identical text has been deleted and then added.
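The 5-character figure can be checked with a small standalone sketch. It does not use the diff library. The two input strings are an assumption, reconstructed from the outputs shown in this report (the attached program is not reproduced here), and the class and method names are hypothetical.

```java
public class CharDiffCount {

    // Counts the positions at which two equal-length strings differ.
    // This is a plain position-by-position comparison, not a diff algorithm.
    static int countDifferingPositions(String a, String b) {
        if (a.length() != b.length()) {
            throw new IllegalArgumentException("strings must have equal length");
        }
        int count = 0;
        for (int i = 0; i < a.length(); i++) {
            if (a.charAt(i) != b.charAt(i)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Assumed inputs: the old text is one line, the new text breaks it
        // into three lines and replaces "abc" with "xyz".
        String oldText = "A man named Frankenstein abc to Switzerland for cookies!";
        String newText = "A man named Frankenstein\nxyz\nto Switzerland for cookies!";

        // 3 letters (abc -> xyz) + 2 whitespace characters (space -> newline)
        System.out.println(countDifferingPositions(oldText, newText)); // prints 5
    }
}
```

Under these assumed inputs, everything outside those five positions is identical, which is what the 4.9 output reflects.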
4.9 output:
| Line | Tag | oldLine | newLine |
|---|---|---|---|
| 1 | EQUAL | apple1 | apple1 |
| 2 | EQUAL | apple2 | apple2 |
| 3 | EQUAL | apple3 | apple3 |
| 4 | CHANGE | A man named Frankenstein==oldCHANGE==> abc <==old==to Switzerland for cookies! | A man named Frankenstein |
| 5 | CHANGE | | ==newCHANGE==>xyz<==new== |
| 6 | CHANGE | | to Switzerland for cookies! |
| 7 | EQUAL | banana1 | banana1 |
| 8 | EQUAL | banana2 | banana2 |
| 9 | EQUAL | banana3 | banana3 |

4.10 output:
| Line | Tag | oldLine | newLine |
|---|---|---|---|
| 1 | EQUAL | apple1 | apple1 |
| 2 | EQUAL | apple2 | apple2 |
| 3 | EQUAL | apple3 | apple3 |
| 4 | CHANGE | A man named Frankenstein==oldDELETE==> abc to Switzerland for cookies!<==old== | A man named Frankenstein |
| 5 | INSERT | | ==newINSERT==>xyz<==new== |
| 6 | INSERT | | ==newINSERT==>to Switzerland for cookies!<==new== |
| 7 | EQUAL | banana1 | banana1 |
| 8 | EQUAL | banana2 | banana2 |
| 9 | EQUAL | banana3 | banana3 |

Admittedly, some aspects of the 4.10 output are an improvement over the 4.9 output. For example, line 6 of the 4.9 output is marked as a CHANGE even though it has no oldLine text and no marked changes in the newLine text, which can be confusing. However, the loss of accuracy in 4.10 is far less desirable. In 4.9, the line 6 difference is indeed a change, but it almost seems that a new tag is needed to indicate a group (?) change, making it clear that the change is a continuation of the line 4 difference.