Line normalization before diff #219

Open
@epictecch

Description

The `DiffRowGenerator` class offers the `lineNormalizer` property. By default, it is used to replace `<` and `>` by their escaped versions `&lt;` and `&gt;`.

The `lineNormalizer` is applied to the input texts before the diff is calculated. While I see this as a useful feature, with the default settings it might be surprising that the resulting text may no longer be properly HTML-escaped:

    final var generator = DiffRowGenerator.create()
            .mergeOriginalRevised(true)
            .showInlineDiffs(true)
            .inlineDiffByWord(true)
            .build();
    final var rows = generator.generateDiffRows(
            List.of("hello <world>"),
            List.of("bye >world<"));
    final var resultingText = rows.stream()
            .map(DiffRow::getOldLine)
            .collect(Collectors.joining(StringUtils.LF));

The resulting text is

<span>hello</span><span>bye</span> &<span>lt</span><span>gt</span>;world&<span>gt</span><span>lt</span>;

Note that `&` is treated as an unchanged part of the text because both replacements, `&lt;` and `&gt;`, start with an ampersand. The resulting text is therefore no longer valid HTML.
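To make the shared ampersand visible without the library, here is a minimal sketch that escapes both inputs first and then splits them into word tokens. The class `EscapedTokenDemo`, its methods, and the `SEPARATORS` pattern are hypothetical; the pattern is a simplified stand-in that treats `&` and `;` as separators, not the library's actual `SPLIT_BY_WORD_PATTERN`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EscapedTokenDemo {
    // Assumption: simplified separator set (whitespace, '<', '>', ';', '&').
    static final Pattern SEPARATORS = Pattern.compile("\\s+|[<>;&]+");

    // Escapes only '<' and '>', as the default normalizer is described in this issue.
    static String escape(String line) {
        return line.replace("<", "&lt;").replace(">", "&gt;");
    }

    // Splits a line into words and separator runs, keeping the separators as tokens.
    static List<String> splitByWord(String line) {
        List<String> tokens = new ArrayList<>();
        Matcher m = SEPARATORS.matcher(line);
        int pos = 0;
        while (m.find()) {
            if (pos < m.start()) {
                tokens.add(line.substring(pos, m.start()));
            }
            tokens.add(m.group());
            pos = m.end();
        }
        if (pos < line.length()) {
            tokens.add(line.substring(pos));
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Normalization happens before tokenizing, mirroring the order described above.
        System.out.println(splitByWord(escape("hello <world>")));
        System.out.println(splitByWord(escape("bye >world<")));
    }
}
```

In both token lists the `&` and `;` tokens are identical and only `lt` and `gt` differ, so a word-level diff under these assumptions marks just the middle of each entity as changed.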

In order for this behaviour to be a problem, the following conditions must all be true:

  1. `inlineDiffByWord` must be used
  2. The default `lineNormalizer` must be used
  3. The two provided texts must differ at a position which starts with a character that is replaced by the `lineNormalizer`
  4. A release >= 4.15 must be used.

Workaround
Override the `lineNormalizer`, e.g., by using the `SPLIT_BY_WORD_PATTERN` of release 4.12, in which the ampersand was not considered a character that splits words.
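One way to picture the effect of such a workaround: a splitter whose pattern leaves `&` and `;` alone keeps every escaped entity inside a single token. The sketch below assumes a hypothetical pattern (whitespace and a few brackets and punctuation marks); it is not the actual constant from release 4.12, and the class `AmpersandSafeSplitter` is made up for illustration. How such a splitter is wired into `DiffRowGenerator` depends on the release and is not shown.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AmpersandSafeSplitter {
    // Hypothetical separator set: '&' and ';' are deliberately NOT included.
    static final Pattern SEPARATORS = Pattern.compile("\\s+|[,.()\\[\\]{}]+");

    // Splits a line into words and separator runs, keeping the separators as tokens.
    static List<String> split(String line) {
        List<String> tokens = new ArrayList<>();
        Matcher m = SEPARATORS.matcher(line);
        int pos = 0;
        while (m.find()) {
            if (pos < m.start()) {
                tokens.add(line.substring(pos, m.start()));
            }
            tokens.add(m.group());
            pos = m.end();
        }
        if (pos < line.length()) {
            tokens.add(line.substring(pos));
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Input is already escaped, as it would be after the default normalizer ran.
        System.out.println(split("hello &lt;world&gt;"));
        System.out.println(split("bye &gt;world&lt;"));
    }
}
```

With this pattern, `&lt;world&gt;` and `&gt;world&lt;` each stay one token, so a word-level diff would replace the whole token and never cut an entity in half.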

Solution approaches
IMHO, the `SPLIT_BY_WORD_PATTERN` of release 4.15+ is fine and I do not consider it to be the problem.

The library could offer one of the following features:

  1. a parameter which defines when the `lineNormalizer` should be applied (before diff-ing or after)
  2. a second type of line-normalizer that is applied after diff-ing
  3. an option to have the library apply the `processDiffs` function to non-diffs as well
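The idea behind the first two options can be sketched independently of the library: diff the raw tokens first, then escape each token only when rendering. The code below is an illustration under stated assumptions, not the library's implementation. The class `EscapeAfterDiffSketch`, the plain LCS-based diff, and the `<del>` markup are all hypothetical choices made for this example.

```java
import java.util.List;

class EscapeAfterDiffSketch {
    // Escaping is applied per token, after the diff has decided what changed.
    static String escape(String token) {
        return token.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // Renders the original line; tokens absent from the revised line are wrapped in <del>.
    static String renderOld(List<String> original, List<String> revised) {
        int n = original.size(), m = revised.size();
        // lcs[i][j] = length of the longest common subsequence of original[i..] and revised[j..]
        int[][] lcs = new int[n + 1][m + 1];
        for (int i = n - 1; i >= 0; i--) {
            for (int j = m - 1; j >= 0; j--) {
                lcs[i][j] = original.get(i).equals(revised.get(j))
                        ? lcs[i + 1][j + 1] + 1
                        : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
            }
        }
        StringBuilder out = new StringBuilder();
        int i = 0, j = 0;
        while (i < n) {
            if (j < m && original.get(i).equals(revised.get(j))) {
                out.append(escape(original.get(i))); // unchanged token
                i++;
                j++;
            } else if (j < m && lcs[i][j + 1] >= lcs[i + 1][j]) {
                j++; // token exists only in the revised line; not part of the old-line view
            } else {
                out.append("<del>").append(escape(original.get(i))).append("</del>");
                i++;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Raw, unescaped tokens go into the diff.
        System.out.println(renderOld(
                List.of("a", " ", "<", " ", "b"),
                List.of("a", " ", ">", " ", "b")));
        // prints: a <del>&lt;</del> b
    }
}
```

Because the diff never sees `&lt;` or `&gt;`, an entity cannot be split by a change marker; each one is produced whole at render time.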

