8
\$\begingroup\$

I've taken a crack at implementingSimple Diff in Rebol (versions 2 and 3). Simple Diff works by finding the longest common sequence in two series, then recursively applies itself either side of this sequence until there is nothing further in common. It returns a block containing additions (+), omissions (-) and matches (=). Example:

>> diff "The kuick brown fix." "The quick brown fox."== [= "The " - "k" + "q" = "uick brown f" - "i" + "o" = "x."]

Originally this function was designed to work only on blocks, but since comparisons are done using the series API, I've permitted the use of strings and it seems to work.

I'm looking for feedback on:

  • The algorithm: is there a more efficient way to build the index (although the index is only as large as the number of unique components of the series), and to perform the comparisons?

  • The result: is the result block in a useful form?

  • Any bugs?

Note: the use of FUNC (not FUNCTION) is deliberate in development. This function is currently designed for use in both Rebol 2 and Rebol 3.

Simple Diff in Rebol

Rebol [    Title: "SimpleDiff"    Date: 22-May-2014    Author: "Christopher Ross-Gill"    Type: 'module    Name: 'simplediff    Exports: [diff]    Comment: {        Based on Simple Diff for Python, CoffeeScript v0.1        (C) Paul Butler 2008 <http://www.paulbutler.org/>        https://github.com/paulgb/simplediff    }]diff: func [    {        Find the differences between two blocks. Returns a block of pairs, where the first value        is in [+ - =] and represents an insertion, deletion, or no change for that list.        The second value of the pair is the element.    }    before [block! string!] after [block! string!]    /local items-before starts-before starts-after run this-run tests limit][    assert [equal? type? before type? after]    ; Build a block with elements from 'before as keys, and    ; each position starting with each element as values.    items-before: make block! 100 ; arbitrary block memory allocation    forall before [        append/only any [            select/case items-before first before            last repend items-before [first before make block! 100]        ] before    ]    ; Find the largest substring common to before and after    forall after [        if tests: select/case items-before first after [            limit: length? after            run: any [run 0]            foreach before tests [                repeat offset min limit length? before [                    this-run: :offset                    unless before/:offset == after/:offset [                        this-run: offset - 1                        break                    ]                ]                if this-run > run [                    run: :this-run                    starts-before: :before                    starts-after: :after                ]            ]        ]    ]    collect [        either run [            ; The common substring is considered to have no change, and we recurse            ; on the text before and after the substring            keep diff copy/part before starts-before copy/part after starts-after            keep reduce ['= copy/part starts-after run]            keep diff copy skip starts-before run copy skip starts-after run        ][            ; Otherwise if no common substring is found, assume that an            ; insert and delete has taken place            unless tail? before [keep reduce ['- before]]            unless tail? after [keep reduce ['+ after]]        ]    ]]
askedJun 17, 2014 at 18:33
rgchris's user avatar
\$\endgroup\$
3
  • \$\begingroup\$I noticediff [[=] [+] [-]] [[-] [+] [=]] produces[- [[=] [+] [-]] + [[-] [+] [=]]]. Offhand I can't tell if[- [[=] [+]] = [[-]] + [[+] [=]]] would be more useful...? It would seem to paralleldiff [= + -] [- + =] producing[- [= +] = [-] + [+ =]].(I guess this example would be less confusing if I didn't use +, -, and =...but I was just testing with them to see what happened. :-P)\$\endgroup\$CommentedJun 17, 2014 at 19:06
  • \$\begingroup\$@HostileFork Yikes!\$\endgroup\$CommentedJun 17, 2014 at 19:12
  • \$\begingroup\$@HostileFork I'd need to look to figure out why it's different—I'm not certain it should be.\$\endgroup\$CommentedJun 17, 2014 at 19:17

0

You mustlog in to answer this question.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.