Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

R wrapper for Google's diff-match-patch library

License

NotificationsYou must be signed in to change notification settings

rundel/diffmatchpatch

Repository files navigation

R-CMD-checkCRAN status

diffmatchpatch provides an R wrapper for theSTLvariant ofGoogle’sdiff-match-patchlibrary.

Installation

Currently only the development version of this library is available andcan be installed directly fromGitHub with:

# install.packages("devtools")devtools::install_github("rundel/diffmatchpatch")

Examples

All of the following examples are derived from Neil Fraser’s demoslinked from Google’sdiffmatchpatchrepo:diff,match,patch.

These examples are all run using the default options provided by thelibrary.

library(diffmatchpatch)dmp_options()#> $diff_timeout#> [1] 1#>#> $diff_edit_cost#> [1] 4#>#> $match_threshold#> [1] 0.5#>#> $match_distance#> [1] 1000#>#> $patch_delete_threshold#> [1] 0.5#>#> $patch_margin#> [1] 4#>#> $match_max_bits#> [1] 32

diff

These functions compare two text strings and efficiently return a listof (character level) differences. Note that the following text indicatesdeletions with-|text|- and additions with+|text|+ - withinenvironments that support ANSI escape strings (e.g. RStudio) these willbe colored red and green respectively.

src="I am the very model of a modern Major-General,I've information vegetable, animal, and mineral,I know the kings of England, and I quote the fights historical,From Marathon to Waterloo, in order categorical."dest="I am the very model of a cartoon individual,My animation's comical, unusual, and whimsical,I'm quite adept at funny gags, comedic theory I have read,From wicked puns and stupid jokes to anvils that drop on your head."diff_make(src,dest)#> I am the very model of a -|modern Major-General,#> I've information vegetable, animal, and mineral,#> I know the kings of England, and I quote the fights historical,#> From Marathon to Waterloo, in order categorical|-+|cartoon individual,#> My animation's comical, unusual, and whimsical,#> I'm quite adept at funny gags, comedic theory I have read,#> From wicked puns and stupid jokes to anvils that drop on your head|+.
diff_make(src,dest,cleanup="efficiency")#> I am the very model of a -|modern Major-Gener|-+|cartoon individu|+al,#> -|I've infor|-+|My ani|+mation-| vegetable, anim|-+|'s comical, unusu|+al, and -|miner|-+|whimsic|+al,#> I-| know the kings of England, and I quote|-+|'m quite adept at funny gags, comedic|+ the-| fights historical|-+|ory I have read|+,#> From -|Marathon|-+|wicked puns and stupid jokes|+ to -|Waterloo, in order categorical|-+|anvils that drop on your head|+.
diff_make(src,dest,cleanup="none")#> I am the very model of a -|m|-+|carto|+o-|der|-n -|Major-Ge|-+|i|+n-|er|-+|dividu|+al,#> -|I've|-+|My|+ -|i|-+|a|+n-|for|-+|i|+mation+|'s|+ -|veget|-+|comic|+a-|b|-l-|e|-, -|a|-+|u|+n-|im|-+|usu|+al, and +|whi|+m+|s|+i-|ner|-+|c|+al,#> I+|'m|+ -|know|-+|quite|+ +|adep|+t-|he|- -|kings|-+|at|+ -|o|-f-| E|-+|u|+n+|ny |+g-|l|-a-|nd|-+|gs|+, -|and I qu|-+|c|+o-|t|-+|m|+e+|dic|+ the+|ory|+ -|fig|-+|I |+h-|ts|-+|ave|+ -|histo|-r-|ic|-+|e|+a-|l|-+|d|+,#> From -|Mar|-+|wicked puns |+a+|nd s|+t-|h|-+|upid j|+o-|n|-+|kes|+ to -|W|-+|anvils th|+at-|e|-+| d|+r-|l|-o+|p |+o-|, i|-n +|y|+o-|rde|-+|u|+r -|cat|-+|h|+e-|goric|-a-|l|-+|d|+.

There are a number of helper functions provided for interacting with theresulting diff objects.

d= diff_make(src,dest,cleanup="efficiency")diff_levenshtein(d)#> [1] 167 diff_to_patch(d) %>% cat()#> @@ -22,70 +22,65 @@#>  f a#> -modern Major-Gener#> +cartoon individu#>  al,%0A#> -I've infor#> +My ani#>  mation#> - vegetable, anim#> +'s comical, unusu#>  al, and#>  mine#> @@ -79,125 +79,141 @@#>  and#> -miner#> +whimsic#>  al,%0AI#> - know the kings of England, and I quote#> +'m quite adept at funny gags, comedic#>   the#> - fights historical#> +ory I have read#>  ,%0AFrom#> -Marathon#> +wicked puns and stupid jokes#>   to#> -Waterloo, in order categorical#> +anvils that drop on your head#>  .diff_to_html(d) %>% cat()#> <span>I am the very model of a </span><del>modern Major-Gener</del><ins>cartoon individu</ins><span>al,&para;<br></span><del>I've infor</del><ins>My ani</ins><span>mation</span><del> vegetable, anim</del><ins>'s comical, unusu</ins><span>al, and </span><del>miner</del><ins>whimsic</ins><span>al,&para;<br>I</span><del> know the kings of England, and I quote</del><ins>'m quite adept at funny gags, comedic</ins><span> the</span><del> fights historical</del><ins>ory I have read</ins><span>,&para;<br>From </span><del>Marathon</del><ins>wicked puns and stupid jokes</ins><span> to </span><del>Waterloo, in order categorical</del><ins>anvils that drop on your head</ins><span>.</span>diff_text_source(d) %>% cat()#> I am the very model of a modern Major-General,#> I've information vegetable, animal, and mineral,#> I know the kings of England, and I quote the fights historical,#> From Marathon to Waterloo, in order categorical.diff_text_dest(d) %>% cat()#> I am the very model of a cartoon individual,#> My animation's comical, unusual, and whimsical,#> I'm quite adept at funny gags, comedic theory I have read,#> From wicked puns and stupid jokes to anvils that drop on your head.

match

match_find looks for a pattern within a larger text string using fuzzymatching. This means it can find a match even if the pattern containserrors and doesn’t exactly match what is found in the text. Candidatematches are scored based on a) the number of spelling differencesbetween the pattern and the text and b) the distance between thecandidate match and the expected location.

The returned value indicates the starting index (1-based indexing) ofthe matched text. The function returns -1 in cases where a match cannotbe found.

text="'Twas brillig, and the slithy tovesDid gyre and gimble in the wabe.All mimsy were the borogroves,And the mome raths outgrabe."match_find(text,pattern="slimy tools")#> [1] 24

There is an additionalloc argument that provides a suggested locationof where the pattern is expected to be found.

match_find(text,pattern="and",loc=0)#> [1] 16match_find(text,pattern="and",loc=30)#> [1] 16match_find(text,pattern="and",loc=50)#> [1] 46

Finally, the behavior of the fuzzy matching is controlled by thematch_distance andmatch_threshold options which can be set viadpm_options().

prev= dmp_options(match_threshold=0.3)match_find(text,pattern="slimy tools",loc=30)#> [1] -1dmp_options(prev)

patch

A patch is just a description of how to transform one text string intoanother (i.e. a fancy diff) and once constructed can be applied againsta new string (or the original). See thisarticle for moredetails on the specific formatting of a patch text string.

source="Hamlet: Do you see yonder cloud that's almost in shape of a camel?Polonius: By the mass, and 'tis like a camel, indeed.Hamlet: Methinks it is like a weasel.Polonius: It is backed like a weasel.Hamlet: Or like a whale?Polonius: Very like a whale.-- Shakespeare"dest="Hamlet: Do you see the cloud over there that's almost the shape of a camel?Polonius: By golly, it is like a camel, indeed.Hamlet: I think it looks like a weasel.Polonius: It is shaped like a weasel.Hamlet: Or like a whale?Polonius: It's totally like a whale.-- Shakespeare"(p= patch_make(source,dest))#> [1] "@@ -16,21 +16,29 @@\n see \n-yonder\n+the\n  cloud \n+over there \n that\n@@ -47,18 +47,19 @@\n  almost \n-in\n+the\n  shape o\n@@ -86,24 +86,18 @@\n  By \n-the mass, and 't\n+golly, it \n is l\n@@ -129,21 +129,23 @@\n et: \n-Me\n+I \n think\n-s\n  it \n-i\n+look\n s li\n@@ -177,12 +177,12 @@\n  is \n-back\n+shap\n ed l\n@@ -234,11 +234,19 @@\n us: \n-Ver\n+It's totall\n y li\n"
new_source="Kirk: Do you see yonder cloud that's almost in shape of a Klingon?Spock: By the mass, and 'tis like a Klingon, indeed.Kirk: Methinks it is like a Vulcan.Spock: It is backed like a Vulcan.Kirk: Or like a Romulan?Spock: Very like a Romulan.-- Trekkie"patch_apply(new_source,p) %>% cat()#> Kirk: Do you see the cloud over there that's almost the shape of a Klingon?#> Spock: By golly, it is like a Klingon, indeed.#> Kirk: I think it looks like a Vulcan.#> Spock: It is shaped like a Vulcan.#> Kirk: Or like a Romulan?#> Spock: It's totally like a Romulan.#> -- Trekkie

About

R wrapper for Google's diff-match-patch library

Resources

License

Stars

Watchers

Forks

Contributors2

  •  
  •  

Languages


[8]ページ先頭

©2009-2025 Movatter.jp