Optimizations to the dictionary comparison strategy #51

Merged
ESultanik merged 12 commits into master from dict-comparison-optimizations
May 16, 2022


ESultanik
Collaborator

Take these two JSON files as an example:

    $ cat f1.json
    {
        "foo": [1, 2, 3],
        "oof": [1, "two", 3]
    }
    $ cat f2.json
    {
        "bar": [1, 2, 3],
        "foo": [1, "two", 3]
    }

By default, Graphtage used to try all possible matchings between dictionary key/value pairs. Running `graphtage f1.json f2.json` would result in the "foo" key being replaced by "bar" and the "f" in the "oof" key being moved to the front of the string.
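To make the old default concrete, here is a minimal sketch of "try every pairing and keep the cheapest". It is not Graphtage's implementation: the function names and both cost functions are assumptions made up for illustration (Levenshtein distance for keys, and a changed list element counted as one removal plus one insertion). It also enumerates pairings by brute force, which is factorial in the number of keys, rather than using a polynomial-time matching.

```python
from itertools import permutations


def key_distance(a: str, b: str) -> int:
    """Levenshtein distance between two keys (toy stand-in for a real edit cost)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute, or keep if equal
            ))
        prev = curr
    return prev[-1]


def value_distance(x: list, y: list) -> int:
    """Toy cost: a differing position counts as one removal plus one insertion."""
    return 2 * sum(1 for p, q in zip(x, y) if p != q) + abs(len(x) - len(y))


def best_matching(d1: dict, d2: dict) -> list:
    """Try every pairing of keys (assumes equal-sized dicts) and keep the cheapest."""
    keys1 = list(d1)
    best, best_cost = None, None
    for perm in permutations(d2):
        cost = sum(
            key_distance(k1, k2) + value_distance(d1[k1], d2[k2])
            for k1, k2 in zip(keys1, perm)
        )
        if best_cost is None or cost < best_cost:
            best, best_cost = list(zip(keys1, perm)), cost
    return best


f1 = {"foo": [1, 2, 3], "oof": [1, "two", 3]}
f2 = {"bar": [1, 2, 3], "foo": [1, "two", 3]}
print(best_matching(f1, f2))  # [('foo', 'bar'), ('oof', 'foo')]
```

Under these toy costs, pairing "foo" with "bar" and "oof" with "foo" is cheapest because both values are then identical and only the keys need editing.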

This sort of matching is polynomial time in the size of the input, but is often still intractable for large files. Therefore, Graphtage had an option, `--no-key-edits` or `-k`, that would prevent two dictionary key/value pairs from being compared to each other unless their keys were identical. `graphtage -k f1.json f2.json` would have resulted in the `2` being replaced by `"two"`, the entire "oof" key/value pair being removed, and the entire "bar" key/value pair being added.
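The identical-keys-only behavior can be sketched in a few lines. This is an illustration of the idea, not Graphtage's code; `match_identical_keys` is a hypothetical helper name.

```python
def match_identical_keys(d1: dict, d2: dict):
    """Pair entries only when their keys are identical.

    Returns (matched, removed, added) key lists. Values of matched keys
    would then be diffed against each other; everything else is treated
    as a whole-pair removal or insertion.
    """
    matched = [k for k in d1 if k in d2]
    removed = [k for k in d1 if k not in d2]
    added = [k for k in d2 if k not in d1]
    return matched, removed, added


f1 = {"foo": [1, 2, 3], "oof": [1, "two", 3]}
f2 = {"bar": [1, 2, 3], "foo": [1, "two", 3]}
print(match_identical_keys(f1, f2))  # (['foo'], ['oof'], ['bar'])
```

No pairing search is needed here, which is why this mode scales to large files at the price of coarser diffs.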

This PR…

  1. generalizes these two options with a new `--dict-strategy`/`-ds` option, which sets the strategy: `match` for the old default behavior and `none` for the old `--no-key-edits` behavior. The `--no-key-edits` option still exists, but is now an alias for `--dict-strategy none`.
  2. adds a new strategy, `--dict-strategy auto`, which is now the default. It behaves exactly the same as the `match` strategy, except that any two key/value pairs with the exact same key are automatically matched.

`graphtage --dict-strategy auto f1.json f2.json` will now result in `2` being replaced with `"two"`, `oof` being replaced by `bar`, and `"two"` being replaced by `2`.
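A rough sketch of the `auto` idea: pair identical keys first, then search for a matching only among the leftovers. The names `auto_matching` and `toy_key_cost` are hypothetical, the cost function is a deliberately crude placeholder, and the brute-force search over leftovers stands in for whatever matching Graphtage actually performs.

```python
from itertools import permutations


def toy_key_cost(a: str, b: str) -> int:
    """Crude placeholder cost: mismatched positions plus the length difference."""
    return sum(x != y for x, y in zip(a, b)) + abs(len(a) - len(b))


def auto_matching(d1: dict, d2: dict, cost=toy_key_cost) -> list:
    """Match identical keys up front, then pair the remaining keys by cost.

    Assumes the same number of leftover keys on each side.
    """
    exact = [(k, k) for k in d1 if k in d2]
    rest1 = [k for k in d1 if k not in d2]
    rest2 = [k for k in d2 if k not in d1]
    # Only the leftovers take part in the expensive search.
    best = min(
        permutations(rest2),
        key=lambda perm: sum(cost(a, b) for a, b in zip(rest1, perm)),
    )
    return exact + list(zip(rest1, best))


f1 = {"foo": [1, 2, 3], "oof": [1, "two", 3]}
f2 = {"bar": [1, 2, 3], "foo": [1, "two", 3]}
print(auto_matching(f1, f2))  # [('foo', 'foo'), ('oof', 'bar')]
```

Because "foo" is matched to "foo" before any search happens, the remaining search space shrinks to "oof" versus "bar", which mirrors the result described above.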

ESultanik self-assigned this on May 12, 2022
ESultanik added the enhancement (New feature or request) label on May 12, 2022
ESultanik merged commit 73639f6 into master on May 16, 2022
ESultanik deleted the dict-comparison-optimizations branch on May 16, 2022 16:32
