Rapid fuzzy string matching in Rust using various string metrics
rapidfuzz/rapidfuzz-rs
RapidFuzz is a general purpose string matching library with implementations for Rust, C++ and Python.
- Diverse String Metrics: Offers a variety of string metrics to suit different use cases. These range from the Levenshtein distance for edit-based comparisons to the Jaro-Winkler similarity for more nuanced similarity assessments.
- Optimized for Speed: The library is designed with performance in mind. Each implementation is carefully designed to ensure optimal performance, making it suitable for the analysis of large datasets.
- Easy to use: The API is designed to be simple to use, while still giving the implementation room for optimization.
The installation is as simple as:
```console
$ cargo add rapidfuzz
```

The following examples show the usage with the Levenshtein distance. Other metrics can be found in the `fuzz` and `distance` modules.
```rust
use rapidfuzz::distance::levenshtein;

// Perform a simple comparison using the Levenshtein distance
assert_eq!(
    3,
    levenshtein::distance("kitten".chars(), "sitting".chars())
);

// If you are sure the input strings are ASCII only it's usually faster to operate on bytes
assert_eq!(
    3,
    levenshtein::distance("kitten".bytes(), "sitting".bytes())
);

// You can provide a score_cutoff value to filter out strings with distance that is worse than
// the score_cutoff
assert_eq!(
    None,
    levenshtein::distance_with_args(
        "kitten".chars(),
        "sitting".chars(),
        &levenshtein::Args::default().score_cutoff(2)
    )
);

// You can provide a score_hint to tell the implementation about the expected score.
// This can be used to select a more performant implementation internally, but might cause
// a slowdown in cases where the distance is actually worse than the score_hint
assert_eq!(
    3,
    levenshtein::distance_with_args(
        "kitten".chars(),
        "sitting".chars(),
        &levenshtein::Args::default().score_hint(2)
    )
);

// When comparing a single string to multiple strings you can use the
// provided `BatchComparators`. These can cache part of the calculation
// which can provide significant speedups
let scorer = levenshtein::BatchComparator::new("kitten".chars());
assert_eq!(3, scorer.distance("sitting".chars()));
assert_eq!(0, scorer.distance("kitten".chars()));
```
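To make the numbers in the examples above concrete, here is a minimal, standard-library-only sketch of the textbook dynamic-programming Levenshtein distance. It is not how RapidFuzz computes the distance internally, and the function names `naive_levenshtein` and `naive_levenshtein_with_cutoff` are hypothetical helpers introduced only for illustration. The cutoff variant assumes that a distance equal to the cutoff is still accepted, which is consistent with the description above (only distances worse than the cutoff are filtered out).

```rust
/// Textbook Levenshtein distance over `char`s using two rolling rows.
/// Hypothetical helper for illustration; not the RapidFuzz implementation.
fn naive_levenshtein(s1: &str, s2: &str) -> usize {
    let a: Vec<char> = s1.chars().collect();
    let b: Vec<char> = s2.chars().collect();

    // prev[j] is the distance between the first i chars of `a`
    // and the first j chars of `b`, for the previously processed row i.
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    let mut curr: Vec<usize> = vec![0; b.len() + 1];

    for (i, ca) in a.iter().enumerate() {
        // Turning i + 1 chars of `a` into an empty string takes i + 1 deletions.
        curr[0] = i + 1;
        for (j, cb) in b.iter().enumerate() {
            let substitution = prev[j] + usize::from(ca != cb);
            let deletion = prev[j + 1] + 1;
            let insertion = curr[j] + 1;
            curr[j + 1] = substitution.min(deletion).min(insertion);
        }
        std::mem::swap(&mut prev, &mut curr);
    }

    // After the final swap, `prev` holds the last completed row.
    prev[b.len()]
}

/// Returns `None` when the distance is worse (larger) than `score_cutoff`.
/// Assumption: a distance equal to the cutoff is still returned.
fn naive_levenshtein_with_cutoff(s1: &str, s2: &str, score_cutoff: usize) -> Option<usize> {
    let distance = naive_levenshtein(s1, s2);
    (distance <= score_cutoff).then_some(distance)
}
```

With this sketch, `naive_levenshtein("kitten", "sitting")` evaluates to `3` (two substitutions and one insertion), and `naive_levenshtein_with_cutoff("kitten", "sitting", 2)` evaluates to `None`, mirroring the results asserted in the library examples. The sketch always fills the whole table, whereas a cutoff or hint gives an optimized implementation the chance to stop early or pick a cheaper algorithm.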
Licensed under either of Apache License, Version 2.0 or MIT License at your option.
Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in RapidFuzz by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.