Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up

Trie implementation in Elixir

License

NotificationsYou must be signed in to change notification settings

Rob-bie/retrieval

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

25 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Retrieval is a tradional trie implementation in pure Elixir that supports pattern based lookup and a varietyof other functionality. Documentation can be foundHERE.

Installation

Add Retrieval to yourmix.exs dependencies:

defdepsdo[{:retrieval,"~> 0.9.1"}]end

Usage

Creating a new trie

There's three ways to create a new trie struct. Provide zero arguments to create an empty trie,provide a binary key or list of binary keys to create a trie with nodes.

Retrieval.new%Retrieval.Trie{trie:...}Retrieval.new("example")%Retrieval.Trie{trie:...}Retrieval.new(~w/this is an example/)%Retrieval.Trie{trie:...}

For the rest of our examples we are going to assume we have created the following trie:

trie=Retrieval.new(~w/apple apply ape bed between betray cat noon hot                        warm winter boob smash crush under above people                        negative poison place out divide zebra extended/)
Insertion

Just like creating a new trie you can either provide a single binary key or a list of binary keys forinsertion.

Retrieval.insert(trie,"elephant")|>Retrieval.insert("banana")|>Retrieval.insert(~w/cheese orange alchemy/)
Key membership

Checking if a trie contains a binary key is quick and easy.

Retrieval.contains?(trie,"apple")trueRetrieval.contains?(trie,"extended")trueRetrieval.contains?(trie,"doodle")false
Prefix lookup

Looking up all binary keys that begin with a certain prefix is very common.

Retrieval.prefix(trie,"a")["above","ape","apple","apply"]Retrieval.prefix(trie,"app")["apple","apply"]Retrieval.prefix(trie,"be")["bed","betray","between"]Retrieval.prefix(trie,"th")[]
Pattern lookup

Retrieval supports a variety of patterns that can be arbitrarily combined. Erroneous patternsreturn a tuple of the form{:error, reason}.

Wildcard

The wildcard pattern* simply matches onany character.

# Three letter word# "*" - Matches any character# "p" - Matches "p"# "e" - Matches "e"Retrieval.pattern(trie,"*pe")["ape"]# Five letter word# "a" - Matches "a"# "*" - Matches any character# "*" - Matches any character# "*" - Matches any character# "y" - Matches "y"Retrieval.pattern(trie,"a***y")["apply"]# Three letter word# "*" - Matches any character# "*" - Matches any character# "*" - Matches any characterRetrieval.pattern(trie,"***")["ape","bed","cat","hot","out"]# "*" - Matches any character# "n" - Matches "n"# Two letter word that ends with "n"Retrieval.pattern(trie,"*n")[]
Inclusion group

The inclusion group pattern[...] matches on any characterinside of the brackets.

# Three letter word# "[ab]" - Matches "a" or "b"# "*"    - Matches any character# "*"    - Matches any characterRetrieval.pattern(trie,"[ab]**")["ape","bed"]# Three letter word# "[ab]" - Matches "a" or "b"# "*"    - Matches any character# "d"    - Matches "d"Retrieval.pattern(trie,"[ab]*d")["bed"]# Five letter word# "[abz]" - Matches "a", "b", or, "z"# "[eb]"  - Matches "e" or "b"# "*"     - Matches any character# "*"     - Matches any character# "[ea]"  - Matches "e" or "a"Retrieval.pattern(trie,"[abz][eb]**[ea]")["above","zebra"]# Four letter word# "[yxzo]" - Matches "y", "x", "z", or "o"# "*"      - Matches any character# "*"      - Matches any character# "*"      - Matches any characterRetrieval.pattern(trie,"[yxzo]***")[]# Oh no, a bad patternRetrieval.pattern(trie,"[abc][ezf"){:error,"Dangling group (inclusion) starting at column 6, expecting ]"}
Exclusion group

The exclusion group pattern[^...] matches on any characternot inside of the brackets.

# Five letter word# "[abz]" - Matches "a", "b", or, "z"# "[^eb]" - Matches any character but "e" or "b"# "*"     - Matches any character# "*"     - Matches any character# "[ea]"  - Matches "e" or "a"Retrieval.pattern(trie,"[abz][^eb]**[ea]")["apple"]# Six letter word# "[pd]"  - Matches "p" or "d"# "*"     - Matches any character# "*"     - Matches any character# "*"     - Matches any character# "[^od]" - Matches any character but "o" or "d"# "*"     - Matches any characterRetrieval.pattern(trie,"[pd]***[^od]*")["people"]# Three letter word# "[^abc]" - Matches any character but "a", "b", or, "c"# "*"      - Matches any character# "*"      - Matches any characterRetrieval.pattern(trie,"[^abc]**")["hot","out"]# Oh no, a bad patternRetrieval.pattern(trie,"[^abc"){:error,"Dangling group (exclusion) starting at column 1, expecting ]"}
Capture group

The default capture group pattern{name} must have a name and matches on any character. A name can be any combination of charactersbut it is convention to start with the start with the name1 and increment the name of the next capture group by one. Subsequent usage ofthe named capture group must match the character captured by the first capture group. In other words, capturegroups are useful when looking for binaries with the same character in different positions. Capture groupscan be combined with aninclusion group ({name[...]}) or anexclusion group ({name[^...]}). It is mandatory that the capture group'sname is the first thing that appears after the opening curly brace and the optional inclusion or exclusion group must the last thing that appears beforethe closing curly brace. In the event that a pattern is provided that doesn't follow these rules an error will be returned.

# Four letter word# "{1}" - Creates a new capture group named "1", matches any character# "{2}" - Creates a new capture group named "2", matches any character# "{2}" - Matches any character captured by initial capture group "{2}"# "{1}" - Matches any character captured by initial capture group "{1}"# You may have recognized that this pattern is for searching for four letter palindromes, neat!Retrieval.pattern(trie,"{1}{2}{2}{1}")["boob","noon"]# Four letter word# "{1[^nm]}" - Creates a new capture group named "1", matches any character but "n" or "m"# "{2[oe]}"  - Creates a new capture group named "2", matches "o" or "e"# "{2}"      - Matches any character captured by initial capture group "{2}"# "{1}"      - Matches any character captured by initial capture group "{1}"# Notice that it no longer accepts "noon" because of the exclusion of "n" in capture group "{1}"Retrieval.pattern(trie,"{1[^nm]}{2[oe]}{2}{1}")["boob"]# Eight letter word# "{aa}" - Creates a new capture group named "aa", matches any character# "*"    - Matches any character# "*"    - Matches any character# "{aa}" - Matches any character captured by initial capture group "{aa}"# "*"    - Matches any character# "{b}"  - Creates a new capture group named "b", matches any character# "{aa}" - Matches any character captured by initial capture group "{aa}"# "{b}"  - Matches any character captured by initial capture group "{b}"# In English, find a size eight binary that has the same character in the first, fourth, and seventh position# Additionally, it must have the same character in the sixth and eighth position# The rest are wildcards, shove anything that fits in betweenRetrieval.pattern(trie,"{aa}**{aa}*{b}{aa}{b}")["extended"]# Oh no, a bad patternRetrieval.pattern(trie,"{}**{aa}*{b}{aa}{b}"){:error,"Unnamed capture starting at column 1, capture cannot be empty"}# Oh no, a bad patternRetrieval.pattern(trie,"{aa}**{aa}*{b}{aa}{b"){:error,"Dangling group (capture) starting at column 19, expecting }"}# Oh no, a bad patternRetrieval.pattern(trie,"{aa[^abc]a}**{aa}*{b}{aa}{b}"){:error,"Group (exclusion) must in the tail position of capture starting at column 4"}
Escaping pattern symbols

You may have been wondering if it's possible to match on pattern symbols to find binaries that contain the wildcard*,the exclusion carot^, and others. The answer is yes, the pattern parser allows you to escape otherwisereserved symbols to be used directly in patterns.

The following symbols can be escaped:

SymbolEscaped symbol
*\\*
^\\^
[\\[
]\\]
{\\{
}\\}

And in action:

# Three letter word# "\\*" - Matches "*"# "*"   - Matches any character# "*"   - Matches any characterRetrieval.new(~w/*ab [x*/)|>Retrieval.pattern("\\***")["*ab"]# Three letter word# "[\\*\\[]" - Matches "*" or "["# "*"        - Matches any character# "*"        - Matches any characterRetrieval.new(~w/*ab [x*/)|>Retrieval.pattern("[\\*\\[]**")["*ab","[x*"]# Oh no, a bad patternRetrieval.new(~w/*ab [x*/)|>Retrieval.pattern("[*]**"){:error,"Unescaped symbol * at column 2"}

Rambling

Will ramble later!

License

This work is free. You can redistribute it and/or modify it under theterms of the MIT License. See the LICENSE file for more details.

About

Trie implementation in Elixir

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages


[8]ページ先頭

©2009-2025 Movatter.jp