Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

🟣 Trie Data Structure interview questions and answers to help you prepare for your next data structures and algorithms interview in 2025.

NotificationsYou must be signed in to change notification settings

Devinterview-io/trie-data-structure-interview-questions

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

6 Commits
 
 

Repository files navigation

data-structures-and-algorithms

You can also find all 28 answers here 👉Devinterview.io - Trie Data Structure


1. What is aTrie?

Trie, often called aprefix tree and named after the termreTRIEval, is a tree-like data structure tailored for string operations. Instead of housing data within its nodes, a Trie utilizes its edges to encode information.

Key Features

  • Fast String Operations: Tries streamline operations such as search, insert, and delete for strings, offering an edge over arrays and hash tables in specific scenarios.
  • Efficient Prefix Searches: Their hierarchical structure facilitates easy traversal, making operations like autocomplete and prefix-based searches efficient.
  • Memory Compactness: While they can be more space-intensive than hash tables for smaller dictionaries, Tries can store large dictionaries compactly, suiting tasks like natural language processing, spell-checking, and IP routing.

Core Components

Trie Nodes

Theroot node doesn't carry a character value but acts as the starting point. Each subsequentchild node corresponds to a distinct character. Some nodes might denote the end of a string while simultaneously representing characters.

Node Pointers

Nodes in the Trie hold references to child nodes. For instance, when dealing with the English alphabet, an array of 26 elements can map characters to respective child nodes.

Visual Representation

Trie Data Structure

Complexity Analysis

  • Time Complexity:

    • Insertion, Search, and Deletion:$O(m)$ on average, where$m$ is the length of the word.
    • Prefix Search:$O(p)$, where$p$ is the number of words sharing the prefix.
  • Space Complexity: Roughly$O(n \cdot \sigma)$, where$n$ is the total word count and$\sigma$ represents the size of the alphabet.

Code Example: Trie

Here is the Python code:

classTrieNode:def__init__(self):self.children= {}self.is_word_end=FalseclassTrie:def__init__(self):self.root=TrieNode()definsert(self,word):node=self.rootforcharinword:ifcharnotinnode.children:node.children[char]=TrieNode()node=node.children[char]node.is_word_end=Truedefsearch(self,word):node=self.rootforcharinword:ifcharnotinnode.children:returnFalsenode=node.children[char]returnnode.is_word_end# Usage exampletrie=Trie()forwordin ["a","to","tea","ted","ten","in","inn"]:trie.insert(word)print(trie.search("tea"))# Trueprint(trie.search("tedd"))# False

2. What are the advantages ofTries overHash Tables?

In certain scenarios,Tries can outperformHash Tables due to their unique characteristics.

Benefits of Using Tries Over Hash Tables

  • Ordered Iteration: Unlike hash tables, tries inherently maintain key order. This makes them suitable for applications, such as dictionaries, where sorted output is needed. For instance, in Python, you can obtain sorted words from a trie-based dictionary using depth-first traversal.

  • Longest-Prefix Matching: Tries stand out when identifying the longest common prefix, a feature indispensable for tasks like text auto-completion or network routing. As an example, search engines can expedite query suggestions based on the longest matching prefix.

  • Consistent Insertion Speed: Tries offer a stable average-case insertion performance. This consistent behavior is attractive for latency-sensitive tasks. Contrarily, hash tables might necessitate occasional, time-intensive resizing operations.

  • Superior Performance with Small Keys: For small keys, such as integers or pointers, tries can be more efficient than hash tables. Their simpler tree structures remove the overhead related to intricate hash functions in hash tables.

Complexity Comparison

Time Complexity

  • Hash Tables: On average, they have$O(1)$ lookup time. However, this can deteriorate to$O(n)$ in worst-case scenarios.
  • Tries: They consistently exhibit an$O(k)$ lookup time, where$k$ represents the string's length. This predictability can be a boon for latency-critical tasks.

Space Complexity

  • Hash Tables: Typically occupy$O(N)$ space for$N$ elements.
  • Tries: Their space requirement stands at$O(ALPHABET_SIZE \times N)$, governed by the number of keys and the alphabet's size.

3. What areAdvantages andDisadvantages of aTrie?

Thetrie is a tree-like data structure optimized for string operations. While it presents distinct advantages, it also comes with specific drawbacks.

Advantages of the Trie

  1. Efficient String Operations: The trie achieves$O(m)$ time complexity for standard operations like insert, search, and delete, outpacing many traditional data structures.

  2. Prefix Matching: Designed for rapid prefix matching, the trie is especially effective in auto-completion and spell-check scenarios.

  3. Memory-Saving for Common Prefixes: The trie stores shared prefixes only once, which is advantageous for datasets with repeated string beginnings.

  4. Effective with Repetitive Data: It's ideal for identifying duplicates or repetitions in datasets.

  5. Lexicographical Sorting: The trie structure allows for easy iteration over data in lexicographic order.

  6. Flexible Alphabet Support: The trie isn't limited to alphanumeric data and can handle any consistent set of symbols, such as DNA sequences.

Disadvantages of the Trie

  1. Higher Memory Usage: Compared to some alternatives, the trie can consume more memory, especially for large datasets, due to its node structure.

  2. Insertion Speed: Inserting characters individually can make the insertion process slower than bulk data insertion methods.

  3. Cache Inefficiency: Non-contiguous memory storage of trie nodes might lead to more frequent cache misses.

  4. Implementation Complexity: Its recursive nature and pointer-based setup can make the trie more intricate to implement than simpler structures.


4. How do you handle the case sensitivity problem inTries?

Tries are commonly used in natural language processing to store and look up words efficiently. They arewell-suited for English, which is acase-sensitive language.

Case Sensitivity and Tries

  • Tries can treat characters in several ways:

    • Case-sensitive: Distinguishes between upper and lower case.
    • Case-insensitive: Ignores case distinctions.
    • Folded: Maps all characters to a specific case.
  • For case-sensitive operation, use tree branches to represent different cases. For each node, there'd be up to 26 children: 13 for lowercase letters and 13 for uppercase letters.

  • Modern keyboards can differ in the number of keystrokes required for common operations on case-insensitive tries. This influences the popularity of these systems.

  • The most widely used trie variant is not case-sensitive because it requires fewer memory and computational resources.

Handling Case Sensitivity

In thealphabet:

  • Num of Unique Characters:$26$ (English Standard 26, French 23, Turkish 29, etc.)

In a trie'snode structures:

  • Memory Requirement: One Boolean flag plus 26 pointers, demanding 26 times more memory than the simplest case-less variant.

Code Example: Case-Insensitive Trie with Python

Here is the Python code:

classTrieNode:def__init__(self):self.children= {}self.is_word=FalseclassCaseInsensitiveTrie:def__init__(self):self.root=TrieNode()definsert(self,word):current=self.rootword=word.lower()forcharinword:ifcharnotincurrent.children:current.children[char]=TrieNode()current=current.children[char]current.is_word=Truedefsearch(self,word):current=self.rootword=word.lower()forcharinword:ifcharnotincurrent.children:returnFalsecurrent=current.children[char]returncurrent.is_word

5. What are somePractical Applications of aTrie?

TheTrie data structure is effective for tasks related toexact string matching andtext auto-completion. Let's look at it's real-world applications.

Practical Applications

Text Input

  • Auto-Completion: Predict words as a user types, seen in search bars or messaging apps.
  • Spell Checking: Quickly identify misspelled words.
  • Recommendation Engines: Offer suggestions based on user input or previous choices.

Data Retrieval

  • String Filtering: Efficiently find exact matches, useful for profanity filtering or blacklists.
  • Contact Lookups: Search contact lists using names or numbers faster.

Data Analysis

  • Web Crawlers: Assist in web indexing and data extraction.
  • Pattern Matching: Used in algorithms like Aho-Corasick and Rabin-Karp for matching patterns, including in DNA sequences.

Code Example: Trie-Based Auto-Completion

Here is the Python code:

classTrieNode:def__init__(self):self.children= {}self.is_word_end=FalseclassAutoComplete:def__init__(self):self.root=TrieNode()definsert_word(self,word):node=self.rootforcharinword:ifcharnotinnode.children:node.children[char]=TrieNode()node=node.children[char]node.is_word_end=Truedefsearch_prefix(self,prefix):node=self.rootforcharinprefix:ifcharnotinnode.children:return []node=node.children[char]returnself._get_words_from_node(node,prefix)def_get_words_from_node(self,node,prefix):results= []ifnode.is_word_end:results.append(prefix)forchar,child_nodeinnode.children.items():results.extend(self._get_words_from_node(child_node,prefix+char))returnresults# Sample Usage:ac=AutoComplete()ac.insert_word("apple")ac.insert_word("app")ac.insert_word("application")print(ac.search_prefix("ap"))# ['app', 'apple', 'application']

6. Describe how aTrie can be used for implementing autocomplete functionality.

Trie, a tree-like data structure, is an optimal choice for buildingautocomplete systems. With each node representing a single character,Tries are especially efficient in dealing with text data.

Characteristics

  • Partial Matching: The search process returns results that start with the input characters.
  • Dynamic: The trie adapts as text changes.

Algorithm Steps for Autocomplete

  1. Trie Traversal: Starting from the root, navigate the trie to the node representing the last character in the input string.
  2. Subtree Collection: Using the node from step 1, collect all descendant nodes to achieve partial matching.
  3. List Generation: Convert the nodes from step 2 to strings, typically utilizing depth-first search.

Advantages

  • Speed: Autocompletion, even in large datasets, is quick.
  • Context Sensitivity: Users see suggestions that match the context of words they've already typed.
  • Duplication Handling: No duplicate suggestions are provided.

Code Example: Trie Structure

Here is the Python code:

classTrieNode:def__init__(self):self.children= {}self.is_word=FalseclassTrie:def__init__(self):self.root=TrieNode()definsert(self,word):node=self.rootforcharinword:ifcharnotinnode.children:node.children[char]=TrieNode()node=node.children[char]node.is_word=True# Other utility methods for trie operations such as delete.# Example Usaget=Trie()words= ["hello","world","hi","python"]forwordinwords:t.insert(word)

Code Example: Autocomplete Functionality Using Trie

Here is the Python code:

classTrieAutoComplete(Trie):defget_all(self,prefix):node=self.rootforcharinprefix:ifcharinnode.children:node=node.children[char]else:return []returnself._gather_words(node,prefix)def_gather_words(self,node,prefix):results= []ifnode.is_word:results.append(prefix)forchar,child_nodeinnode.children.items():results.extend(self._gather_words(child_node,prefix+char))returnresults# Example Usage for Autocompletiont=TrieAutoComplete()words= ["hello","world","hi","python"]forwordinwords:t.insert(word)print(t.get_all("h"))# Output: ["hello", "hi"]

In the given example, theget_all method uses the prefix to reach the relevant node in the Trie and then uses the private method_gather_words to collect all words that start with the given prefix through a depth-first search.

7. Explain the use of aTrie in IP Routing to match IP prefixes.

Tries are advanced data structures, specifically designed for efficient prefix matching. This quality makes them not only powerful but also indispensable in IP routing.

IPv4 IP Address Representation

An IPv4 address like192.168.1.1 is commonly represented inbinary using 32 bits, partitioned into four octets. Decimal-to-binary conversion yields:

  • 192: 11000000
  • 168: 10101000
  • 1: 00000001
  • 1: 00000001

Concatenating these octets gives the binary form: 11000000101010000000000100000001.

Trie Representation for IP Addresses

Tries are$N$-ary trees—each node can have up to$N$ children (commonly$N = 2$ for binary tries).

ABinary Trie is a specialized kind of trie where each node has either two or no children. This binary arrangement makes them ideal for matching binary IP addresses.

Starting from the root (or root node), you compare each bit of the IP address. A0 bit usually directs the traversal to the left (child node), while a1 bit leads to the right.

When you reach a leaf node, the series of0s and1s you encountered forms the longest matching prefix. Also, if a given node has no children, and it is a leaf node, the path to that node forms theIP Prefix.

Binary Trie Example for IP Addresses

Here is the visual representation of how a binary trie looks with three IP addresses.

  1. $1100 0000$ 0000 0000 0000 0000 0000 0001
  2. $1100 0000$ 1010 1000 0000 0000 0000 0001
  3. $1100 0000$ 10 000000 0000 0000 0000 0000

Longest-Prefix Matching Technique

Longest-prefix matching involves identifying the most specific IP address that matches a given destination address.

Practical uses of longest-prefix matching arise in various network tasks, such as:

  • IP address assignments in Network Address Translation (NAT).
  • VPN routing tables to find the most suitable forwarding entry in a routing table.

The trie's inherent design for binary strings and its navigational logic of reading incoming bits characterize the steps necessary to accomplish longest-prefix matching using a trie structure.

Binary Trie for Longest-Prefix Matching

  • Initialization: Start From Trienoderoot which contains all bits (32-bits long) of the keys. Both keys and Nodes (except leaves) can be thought of as 32-bit long strings, the root thus doesn't represent any key, just the unused bits.**.

  • Comparison: Proceed through the trie bybitwise AND of the current node and the next bit in the given IP address.1 -> Right,0 -> Left are the conditions the comparison affects.

  • Longest Matching Prefix is recorded as the valid IP prefix until the path reaches a leaf node.


8. How is aTrie used in textspell-checking and correction systems?

ATrie is a powerfultree-like data structure, especially suited for text operations such asspelling correction.

In aspell-checking context, the Trie helps to identify misspelled words by recognizing sequences of letters that do not form valid words.

Trie for Spelling

TheTrie's structure supports spelling-related tasks. Here's how:

  • Path Matching: Each node represents a letter. Traversing from the root down a collection of nodes can form a word. This setup aids in comparing input tokens with stored words.
  • Comprehensive Coverage: The Trie covers the entire pool of words in its structure. Therefore, it can verify any word against all known words, making it ideal for spell-checking.

Spelling Auto-Correction

The Trie employs techniques such aslevenstein distance andedit distances. These algorithms examine the relationship between words determined by word case, word length, and term frequency (TF-IDF).

Edit Distances and Typos

Edit distances, also known asLevenshtein distances, represent the number of single-edit steps required to convert one given word into another. Such edits can encompass:

  1. Substitute: Swapping one letter for another.
  2. Insert: Placing a new letter within the word.
  3. Delete: Removing a letter.
  4. Transpose: Reversing the positions of two adjoining letters.

Implementing these transformations and utilizing the Trie for efficient validation helps correct apparent typos or isolated spelling errors.

9. Name someTrie Implementation strategies.

Toimplement a trie, various strategies are available:

Array-based Implementation

In this approach, eachtrie node contains anarray where each index corresponds to a character of the alphabet.

For example,0 stands for 'a',1 for 'b', and so on. While simple to understand, this method canconsume more memory, especially if many of the array elements are unused.

Code Example: Array Implementation

Here is the Python code:

classTrieNode:def__init__(self):# Array for 26 letters of the alphabetself.children= [None]*26self.isEndOfWord=False# Initialize the root noderoot=TrieNode()

Linked List Approach

Here, eachnode corresponds to acharacter, and thechild attribute points to thehead of another linked list, representing the next level of characters.

This approach cansave memory when the number of child nodes is significantly less than the array size of the previous method.

Code Example: Linked List Implementation

Here is the Python code:

classTrieNode:def__init__(self,key=None,value=None,parent=None,prev=None):self.key=keyself.value=valueself.parent,self.prev,self.next=parent,prev,Noneself.child=None# Initialize the root noderoot=TrieNode()

Bitmap Strategy

This technique uses a combination of abitmap and anarray of pointers. The bitmap determines which characters are present at a given node. The array of pointers then has a size equal to the number of set bits in the bitmap.

While this strategy can significantlyreduce the memory footprint, it can add complexity to some operations.

Hybrid Techniques

Merging the strengths of the strategies mentioned above can yield a balanced solution. For instance, one could use abitmap to compress the children array in thearray-based approach or combine bitmaps with linked lists.

Best Practices for Modern Applications

  • Leverage Libraries: There are well-optimized, library-based trie structures that can be advantageous. Using established libraries can streamline the process and enhance efficiency.
  • Optimize Based on Use Case: The ideal trie configuration will vary based on its intended application.

10. Write the code to insert a word into aTrie.

Problem Statement

The task is to create the code forinserting a word into aTrie data structure.

Solution

In aTrie, each node represents a single character of a word. Starting from the root, a path along the nodes denotes a valid word.

Algorithm Steps

  1. Initialize acurrent node to the root of the Trie.
  2. For each character in the word, check if the current node has a child node corresponding to that character. If not, create a new node and add it as a child of the current node.
  3. Move the current node to the child node corresponding to the character and repeat step 2.
  4. After iterating through the word, mark the current node (which now represents the last character of the word) as anend of the word.

Complexity Analysis

  • Time Complexity:$O(L)$ where$L$ is the length of the word.
  • Space Complexity:$O(L)$ since in the worst case, we need to add$L$ new nodes, each for a different character in the word.

Implementation

Here is the Python code:

classTrieNode:def__init__(self):self.children= {}self.is_end_of_word=FalseclassTrie:def__init__(self):self.root=TrieNode()definsert(self,word):current=self.rootforcharinword:ifcharnotincurrent.children:current.children[char]=TrieNode()current=current.children[char]current.is_end_of_word=True# Usagetrie=Trie()words= ["apple","banana","app","ape","bat"]forwordinwords:trie.insert(word)

11. Implement aSearch function to determine if a word is included in aTrie.

Problem Statement

The goal is to implement asearch function that can determine whether a particular word is included in aTrie data structure.

Solution

The decision to include a word in aTrie structure relies on whether it exists in the dictionary represented by the Trie. In a Trie, the presence of a word is directly associated with the presence of an endmarker ($).

  • For Python and JavaScript: thesearch function recursively traverses the Trie.
  • For C++: an iterative approach is often preferred for performance reasons, especially when dealing with very large Tries.

Both methods should identify whether a given word is part of the Trie, hence serving as an efficient spellchecker or dictionary for a set of words.

Complexity Analysis

  • Time Complexity:

    • $O(m)$ where$m$ is the length of the key. Each character in the word can be found in its corresponding Trie level, which makes this operation linear.
  • Space Complexity:

    • For Python and JavaScript, the Space Complexity is$O(m)$ as well, where$m$ is the length of the key, accounting for the call stack during the recursive traversal.
    • For C++, the Space Complexity can be considered$O(1)$ as there are no additional data structures influencing it. However, if we include the space taken by critical operations during the search, it will still be$O(m)$ for a particular word.

Implementation

Here is the Python code:

classTrieNode:def__init__(self):self.children= {}self.is_end_word=FalseclassTrie:def__init__(self):self.root=TrieNode()definsert(self,word):node=self.rootforcharinword:ifcharnotinnode.children:node.children[char]=TrieNode()node=node.children[char]node.is_end_word=Truedefsearch(self,word):node=self.rootforcharinword:ifcharnotinnode.children:returnFalsenode=node.children[char]returnnode.is_end_word

12. Write a script toDelete a word from aTrie.

Problem Statement

Deleting a word in aTrie usually involves setting theis_end_of_word attribute of the last node of the word toFalse. This is necessary, even though the word may not be explicitly present. Consider, for instance, the words "a" and "and" — if "and" is deleted, "a" should remain a word.

Solution

In addition to settingis_end_of_word toFalse in the last node of the word, if any nodes becomeredundant (i.e., have no other children or are part of another word), they should be removed from the tree as well.

Algorithm Steps

  1. Start from the root node.
  2. For each character$c$ in the word:
    • If the child node for$c$ doesn't exist, the word isn't present in the trie. Return without making any changes.
    • Navigate to the child node for$c$.
  3. After reaching the last character of the word, which ends at node$node_end$:
    • If$node_end$ has no children, traverse back (from the end) and delete nodes with only one child, until the root or a node with multiple children is reached.
    • Set$node_end.is_end_of_word$ toFalse.

Complexity Analysis

  • Time Complexity:$O(m)$ where$m$ is the length of the word. We may have to traverse back up the tree to delete nodes, but the maximum number of backtracks is limited by the number of nodes in the trie.
  • Space Complexity:$O(1)$

Implementation

Here is the Python code:

classTrieNode:def__init__(self):self.children= {}self.is_end_of_word=FalseclassTrie:def__init__(self):self.root=TrieNode()definsert(self,word):node=self.rootforcharinword:ifcharnotinnode.children:node.children[char]=TrieNode()node=node.children[char]node.is_end_of_word=Truedefsearch(self,word):node=self.rootforcharinword:ifcharnotinnode.children:returnFalsenode=node.children[char]returnnode.is_end_of_worddefstarts_with(self,prefix):node=self.rootforcharinprefix:ifcharnotinnode.children:returnFalsenode=node.children[char]returnTruedefdelete(self,word):ifnotself.root.children:return# Trie is emptyself._delete_helper(self.root,word,0)def_delete_helper(self,node,word,index):ifindex==len(word):ifnode.is_end_of_word:node.is_end_of_word=Falseifnotnode.children:delnode# Delete redundant nodesreturnnotnode.children# True if node has no childrenreturnFalsechar=word[index]ifcharnotinnode.children:returnFalseshould_delete_current_node=self._delete_helper(node.children[char],word,index+1)ifshould_delete_current_node:delnode.children[char]returnnotnode.children# True if node has no other children after deletionreturnFalse# Exampletrie=Trie()words= ["hello","world","python","programming"]forwordinwords:trie.insert(word)trie.delete("python")print(trie.search("python"))# Output: False

13. CompareTrie vs.Binary Search Tree.

WhileTries are specialized for tasks involving strings and are especially efficient for datasets with shared prefixes,BSTs are versatile, general-purpose trees that can store any ordered data.

Time Complexity

Look-up

  • Trie: This is determined by the length of the word/key being looked up. Hence, the time complexity is$O(m)$, where$m$ is the length of the key.
  • BST: Efficient look-ups in balanced BSTs are$O(\log n)$, but if the BST becomes skewed, it degrades to$O(n)$.

Insertion and Deletion

  • Trie: Insertion and deletion are typically$O(m)$, with$m$ being the key's length.
  • BST: Insertion and deletion are$O(\log n)$ in a balanced tree. However, in the worst-case scenario (unbalanced tree), these operations can take$O(n)$ time.

Space Complexity

  • Trie: Often more space-efficient, especially when dealing with datasets having short keys with common prefixes. It can save considerable space by sharing common prefix nodes.
  • BST: Every node in the BST requires storage for its key and pointers to its two children. This fixed overhead can make BSTs less space-efficient than tries, especially for large datasets.

Specialized Operations

  • Trie: Excels at operations like longest-prefix matching, making it an ideal choice for applications such as autocompletion, IP routing, and more.
  • BST: While not specialized like tries, BSTs are more general-purpose and can handle a wider range of tasks.

Maintenance and Balance

  • Trie: Inherently balanced, making them relatively low-maintenance. This ensures consistent performance without the need for additional balancing algorithms.
  • BST: To maintain efficient operation, BSTs often require balancing using specific algorithms or tree structures like AVL or Red-Black trees. Without periodic balancing, the tree could become skewed, leading to suboptimal performance.

14. How to choose between aHash Table and aTrie?

BothHash Tables andTries are data structures used for efficient data storage and retrieval.

Whilehash tables offer fast lookups and unordered data storage,tries are optimized for ordered tasks and text-related functions.

Hash Tables: Quick Lookups

  • Definition: Uses a hash function to map keys to unique values for fast retrieval.
  • Best For: Unordered data and quick lookups.
  • Pros: Fast O(1) lookups, memory efficiency.
  • Cons: Lack of ordering, potential for collisions.

Tries: Text-Focused Storage

  • Definition: Organizes keys based on common prefixes and is often used for text-related tasks like auto-completion.
  • Best For: Maintaining order and tasks involving text.
  • Pros: Efficient prefix matching, ordered retrieval.
  • Cons: Memory-intensive.

Code Example: Word Lookup

Here is the Python code:

# Hash Tableword_dict= {"hello":"English greeting","hola":"Spanish greeting"}# TriefromcollectionsimportdefaultdictclassTrieNode:def__init__(self):self.children=defaultdict(TrieNode)self.is_word=Falseroot=TrieNode()words= ["hello","hola"]forwordinwords:node=rootforcharinword:node=node.children[char]node.is_word=True

15. What is aSuffix Trie, and how does it differ from a standardTrie?

ASuffix Trie extends the functionality of a regular trie to capture allsuffices of a string. Both Tries are structures well-suited for substrings and searching in text.

Core Approach

Rather than simply storing individual letters, aSuffix Trie captures complete substrings from the input word. Each node in theSuffix Trie represents a substring, starting from the root that denotes the complete word and ending with terminal nodes signifying various suffixes.

Example: "BANANA"

In a regular trie, word "BANANA" would be represented as follows:

       root        |        B       / \      A   R     / \    N   E   /  A

In aSuffix Trie for "BANANA," the main root still represents the complete word. However, it has specialized edges, known as suffix links or paths, leading to nodes that correspond tosuffixes of the word:

       root        /  \      A..  ..NA     /       \    NA        B..A   /             \  ANA             NA /                 \NA                 A \               /  A             NA

Structural Differences

  • Leaf Nodes: In a regular trie, leaves often serve as indicators for complete words. In aSuffix Trie, leaf nodes invariably denote the end of a suffix.

  • Internal Nodes: These nodes inSuffix Tries hold multiple incoming edges, including those from the root, whereas in regular tries, they typically have a one-to-one edge-to-substring relationship.

  • Terminating Characters: Regular tries often rely on designated characters like "end of word" flag (e.g.,'$') to mark the end of words.Suffix Tries, in contrast, determine word boundaries based on incorporating all suffixes of the input.

Primary Utility

  • Trie: Optimized for rapid prefix-based search, common in tasks such as autocompletion.
  • Suffix Trie: Primarily used to facilitate swift pattern matching in texts, particularly in bioinformatics and data compression.

Code Example: Regular Trie and Suffix Trie

Here is the Python code:

# Regular TrieclassNode:def__init__(self):self.children= {}self.is_end=Falsedefinsert_word(root,word):node=rootforcharinword:ifcharnotinnode.children:node.children[char]=Node()node=node.children[char]node.is_end=True# Suffix TrieclassSuffixNode:def__init__(self):self.children= {}self.suffix_link=Nonedefinsert_suffix(root,suffix):node=rootforcharinsuffix:ifcharnotinnode.children:node.children[char]=SuffixNode()node=node.children[char]# Establishing the suffix link is algorithmically more complex

Explore all 28 answers here 👉Devinterview.io - Trie Data Structure


data-structures-and-algorithms

Releases

No releases published

Packages

No packages published

[8]ページ先頭

©2009-2025 Movatter.jp