- Notifications
You must be signed in to change notification settings - Fork2
🟣 Trie Data Structure interview questions and answers to help you prepare for your next data structures and algorithms interview in 2025.
Devinterview-io/trie-data-structure-interview-questions
Folders and files
| Name | Name | Last commit message | Last commit date | |
|---|---|---|---|---|
Repository files navigation
You can also find all 28 answers here 👉Devinterview.io - Trie Data Structure
Trie, often called aprefix tree and named after the termreTRIEval, is a tree-like data structure tailored for string operations. Instead of housing data within its nodes, a Trie utilizes its edges to encode information.
- Fast String Operations: Tries streamline operations such as search, insert, and delete for strings, offering an edge over arrays and hash tables in specific scenarios.
- Efficient Prefix Searches: Their hierarchical structure facilitates easy traversal, making operations like autocomplete and prefix-based searches efficient.
- Memory Compactness: While they can be more space-intensive than hash tables for smaller dictionaries, Tries can store large dictionaries compactly, suiting tasks like natural language processing, spell-checking, and IP routing.
Theroot node doesn't carry a character value but acts as the starting point. Each subsequentchild node corresponds to a distinct character. Some nodes might denote the end of a string while simultaneously representing characters.
Nodes in the Trie hold references to child nodes. For instance, when dealing with the English alphabet, an array of 26 elements can map characters to respective child nodes.
Time Complexity:
- Insertion, Search, and Deletion:
$O(m)$ on average, where$m$ is the length of the word. - Prefix Search:
$O(p)$ , where$p$ is the number of words sharing the prefix.
- Insertion, Search, and Deletion:
Space Complexity: Roughly
$O(n \cdot \sigma)$ , where$n$ is the total word count and$\sigma$ represents the size of the alphabet.
Here is the Python code:
classTrieNode:def__init__(self):self.children= {}self.is_word_end=FalseclassTrie:def__init__(self):self.root=TrieNode()definsert(self,word):node=self.rootforcharinword:ifcharnotinnode.children:node.children[char]=TrieNode()node=node.children[char]node.is_word_end=Truedefsearch(self,word):node=self.rootforcharinword:ifcharnotinnode.children:returnFalsenode=node.children[char]returnnode.is_word_end# Usage exampletrie=Trie()forwordin ["a","to","tea","ted","ten","in","inn"]:trie.insert(word)print(trie.search("tea"))# Trueprint(trie.search("tedd"))# False
In certain scenarios,Tries can outperformHash Tables due to their unique characteristics.
Ordered Iteration: Unlike hash tables, tries inherently maintain key order. This makes them suitable for applications, such as dictionaries, where sorted output is needed. For instance, in Python, you can obtain sorted words from a trie-based dictionary using depth-first traversal.
Longest-Prefix Matching: Tries stand out when identifying the longest common prefix, a feature indispensable for tasks like text auto-completion or network routing. As an example, search engines can expedite query suggestions based on the longest matching prefix.
Consistent Insertion Speed: Tries offer a stable average-case insertion performance. This consistent behavior is attractive for latency-sensitive tasks. Contrarily, hash tables might necessitate occasional, time-intensive resizing operations.
Superior Performance with Small Keys: For small keys, such as integers or pointers, tries can be more efficient than hash tables. Their simpler tree structures remove the overhead related to intricate hash functions in hash tables.
- Hash Tables: On average, they have
$O(1)$ lookup time. However, this can deteriorate to$O(n)$ in worst-case scenarios. - Tries: They consistently exhibit an
$O(k)$ lookup time, where$k$ represents the string's length. This predictability can be a boon for latency-critical tasks.
- Hash Tables: Typically occupy
$O(N)$ space for$N$ elements. - Tries: Their space requirement stands at
$O(ALPHABET_SIZE \times N)$ , governed by the number of keys and the alphabet's size.
Thetrie is a tree-like data structure optimized for string operations. While it presents distinct advantages, it also comes with specific drawbacks.
Efficient String Operations: The trie achieves
$O(m)$ time complexity for standard operations like insert, search, and delete, outpacing many traditional data structures.Prefix Matching: Designed for rapid prefix matching, the trie is especially effective in auto-completion and spell-check scenarios.
Memory-Saving for Common Prefixes: The trie stores shared prefixes only once, which is advantageous for datasets with repeated string beginnings.
Effective with Repetitive Data: It's ideal for identifying duplicates or repetitions in datasets.
Lexicographical Sorting: The trie structure allows for easy iteration over data in lexicographic order.
Flexible Alphabet Support: The trie isn't limited to alphanumeric data and can handle any consistent set of symbols, such as DNA sequences.
Higher Memory Usage: Compared to some alternatives, the trie can consume more memory, especially for large datasets, due to its node structure.
Insertion Speed: Inserting characters individually can make the insertion process slower than bulk data insertion methods.
Cache Inefficiency: Non-contiguous memory storage of trie nodes might lead to more frequent cache misses.
Implementation Complexity: Its recursive nature and pointer-based setup can make the trie more intricate to implement than simpler structures.
Tries are commonly used in natural language processing to store and look up words efficiently. They arewell-suited for English, which is acase-sensitive language.
Tries can treat characters in several ways:
- Case-sensitive: Distinguishes between upper and lower case.
- Case-insensitive: Ignores case distinctions.
- Folded: Maps all characters to a specific case.
For case-sensitive operation, use tree branches to represent different cases. For each node, there'd be up to 26 children: 13 for lowercase letters and 13 for uppercase letters.
Modern keyboards can differ in the number of keystrokes required for common operations on case-insensitive tries. This influences the popularity of these systems.
The most widely used trie variant is not case-sensitive because it requires fewer memory and computational resources.
In thealphabet:
- Num of Unique Characters:
$26$ (English Standard 26, French 23, Turkish 29, etc.)
In a trie'snode structures:
- Memory Requirement: One Boolean flag plus 26 pointers, demanding 26 times more memory than the simplest case-less variant.
Here is the Python code:
classTrieNode:def__init__(self):self.children= {}self.is_word=FalseclassCaseInsensitiveTrie:def__init__(self):self.root=TrieNode()definsert(self,word):current=self.rootword=word.lower()forcharinword:ifcharnotincurrent.children:current.children[char]=TrieNode()current=current.children[char]current.is_word=Truedefsearch(self,word):current=self.rootword=word.lower()forcharinword:ifcharnotincurrent.children:returnFalsecurrent=current.children[char]returncurrent.is_word
TheTrie data structure is effective for tasks related toexact string matching andtext auto-completion. Let's look at it's real-world applications.
- Auto-Completion: Predict words as a user types, seen in search bars or messaging apps.
- Spell Checking: Quickly identify misspelled words.
- Recommendation Engines: Offer suggestions based on user input or previous choices.
- String Filtering: Efficiently find exact matches, useful for profanity filtering or blacklists.
- Contact Lookups: Search contact lists using names or numbers faster.
- Web Crawlers: Assist in web indexing and data extraction.
- Pattern Matching: Used in algorithms like Aho-Corasick and Rabin-Karp for matching patterns, including in DNA sequences.
Here is the Python code:
classTrieNode:def__init__(self):self.children= {}self.is_word_end=FalseclassAutoComplete:def__init__(self):self.root=TrieNode()definsert_word(self,word):node=self.rootforcharinword:ifcharnotinnode.children:node.children[char]=TrieNode()node=node.children[char]node.is_word_end=Truedefsearch_prefix(self,prefix):node=self.rootforcharinprefix:ifcharnotinnode.children:return []node=node.children[char]returnself._get_words_from_node(node,prefix)def_get_words_from_node(self,node,prefix):results= []ifnode.is_word_end:results.append(prefix)forchar,child_nodeinnode.children.items():results.extend(self._get_words_from_node(child_node,prefix+char))returnresults# Sample Usage:ac=AutoComplete()ac.insert_word("apple")ac.insert_word("app")ac.insert_word("application")print(ac.search_prefix("ap"))# ['app', 'apple', 'application']
Trie, a tree-like data structure, is an optimal choice for buildingautocomplete systems. With each node representing a single character,Tries are especially efficient in dealing with text data.
- Partial Matching: The search process returns results that start with the input characters.
- Dynamic: The trie adapts as text changes.
- Trie Traversal: Starting from the root, navigate the trie to the node representing the last character in the input string.
- Subtree Collection: Using the node from step 1, collect all descendant nodes to achieve partial matching.
- List Generation: Convert the nodes from step 2 to strings, typically utilizing depth-first search.
- Speed: Autocompletion, even in large datasets, is quick.
- Context Sensitivity: Users see suggestions that match the context of words they've already typed.
- Duplication Handling: No duplicate suggestions are provided.
Here is the Python code:
classTrieNode:def__init__(self):self.children= {}self.is_word=FalseclassTrie:def__init__(self):self.root=TrieNode()definsert(self,word):node=self.rootforcharinword:ifcharnotinnode.children:node.children[char]=TrieNode()node=node.children[char]node.is_word=True# Other utility methods for trie operations such as delete.# Example Usaget=Trie()words= ["hello","world","hi","python"]forwordinwords:t.insert(word)
Here is the Python code:
classTrieAutoComplete(Trie):defget_all(self,prefix):node=self.rootforcharinprefix:ifcharinnode.children:node=node.children[char]else:return []returnself._gather_words(node,prefix)def_gather_words(self,node,prefix):results= []ifnode.is_word:results.append(prefix)forchar,child_nodeinnode.children.items():results.extend(self._gather_words(child_node,prefix+char))returnresults# Example Usage for Autocompletiont=TrieAutoComplete()words= ["hello","world","hi","python"]forwordinwords:t.insert(word)print(t.get_all("h"))# Output: ["hello", "hi"]
In the given example, theget_all method uses the prefix to reach the relevant node in the Trie and then uses the private method_gather_words to collect all words that start with the given prefix through a depth-first search.
Tries are advanced data structures, specifically designed for efficient prefix matching. This quality makes them not only powerful but also indispensable in IP routing.
An IPv4 address like192.168.1.1 is commonly represented inbinary using 32 bits, partitioned into four octets. Decimal-to-binary conversion yields:
- 192: 11000000
- 168: 10101000
- 1: 00000001
- 1: 00000001
Concatenating these octets gives the binary form: 11000000101010000000000100000001.
Tries are
ABinary Trie is a specialized kind of trie where each node has either two or no children. This binary arrangement makes them ideal for matching binary IP addresses.
Starting from the root (or root node), you compare each bit of the IP address. A0 bit usually directs the traversal to the left (child node), while a1 bit leads to the right.
When you reach a leaf node, the series of0s and1s you encountered forms the longest matching prefix. Also, if a given node has no children, and it is a leaf node, the path to that node forms theIP Prefix.
Here is the visual representation of how a binary trie looks with three IP addresses.
$1100 0000$ 0000 0000 0000 0000 0000 0001$1100 0000$ 1010 1000 0000 0000 0000 0001$1100 0000$ 10 000000 0000 0000 0000 0000
Longest-prefix matching involves identifying the most specific IP address that matches a given destination address.
Practical uses of longest-prefix matching arise in various network tasks, such as:
- IP address assignments in Network Address Translation (NAT).
- VPN routing tables to find the most suitable forwarding entry in a routing table.
The trie's inherent design for binary strings and its navigational logic of reading incoming bits characterize the steps necessary to accomplish longest-prefix matching using a trie structure.
Initialization: Start From Trienode
rootwhich contains all bits (32-bits long) of the keys. Both keys and Nodes (except leaves) can be thought of as 32-bit long strings, the root thus doesn't represent any key, just the unused bits.**.Comparison: Proceed through the trie by
bitwise ANDof the current node and the next bit in the given IP address.1 -> Right,0 -> Leftare the conditions the comparison affects.Longest Matching Prefix is recorded as the valid IP prefix until the path reaches a leaf node.
ATrie is a powerfultree-like data structure, especially suited for text operations such asspelling correction.
In aspell-checking context, the Trie helps to identify misspelled words by recognizing sequences of letters that do not form valid words.
TheTrie's structure supports spelling-related tasks. Here's how:
- Path Matching: Each node represents a letter. Traversing from the root down a collection of nodes can form a word. This setup aids in comparing input tokens with stored words.
- Comprehensive Coverage: The Trie covers the entire pool of words in its structure. Therefore, it can verify any word against all known words, making it ideal for spell-checking.
The Trie employs techniques such aslevenstein distance andedit distances. These algorithms examine the relationship between words determined by word case, word length, and term frequency (TF-IDF).
Edit distances, also known asLevenshtein distances, represent the number of single-edit steps required to convert one given word into another. Such edits can encompass:
- Substitute: Swapping one letter for another.
- Insert: Placing a new letter within the word.
- Delete: Removing a letter.
- Transpose: Reversing the positions of two adjoining letters.
Implementing these transformations and utilizing the Trie for efficient validation helps correct apparent typos or isolated spelling errors.
Toimplement a trie, various strategies are available:
In this approach, eachtrie node contains anarray where each index corresponds to a character of the alphabet.
For example,0 stands for 'a',1 for 'b', and so on. While simple to understand, this method canconsume more memory, especially if many of the array elements are unused.
Here is the Python code:
classTrieNode:def__init__(self):# Array for 26 letters of the alphabetself.children= [None]*26self.isEndOfWord=False# Initialize the root noderoot=TrieNode()
Here, eachnode corresponds to acharacter, and thechild attribute points to thehead of another linked list, representing the next level of characters.
This approach cansave memory when the number of child nodes is significantly less than the array size of the previous method.
Here is the Python code:
classTrieNode:def__init__(self,key=None,value=None,parent=None,prev=None):self.key=keyself.value=valueself.parent,self.prev,self.next=parent,prev,Noneself.child=None# Initialize the root noderoot=TrieNode()
This technique uses a combination of abitmap and anarray of pointers. The bitmap determines which characters are present at a given node. The array of pointers then has a size equal to the number of set bits in the bitmap.
While this strategy can significantlyreduce the memory footprint, it can add complexity to some operations.
Merging the strengths of the strategies mentioned above can yield a balanced solution. For instance, one could use abitmap to compress the children array in thearray-based approach or combine bitmaps with linked lists.
- Leverage Libraries: There are well-optimized, library-based trie structures that can be advantageous. Using established libraries can streamline the process and enhance efficiency.
- Optimize Based on Use Case: The ideal trie configuration will vary based on its intended application.
The task is to create the code forinserting a word into aTrie data structure.
In aTrie, each node represents a single character of a word. Starting from the root, a path along the nodes denotes a valid word.
- Initialize acurrent node to the root of the Trie.
- For each character in the word, check if the current node has a child node corresponding to that character. If not, create a new node and add it as a child of the current node.
- Move the current node to the child node corresponding to the character and repeat step 2.
- After iterating through the word, mark the current node (which now represents the last character of the word) as anend of the word.
- Time Complexity:
$O(L)$ where$L$ is the length of the word. - Space Complexity:
$O(L)$ since in the worst case, we need to add$L$ new nodes, each for a different character in the word.
Here is the Python code:
classTrieNode:def__init__(self):self.children= {}self.is_end_of_word=FalseclassTrie:def__init__(self):self.root=TrieNode()definsert(self,word):current=self.rootforcharinword:ifcharnotincurrent.children:current.children[char]=TrieNode()current=current.children[char]current.is_end_of_word=True# Usagetrie=Trie()words= ["apple","banana","app","ape","bat"]forwordinwords:trie.insert(word)
The goal is to implement asearch function that can determine whether a particular word is included in aTrie data structure.
The decision to include a word in aTrie structure relies on whether it exists in the dictionary represented by the Trie. In a Trie, the presence of a word is directly associated with the presence of an endmarker ($).
- For Python and JavaScript: the
searchfunction recursively traverses the Trie. - For C++: an iterative approach is often preferred for performance reasons, especially when dealing with very large Tries.
Both methods should identify whether a given word is part of the Trie, hence serving as an efficient spellchecker or dictionary for a set of words.
Time Complexity:
$O(m)$ where$m$ is the length of the key. Each character in the word can be found in its corresponding Trie level, which makes this operation linear.
Space Complexity:
- For Python and JavaScript, the Space Complexity is
$O(m)$ as well, where$m$ is the length of the key, accounting for the call stack during the recursive traversal. - For C++, the Space Complexity can be considered
$O(1)$ as there are no additional data structures influencing it. However, if we include the space taken by critical operations during the search, it will still be$O(m)$ for a particular word.
- For Python and JavaScript, the Space Complexity is
Here is the Python code:
classTrieNode:def__init__(self):self.children= {}self.is_end_word=FalseclassTrie:def__init__(self):self.root=TrieNode()definsert(self,word):node=self.rootforcharinword:ifcharnotinnode.children:node.children[char]=TrieNode()node=node.children[char]node.is_end_word=Truedefsearch(self,word):node=self.rootforcharinword:ifcharnotinnode.children:returnFalsenode=node.children[char]returnnode.is_end_word
Deleting a word in aTrie usually involves setting theis_end_of_word attribute of the last node of the word toFalse. This is necessary, even though the word may not be explicitly present. Consider, for instance, the words "a" and "and" — if "and" is deleted, "a" should remain a word.
In addition to settingis_end_of_word toFalse in the last node of the word, if any nodes becomeredundant (i.e., have no other children or are part of another word), they should be removed from the tree as well.
- Start from the root node.
- For each character
$c$ in the word:- If the child node for
$c$ doesn't exist, the word isn't present in the trie. Return without making any changes. - Navigate to the child node for
$c$ .
- If the child node for
- After reaching the last character of the word, which ends at node
$node_end$ :- If
$node_end$ has no children, traverse back (from the end) and delete nodes with only one child, until the root or a node with multiple children is reached. - Set
$node_end.is_end_of_word$ toFalse.
- If
- Time Complexity:
$O(m)$ where$m$ is the length of the word. We may have to traverse back up the tree to delete nodes, but the maximum number of backtracks is limited by the number of nodes in the trie. - Space Complexity:
$O(1)$
Here is the Python code:
classTrieNode:def__init__(self):self.children= {}self.is_end_of_word=FalseclassTrie:def__init__(self):self.root=TrieNode()definsert(self,word):node=self.rootforcharinword:ifcharnotinnode.children:node.children[char]=TrieNode()node=node.children[char]node.is_end_of_word=Truedefsearch(self,word):node=self.rootforcharinword:ifcharnotinnode.children:returnFalsenode=node.children[char]returnnode.is_end_of_worddefstarts_with(self,prefix):node=self.rootforcharinprefix:ifcharnotinnode.children:returnFalsenode=node.children[char]returnTruedefdelete(self,word):ifnotself.root.children:return# Trie is emptyself._delete_helper(self.root,word,0)def_delete_helper(self,node,word,index):ifindex==len(word):ifnode.is_end_of_word:node.is_end_of_word=Falseifnotnode.children:delnode# Delete redundant nodesreturnnotnode.children# True if node has no childrenreturnFalsechar=word[index]ifcharnotinnode.children:returnFalseshould_delete_current_node=self._delete_helper(node.children[char],word,index+1)ifshould_delete_current_node:delnode.children[char]returnnotnode.children# True if node has no other children after deletionreturnFalse# Exampletrie=Trie()words= ["hello","world","python","programming"]forwordinwords:trie.insert(word)trie.delete("python")print(trie.search("python"))# Output: False
WhileTries are specialized for tasks involving strings and are especially efficient for datasets with shared prefixes,BSTs are versatile, general-purpose trees that can store any ordered data.
- Trie: This is determined by the length of the word/key being looked up. Hence, the time complexity is
$O(m)$ , where$m$ is the length of the key. - BST: Efficient look-ups in balanced BSTs are
$O(\log n)$ , but if the BST becomes skewed, it degrades to$O(n)$ .
- Trie: Insertion and deletion are typically
$O(m)$ , with$m$ being the key's length. - BST: Insertion and deletion are
$O(\log n)$ in a balanced tree. However, in the worst-case scenario (unbalanced tree), these operations can take$O(n)$ time.
- Trie: Often more space-efficient, especially when dealing with datasets having short keys with common prefixes. It can save considerable space by sharing common prefix nodes.
- BST: Every node in the BST requires storage for its key and pointers to its two children. This fixed overhead can make BSTs less space-efficient than tries, especially for large datasets.
- Trie: Excels at operations like longest-prefix matching, making it an ideal choice for applications such as autocompletion, IP routing, and more.
- BST: While not specialized like tries, BSTs are more general-purpose and can handle a wider range of tasks.
- Trie: Inherently balanced, making them relatively low-maintenance. This ensures consistent performance without the need for additional balancing algorithms.
- BST: To maintain efficient operation, BSTs often require balancing using specific algorithms or tree structures like AVL or Red-Black trees. Without periodic balancing, the tree could become skewed, leading to suboptimal performance.
BothHash Tables andTries are data structures used for efficient data storage and retrieval.
Whilehash tables offer fast lookups and unordered data storage,tries are optimized for ordered tasks and text-related functions.
- Definition: Uses a hash function to map keys to unique values for fast retrieval.
- Best For: Unordered data and quick lookups.
- Pros: Fast O(1) lookups, memory efficiency.
- Cons: Lack of ordering, potential for collisions.
- Definition: Organizes keys based on common prefixes and is often used for text-related tasks like auto-completion.
- Best For: Maintaining order and tasks involving text.
- Pros: Efficient prefix matching, ordered retrieval.
- Cons: Memory-intensive.
Here is the Python code:
# Hash Tableword_dict= {"hello":"English greeting","hola":"Spanish greeting"}# TriefromcollectionsimportdefaultdictclassTrieNode:def__init__(self):self.children=defaultdict(TrieNode)self.is_word=Falseroot=TrieNode()words= ["hello","hola"]forwordinwords:node=rootforcharinword:node=node.children[char]node.is_word=True
ASuffix Trie extends the functionality of a regular trie to capture allsuffices of a string. Both Tries are structures well-suited for substrings and searching in text.
Rather than simply storing individual letters, aSuffix Trie captures complete substrings from the input word. Each node in theSuffix Trie represents a substring, starting from the root that denotes the complete word and ending with terminal nodes signifying various suffixes.
In a regular trie, word "BANANA" would be represented as follows:
root | B / \ A R / \ N E / AIn aSuffix Trie for "BANANA," the main root still represents the complete word. However, it has specialized edges, known as suffix links or paths, leading to nodes that correspond tosuffixes of the word:
root / \ A.. ..NA / \ NA B..A / \ ANA NA / \NA A \ / A NALeaf Nodes: In a regular trie, leaves often serve as indicators for complete words. In aSuffix Trie, leaf nodes invariably denote the end of a suffix.
Internal Nodes: These nodes inSuffix Tries hold multiple incoming edges, including those from the root, whereas in regular tries, they typically have a one-to-one edge-to-substring relationship.
Terminating Characters: Regular tries often rely on designated characters like "end of word" flag (e.g.,
'$') to mark the end of words.Suffix Tries, in contrast, determine word boundaries based on incorporating all suffixes of the input.
- Trie: Optimized for rapid prefix-based search, common in tasks such as autocompletion.
- Suffix Trie: Primarily used to facilitate swift pattern matching in texts, particularly in bioinformatics and data compression.
Here is the Python code:
# Regular TrieclassNode:def__init__(self):self.children= {}self.is_end=Falsedefinsert_word(root,word):node=rootforcharinword:ifcharnotinnode.children:node.children[char]=Node()node=node.children[char]node.is_end=True# Suffix TrieclassSuffixNode:def__init__(self):self.children= {}self.suffix_link=Nonedefinsert_suffix(root,suffix):node=rootforcharinsuffix:ifcharnotinnode.children:node.children[char]=SuffixNode()node=node.children[char]# Establishing the suffix link is algorithmically more complex
Explore all 28 answers here 👉Devinterview.io - Trie Data Structure

About
🟣 Trie Data Structure interview questions and answers to help you prepare for your next data structures and algorithms interview in 2025.
Topics
Resources
Uh oh!
There was an error while loading.Please reload this page.
.png%3falt%3dmedia%26amp%3btoken%3db216a7e0-1ec9-4a12-b394-80ae45ffaf5e%26amp%3b_gl%3d1*1h44uz2*_ga*OTYzMjY5NTkwLjE2ODg4NDM4Njg.*_ga_CW55HF8NVT*MTY5NzM2MjgxMi4xNTguMS4xNjk3MzYzMjgyLjEuMC4w&f=jpg&w=240)