- Notifications
You must be signed in to change notification settings - Fork1
Implementations of AVL and 2-3 trees
License
nizamiza/self-balancing-trees
Folders and files
Name | Name | Last commit message | Last commit date | |
---|---|---|---|---|
Repository files navigation
This is part of the university project at FIIT STU, Bratislava, Slovakia. Goalis to compare different self-balancing trees to each other, as well as to theregular Binary Search Tree.
This project comes with a small interaction program -treeio
. It can be usedto manually insert and search nodes, as well aspretty print the whole tree.
Project contains aMakefile
to compiletreeio
program. Default setting isto compile it with AVL tree, but you can change this behavior:
- You can manually specify default behavior in Makefile.
- You can pass
target=[tree_name]
argument tomake
command.
If you want to compile program with different tree, then you also need tochange#include
statement insidetreeio.c
to match tree that you want tocompile. E.g., if you want to compile 2-3 tree, then you would include2_3tree.h
header and run:
$ make target=2_3tree
Tree type | Target name |
---|---|
BST | bst |
AVL | avltree |
2-3 | 2_3tree |
Red-black | rbtree |
Target names are folder names that contain source files of the given tree.
If you wish to run unit tests, you can add scenarios insideassert file ofthe tree and include its header intreeio.c
. You will also be required topass additional argument tomake
command in shell.
For example, if you've includedavltree_assert.h
header intotreeio.c
andwish to run tests. Then you can add function call insidemain
:
/* treeio.c */intmain(void){run_internal_tests();...}
This will run scenarios specified in assert file.
To compile program with tests enabled, you can run shell command:
$ make assert=true
AVL tree, where AVL stands for its creators -Adelson-Velsky and Landis, isa self balancing binary search tree. It is was the first such tree to be invented.
AVL tree hasO(logn)
time complexity for its operations. It is often comparedtored-black tree due to them both being height balanced.
Each node in AVL tree has abalance factor, which is calculated by substractingthe height of the left subtree from the height of the right subtree. I.e.:
B(N) = H(R(N)) - H(L(N))
,
whereB(N)
is balance factor of the nodeN
,R(N)
is the right subtree ofthe nodeN
,L(N)
is the left subtree of the nodeN
, andH(T)
is afunction to calculate height of the subtreeT
.
Binary search tree is defined as an AVL tree, if and only if, each node of thetree has balance factor of0,1, or-1. NodeN
with balance factorof-1 isleft heavy, nodeN
with balance factor of1 isrightheavy, nodeN
with balance factor of0 isbalanced.
This implementation is based onwikipedia article as well asgeeksforgeeks article about AVL tree. This implementation includes onlysearch andinsert operations.
For each node we can define a structure:
struct_node {struct_node*left;struct_node*right;charbfactor;longkey;};
wherebfactor
stands for thebalance factor.
Search routine is the same as for the regular Binary Search Tree, so we willskip it and go right to the insertion.
Insertion operation of the AVL tree is similar to one of regular binary searchtree. Simple recursive traversal will do the trick. Important bit comes afterthe fact of insertion, which is the rebalancing of tree.
Each time we insert a new node, we calculate balance factor of that node, andif it is not-1,0, or1, then we need to rebalance our tree. Thiscan be done with node rotation operations, such asleft rotation, andright rotation.
To left rotate node is to set itsright subtree toleft subtree of itsright subtree, and set itself as theleft subtree of itsright subtree. I.e.:
struct_node*rotate_left(struct_node*node){struct_node*right=node->right;node->right=right->left;right->left=node;set_bfactor(node);returnset_bfactor(right);}
Right rotation is the same process as the left rotation, but instead of settingright subtree of the node, we set itsleft subtree to right subtree of itsleft subtree. And after that, similar to the left rotation, we set the nodeitself as theright subtree of itsleft subtree. I.e.:
struct_node*rotate_right(struct_node*node){struct_node*left=node->left;node->left=left->right;left->right=node;set_bfactor(node);returnset_bfactor(left);}
When new node is inserted to AVL tree, there are4 cases:
- Left left case.
- Left Right case.
- Right Right case.
- Right Left case.
Based on these cases, we can define rebalancing routine as follows:
struct_node*rebalance(struct_node*node,longkey){if (!node)returnNULL;if (node->left&&node->bfactor>1) {if (key<node->left->key)returnrotate_right(node);node->left=rotate_left(node->left);returnrotate_right(node);}if (node->right&&node->bfactor<-1) {if (key>node->right->key)returnrotate_left(node);node->right=rotate_right(node->right);returnrotate_left(node);}returnnode;}
2-3 tree is a self-balancing search tree, where each node (often referred toas the internal node) has either2 children and1 key orthree childrenand2 keys.
Leaves of 2-3 tree are always on the level, meaning that tree is always balanced.Traversal algorithm of 2-3 tree goes as follows:
- Let
T
be a 2–3 tree andd
be the data element we want to find. IfT
isempty, thend
is not inT
and we're done. - Let
r
be the root ofT
. - Suppose
r
is a leaf. Ifd
is not inr
, thend
is not inT
.Otherwise,d
is inT
. In particular,d
can be found at a leaf node.We need no further steps and we're done. - Suppose
r
is a 2-node with left childL
and right childR
. Let e be thedata element inr
. There are three cases:- If
d
is equal to e, then we've foundd
inT
and we're done. - If
d < e
, then setT
toL
, which by definition is a 2–3 tree, andgo back to step 2. - If
d > e
, then setT
toR
and go back to step 2.
- If
- Suppose
r
is a 3-node with left childL
, middle childM
, and rightchildR
. Leta
andb
be the two data elements ofr
, wherea < b
.There are four cases:- If
d
is equal toa
orb
, thend
is inT
and we're done. - If
d < a
, then setT
toL
and go back to step 2. - If
a < d < b
, then setT
toM
and go back to step 2. - If
d > b
, then setT
toR
and go back to step 2.
- If
To insert a new node to the tree, we search for the correct position for thenode with traversal algorithm and insert it there. If the node becomes a 4-node,then we split it into 2 nodes, promoting middle key to the parent node. Thisprocess repeats recursively until we reach the 2-node node, or the root, inwhich case we split the root as well, making a new one out of the middle key.
This implementation is based onpaperwork, which is part of the CS230 DataStructures course at the Princeton University. This implementation includes onlysearch andinsert operations.
For each internal node we can define a structure:
struct_node {boolisfull;intlow_key;inthigh_key;struct_node*left,*middle,*right;struct_node*parent;};
whereisfull
indicates whether node is 3-node (has 2 keys and 3 children).
Search routine for the 2-3 tree is similar to the regular BST, only differencebeing that we have an extra condition for the recursive call when the node isa 3-node, and search value is between keys of the node. I.e.:
struct_node*search(struct_node*node,intkey){if (!node)returnNULL;if (key==node->low_key|| (node->isfull&&key==node->high_key))returnnode;if (key<node->low_key)returnsearch(node->left,key);if (node->right&&key>node->high_key)returnsearch(node->right,key);returnsearch(node->middle,key);}
Insertion is the most complex part of this implementation. Steps are the sameas described in theIntroduction. After traversing the tree and finding a spotfor the new node, we insert the key:
struct_node*_insert_key(struct_node*node,intkey){if (node->isfull) {node=_split(node,key);}else {node=_add_key(node,key);}return_get_root(node);}
If node is not full, then we just compare new key with the existing key andinsert it to the correct position. If new key is a duplicate, then we throwEDUPNODE
(error duplicate node) and simply return unchanged node:
Throw is a bit misleading, since it will only print an error message'
struct_node*_add_key(struct_node*node,intkey){if (key==node->low_key) {throw(EDUPNODE);returnnode;}if (node->low_key<key) {node->high_key=key;}else {node->high_key=node->low_key;node->low_key=key;}return (node->isfull= true,node);}
But if node is full, then we are at the step5 of our insertion algorithm,meaning we have to split our node. To do so, we sort all 3 keys (two keys ofthe node, and the key that is being inserted) and create2 new nodes fromthe smallest and the largest keys. The middle size key gets promoted to theparent:
struct_node*_split(struct_node*node,intkey){if (key==node->low_key||key==node->high_key) {throw(EDUPNODE);returnnode;}int*keys=_sort_keys(node->low_key,node->high_key,key);intprom_key=keys[1];struct_node*left,*middle;left=_node((struct_node) {.isfull= false,.low_key=keys[0],.parent=node->parent});middle=_node((struct_node) {.isfull= false,.low_key=keys[2],.parent=node->parent});free(keys);return_merge_with_parent((struct_merge_args) {.node=node,.left=left,.middle=middle,.prom_key=prom_key});}
After that we need to merge our2 new nodes with the parent node, as wellas add the promoted key to its list of keys. This step can be divided into3 scenarios:
- Node does not have a parent.
- Parent node is not full (2-node).
- Parent node is full (3-node).
In thefirst case, we simply return a new node, which has a single key(promoted key) and two children (newly made 2 nodes).
In thesecond case, we add promoted key to the parent and attach2 newchildren alongside his single child.
In thethird case, we split the parent itself and promote its middle keyto its parent.
As expectedly, this process repeats recursively until we reach the first 2-nodenode, or the root.
struct_node*_merge_with_parent(struct_merge_argsargs){__DESTRUCT_MERGE_ARGS__;if (!node->parent) {left->parent=middle->parent=node;return_update_node(node, (struct_node) {.isfull= false,.low_key=prom_key,.left=left,.middle=middle});}struct_node*parent=node->parent;free(node);if (!parent->isfull) {if (parent->low_key<prom_key) {parent->high_key=prom_key;parent->middle=left;parent->right=middle;}else {parent->high_key=parent->low_key;parent->low_key=prom_key;parent->left=left;parent->right=parent->middle;parent->middle=middle;}return (parent->isfull= true,parent);}intmerge_prom_key;struct_node*parent_left,*parent_middle;if (prom_key<parent->low_key) {parent_left=_node((struct_node) {.isfull= false,.low_key=prom_key,.left=left,.middle=middle,.parent=parent->parent});parent_middle=_node((struct_node) {.isfull= false,.low_key=parent->high_key,.left=parent->middle,.middle=parent->right,.parent=parent->parent});merge_prom_key=parent->low_key;}elseif (prom_key>parent->high_key) {parent_left=_node((struct_node) {.isfull= false,.low_key=parent->low_key,.left=parent->left,.middle=parent->middle,.parent=parent->parent});parent_middle=_node((struct_node) {.isfull= false,.low_key=prom_key,.left=left,.middle=middle,.parent=parent->parent});merge_prom_key=parent->high_key;}else {parent_left=_node((struct_node) {.isfull= false,.low_key=parent->low_key,.left=parent->left,.middle=left,.parent=parent->parent});parent_middle=_node((struct_node) {.isfull= false,.low_key=parent->high_key,.left=middle,.middle=parent->right,.parent=parent->parent});merge_prom_key=prom_key;}return_merge_with_parent((struct_merge_args) {.node=parent,.left=parent_left,.middle=parent_middle,.prom_key=merge_prom_key});}
Red-black tree is a self-balancing binary search tree, where each node iscolored eitherred or black:
- Root of the red-black tree is always black.
- Each leaf is colored black.
- If node is red, then both of its children are black.
- Every path from a given node to any of its descendant NIL nodes goes throughthe same number of black nodes.
Like AVL tree, Red-black tree is height balanced. The balancing rule is:
The path from the root to the farthest leaf is no more than twice as long asthe path from the root to the nearest leaf
This implementation was taken fromwww.programiz.com and it serves only forcomparison purposes. Although, several alterations to the source code had to bemade. These were primarily to adapt code for the testing and comparison.Important, these changes did not affect unerlying logic nor performance ofthe implementation:
- global variable
root
was removed; insertion
function was renamed toinsert
, and was tweaked to work withoutglobal variable;deletion
function was renamed todelete
and tweaked to work without globalvariable (although this function iss never used);- additional
search
function was implemented; - additional
print_node
function was implemented; - additional
print
function was implemented.
This implementation defined an internal node as a structure:
structrbNode {intdata,color;structrbNode*link[2];};
All three implementations were tested for time efficiency during insertion andsearch with set of keys listed in therandnumbers
file in thedata
directoryof the project. Results were following:
Tree type | Average insertion time | Average search time | Node size |
---|---|---|---|
BST | ~0.000060s | ~0.000002s | 24 bytes |
AVL | ~0.000160s | ~0.000001s | 24 bytes |
2-3 | ~0.000050s | ~0.000001s | 48 bytes |
Red-black | ~0.000052s | ~0.000002s | 24 bytes |
As we can see, thefastest insertions were in2-3 and Red-black trees,although 2-3 tree requires twice the size of memory for each node compared tored-black or the AVL trees.
Slowest insertions were in the AVL tree, but in comparison to the red-blacktree, AVL has better height balance, thus look up operations are faster insideAVL tree.
Nearly each function of the implementations was tested with unit tests (exceptfor the Red-black tree and the regular BST). Test functions including some testscenarios can be found in the same directory of the source code of theimplementation. I.e., tests for the 2-3 tree implementation, are located in2_3tree/2_3tree_assert.c
.
- en.wikipedia.org, AVL tree
- www.geeksforgeeks.org, AVL tree | Set 1 (Insertion)
- en.wikipedia.org, 2-3 tree
- CS230 Data Structures, Prof. Lyn Turbak, 2-3 Trees
- en.wikipedia.org, Red-black tree
- www.programiz.com, Red-black tree
2020, FIIT STU, Bratislava, Slovakia.
About
Implementations of AVL and 2-3 trees
Topics
Resources
License
Uh oh!
There was an error while loading.Please reload this page.