11
\$\begingroup\$

Introduction

This is yet another data structure I'm going over again for the algorithms course. This time it is binary search tree.

Implemented operations:

  • Insert

  • Exists

  • Remove

  • Clear

  • Move constructor and assignment, destructor

There are some tests for remove function below the data structure itself. I tested other functions, but the tests got overridden in the interim. They did pass though. I believe automated ones are not possible until I make the tree somehow traversable, but that's adventure for another time.

Concerns

  • Extreme code duplication

    The 4 case functions (nullptr, equal, less, greater) share the same control flow statements. I believe abstracting that away would make it worse though.

  • Extremely complicated remove function

    This one took around an hour of my time to get correct, from starting to write it to debugging the three cases I found and tested for.

  • Overall just feels bad

    It just gives that feeling. Or may be most of the algorithms I've seen are much more elegant than this.


Code

#include <stdexcept>#include <type_traits>#include <ostream>#include <utility>template <typename ValueType>class binary_search_tree{    struct node    {        const ValueType value;        node* left;        node* right;    };    enum class direction    {        is_root,        left,        right    };    struct search_result    {        node* parent;        node* target_child;        direction parent_to_child;    };    node* root;public:    binary_search_tree() :            root(nullptr)    {}    binary_search_tree(const binary_search_tree& other) = delete;    binary_search_tree& operator=(const binary_search_tree& other) = delete;    binary_search_tree(binary_search_tree&& other) :            root(std::exchange(other.root, nullptr))    {}    binary_search_tree& operator=(binary_search_tree&& other) noexcept    {        std::swap(root, other.root);        return *this;    }    bool try_insert(const ValueType& value)    {        return try_insert_helper(value, root);    }    bool exists(const ValueType& value)    {        return find_node(value, nullptr, root, direction::is_root).target_child != nullptr;    }    bool delete_if_exists(const ValueType& value)    {        auto [parent_node, node_with_value, parent_to_child] =                find_node(value, nullptr, root, direction::is_root);        if (node_with_value == nullptr)            return false;        if (node_with_value->left == nullptr)        {            auto old = node_with_value;            switch (parent_to_child)            {                case direction::left:                    parent_node->left = node_with_value->left;                    break;                case direction::right:                    parent_node->right = node_with_value->right;                    break;                case direction::is_root:                    root = root->right;            }            delete old;            return true;        }        if (node_with_value->left->right == nullptr)        {            switch (parent_to_child)            {                case direction::left:                    parent_node->left = node_with_value->right;                    node_with_value->right->left = node_with_value->left;                    break;                case direction::right:                    parent_node->right = node_with_value->right;                    node_with_value->right->left = node_with_value->left;                    break;                case direction::is_root:                    root->left->right = root->right;                    root = root->left;            }            delete node_with_value;            return true;        }        auto [suitable_parent, suitable_node] =                find_suitable_node(node_with_value->left->right, node_with_value->left);        switch (parent_to_child)        {            case direction::left:                parent_node->left = suitable_node;                suitable_node->right = node_with_value->right;                suitable_node->left = node_with_value->left;                break;            case direction::right:                parent_node->right = suitable_node;                suitable_node->right = node_with_value->right;                suitable_node->left = node_with_value->left;                break;            case direction::is_root:                suitable_node->right = root->right;                suitable_node->left = root->left;                root = suitable_node;        }        suitable_parent->right = nullptr;        delete node_with_value;        return true;    }    void clear()    {        clear_helper(root);    }    void inorder_print(std::ostream& os)    {        if (root == nullptr)            return;        inorder_print_helper(os, root);    }    ~binary_search_tree()    {        clear();    }private:    std::pair<node*, node*> find_suitable_node(node* start_position, node* parent)    {        if (start_position->right == nullptr)            return {parent, start_position};        return find_suitable_node(start_position->right, start_position);    }    void clear_helper(node* start_position)    {        if (start_position == nullptr)            return;        clear_helper(start_position->left);        clear_helper(start_position->right);        delete start_position;    }    search_result find_node(const ValueType& value,                            node* parent,                            node* current_node,                            direction parent_to_child)    {        if (current_node == nullptr)            return {nullptr, nullptr, direction::is_root};        if (current_node->value == value)            return {parent, current_node, parent_to_child};        if (value < current_node->value)            return find_node(value, current_node, current_node->left, direction::left);        else            return find_node(value, current_node, current_node->right, direction::right);    }    bool exists_helper(const ValueType& value,                       node* current_node)    {        if (current_node == nullptr)            return false;        if (current_node->value == value)            return true;        if (value < current_node->value)            return exists_helper(value, current_node->left);        else            return exists_helper(value, current_node->right);    }    void inorder_print_helper(std::ostream& os,                              node*& current_node)    {        if (current_node == nullptr)            return;        inorder_print_helper(os, current_node->left);        os << current_node->value << ' ';        inorder_print_helper(os, current_node->right);    }    bool try_insert_helper(const ValueType& value,                           node*& current_node)    {        if (current_node == nullptr)        {            current_node = new node{value};            return true;        }        if (current_node->value == value)            return false;        if (current_node->value > value)            return try_insert_helper(value, current_node->left);        else            return try_insert_helper(value, current_node->right);    }};#include <iostream>#include <sstream>void test_remove_case_one(){    binary_search_tree<int> tree;    tree.try_insert(2);    tree.try_insert(3);    tree.try_insert(1);    tree.try_insert(4);    tree.try_insert(-2);    tree.try_insert(0);    tree.delete_if_exists(3);    std::ostringstream oss;    tree.inorder_print(oss);    if (oss.str() != "-2 0 1 2 4 ")        throw std::logic_error("remove case one fails");}void test_remove_case_two(){    binary_search_tree<int> tree;    tree.try_insert(4);    tree.try_insert(7);    tree.try_insert(11);    tree.try_insert(1);    tree.try_insert(-2);    tree.try_insert(0);    tree.delete_if_exists(4);    std::ostringstream oss;    tree.inorder_print(oss);    if (oss.str() != "-2 0 1 7 11 ")        throw std::logic_error("remove case two fails");}//almost like case 2, but has three added in itvoid test_remove_case_three(){    binary_search_tree<int> tree;    tree.try_insert(4);    tree.try_insert(7);    tree.try_insert(11);    tree.try_insert(1);    tree.try_insert(-2);    tree.try_insert(0);    tree.try_insert(3);    tree.delete_if_exists(4);    std::ostringstream oss;    tree.inorder_print(oss);    if (oss.str() != "-2 0 1 3 7 11 ")        throw std::logic_error("remove case two fails");}int main(){    std::cout << "running remove case 1...\n";    test_remove_case_one();    std::cout << "remove case 1 passed successfuly\n";    std::cout << "running remove case 2...\n";    test_remove_case_two();    std::cout << "remove case 2 passed successfuly\n";    std::cout << "running remove case 3...\n";    test_remove_case_three();    std::cout << "remove case 3 passed successfuly\n";}

Explanations

I believe the implementation guidelines are very important, so I decided to keep them here, as the safest place to keep notes for me is SE posts (I know it is quite weird). I included pictures for remove, as it is not as straightforward as others (pictures go over the same examples as in the code).

insert

Quite easy. Launch a recursion. If the current node isnullptr, insert at this node and return true (keep in mind that all pointers are passed by reference, thus the change will be reflected in the data structure itself. Also they always exist, no dangling references). If the value to-be-inserted is less than value in the node (IN THIS ORDER!), search right location to insert in the left subtree. If greater, search the location in the right subtree. If the value is equal, then return false.

exists

Almost the same like insertion, but the true/false cases are reversed. If value is equal to the value in the node, return true. If node isnullptr, return true (nowhere else to search).

remove

While searching for the node-to-remove, return these important three values: parent node of the target node, target node itself, the direction from parent to target child. The suitable child to replace with the to-be-removed node is the one that is greater than left child and less than right child. If the value is in the tree, there are 3 cases (others are covered by these):

  • to-be-removed node doesn't have left child (doesn't have suitable child)

Easy case. Relink parent of the to-be-removed node to the right child of the to-be-removed node (keep in mind DIRECTION!). Delete the to-be-removed node. Don't forget to update root if the to-be-removed node is root.

case 1

  • to-be-removed node has left child, but the child doesn't have right child (has suitable child which is left child of to-be-removed node)

Somewhat easy too. Relink parent to the suitable child, change the right child of suitable child to the right child of to-be-removed node. Delete to-be-removed node. Update root if it is affected.

case 2

  • to-be-removed node has suitable child which is far (> 1 depth) away

Find the rightmost child of the left child of to-be-removed node. Make sure the parent of the suitable child is no longer linked to the suitable child. Relink parent of the to-be-removed node to the suitable child. Relink left and right of the to-be-removed node the left and right of the suitable child, respectively. Delete to-be-removed node.

case 3_1

enter image description here

clear

If current node isnullptr, return. Go to the left, then to the right. Finally delete the current node.

askedJun 3, 2018 at 21:30
Incomputable's user avatar
\$\endgroup\$
3
  • \$\begingroup\$Let me know if pictures are too big. I'll try to do something about it tomorrow. Sorry for the terrible handwriting.\$\endgroup\$CommentedJun 3, 2018 at 22:13
  • 1
    \$\begingroup\$I was actually going to comment on hownice your handwriting is!\$\endgroup\$CommentedJun 3, 2018 at 23:46
  • \$\begingroup\$@erip, thanks! I believe I should work on the style consistency (e.g. same height of lower case letters and same width of the same letters). I've got 20 more lectures to go through on youtube, so I'll probably be taking more notes.\$\endgroup\$CommentedJun 3, 2018 at 23:56

2 Answers2

5
\$\begingroup\$

The first of your problems, namely code duplication is stemmed from the wrong interface.exists_helper returning a boolean violates a very important (and unfortunately little known) principle: do not throw away the information you computed. In case ofexists_helper computed is an insertion point for the value you did not find. Returningthat makesinsert_helper, and thefind, anddelete thin wrappers aroundexists_helper.

Along the same line,insert shall return an indication of success/failurealong with the node itself, STL style. It doesn't really matter in case of integers, but for more complexValueType we are usually interested in what did prevent insertion.


I don't approve recursive nature ofexists andinsert. They are tail recursive, and better be written as iterative.


Regarding deletion, I think you over engineered it. Consider a pseudocode:

    if left child exists,        find the rightmost descendant        rightmost descendant's->right = right child        return left child as new root    else        return right child as new root

Note that yourcase 2 doesn't need to be special cased: left child itself is the rightmost descendant.

Incomputable's user avatar
Incomputable
9,7223 gold badges34 silver badges73 bronze badges
answeredJun 3, 2018 at 23:05
vnp's user avatar
\$\endgroup\$
2
  • \$\begingroup\$I remember the principle being mentioned by Marshall Clow on one of the CppCon talks. I agree that I could make some functions iterative. I'll add input iterator support, lower_bound and upper_bound and then will post again with the suggestions from this post.\$\endgroup\$CommentedJun 4, 2018 at 0:26
  • \$\begingroup\$I once wrote a tail-recursive algorithm for binary trees, then rewrote it as an iterative algorithm. The recursive version was faster. Maybe it’s me... :)\$\endgroup\$CommentedJun 4, 2018 at 1:33
4
\$\begingroup\$
binary_search_tree() :        root(nullptr){}

Use default initialization of the members, in-line in the class. If you have:

node* root = nullptr;

then your default constructor will be generated automatically.

enum class direction{    is_root,    left,    right};

Last time I implemented a type of binary tree (the outer layer of a “T Tree”, with auto-balancing inserts and deletes) I took advantage of thesymmetry between left and right to not write all the code twice. Instead of separately namedleft andright child nodes, I made an array of 2 children, so I havechild[0] andchild[1].

All the logic that deals with left vs. right is mirrored with the key less vs greater. So I implement it once using the abstract S and D instead of Left and Right, and switch which is which in the key comparison. That is, on the less-than case, S is left and D is right; on the greater-than case, the other way around. Since their values are 0 and 1, the conditional code just sets S and then D is always 1-S.child[S] will be either left or right, depending on the choice. Follow me?

Although, it is the balancing code that is most of the work (and this eliminated many test cases too since it was all the same). If you do AVL tree or the like next, keep that in mind!


parent_to_child is unusual. It means you need the follow-up switch statement indelete_if_exists and the extra state indirection (or maybe the whole enum at all, in your case). I think it is more normal to return a pointer to the parent as well as a pointer to the found node (if any). That also works when you do not have aparent pointer in each node (the purest form of binary tree does not).


    if (node_with_value == nullptr)    if (node_with_value->left == nullptr)

Don’t compare directly againstnullptr. Use the contextual conversion to bool as a truth value that is meant for this purpose. This is especially when you start using smart pointers, as they implement an efficientoperator bool rather than needing to do a comparison directly.

answeredJun 3, 2018 at 23:03
JDługosz's user avatar
\$\endgroup\$

You mustlog in to answer this question.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.