Movatterモバイル変換

[0]ホーム

Jump to content

Weight-balanced tree

Edit links

From Wikipedia, the free encyclopedia

Self-balancing binary search tree

For other uses, seeOptimal binary search tree.

Weight-balanced tree

Type

Tree

Invented

1972

Invented by

de:Jürg Nievergelt andEdward Reingold

Complexities inbig O notation

Space complexity
Space	$O(n)$
Time complexity
Function	Amortized	Worst case
Search	$O(\log n)$	$O(\log n)$
Insert	$O(\log n)$	$O(\log n)$
Delete	$O(\log n)$	$O(\log n)$

Incomputer science,weight-balanced binary trees (WBTs) are a type ofself-balancing binary search trees that can be used to implementdynamic sets,dictionaries (maps) and sequences.^[1] These trees were introduced by Nievergelt and Reingold in the 1970s astrees of bounded balance, orBB[α] trees.^[2]^[3] Their more common name is due toKnuth.^[4]

A well known example is aHuffman coding of acorpus.

Like other self-balancing trees, WBTs store bookkeeping information pertaining to balance in their nodes and performrotations to restore balance when it is disturbed by insertion or deletion operations. Specifically, each node stores the size of the subtree rooted at the node, and the sizes of left and right subtrees are kept within some factor of each other. Unlike the balance information inAVL trees (using information about the height of subtrees) andred–black trees (which store a fictional "color" bit), the bookkeeping information in a WBT is an actually useful property for applications: the number of elements in a tree is equal to the size of its root, and the size information is exactly the information needed to implement the operations of anorder statistic tree, viz., getting then'th largest element in a set or determining an element's index in sorted order.^[5]

Weight-balanced trees are popular in thefunctional programming community and are used to implement sets and maps inMIT Scheme,SLIB,SML-NJ, and implementations ofHaskell.^[6]^[4]

Description

[edit]

A weight-balanced tree is a binary search tree that stores the sizes of subtrees in the nodes. That is, a node has fields

key, of any ordered type
value (optional, only for mappings)
left,right, pointer to node
size, of type integer.

By definition, the size of a leaf (typically represented by anil pointer) is zero. The size of an internal node is the sum of sizes of its two children, plus one: (size[n] = size[n.left] + size[n.right] + 1). Based on the size, one defines the weight to beweight[n] = size[n] + 1.^[a] Weight has the advantage that the weight of a node is simply the sum of the weights of its left and right children.

Operations that modify the tree must make sure that the weight of the left and right subtrees of every node remain within some factorα of each other, using the samerebalancing operations used inAVL trees: rotations and double rotations. Formally, node balance is defined as follows:

A node isα-weight-balanced ifweight[n.left] ≥ α·weight[n] andweight[n.right] ≥ α·weight[n].^[7]

Here,α is a numerical parameter to be determined when implementing weight balanced trees. Larger values ofα produce "more balanced" trees, but not all values ofα are appropriate; Nievergelt and Reingold proved that

\alpha <1-{\frac {\sqrt {2}}{2}}\approx 0.29289

is a necessary condition for the balancing algorithm to work. Later work showed a lower bound of2⁄11 forα, although it can be made arbitrarily small if a custom (and more complicated) rebalancing algorithm is used.^[7]

Applying balancing correctly guarantees a tree ofn elements will have height^[7]

h\leq \log _{\frac {1}{1-\alpha }}n={\frac {\log _{2}n}{\log _{2}\left({\frac {1}{1-\alpha }}\right)}}=O(\log n)

Ifα is given its maximum allowed value, the worst-case height of a weight-balanced tree is the same as that of a red–black tree at $2\log _{2}n$ .

The number of balancing operations required in a sequence ofn insertions and deletions is linear inn, i.e., balancing takes a constant amount of overhead in anamortized sense.^[8]

While maintaining a tree with the minimum search cost requires four kinds of double rotations (LL, LR, RL, RR as in an AVL tree) in insert/delete operations, if we desire only logarithmic performance, LR and RL are the only rotations required in a single top-down pass.^[9]

Set operations and bulk operations

[edit]

Several set operations have been defined on weight-balanced trees:union,intersection andset difference. Then fastbulk operations on insertions or deletions can be implemented based on these set functions. These set operations rely on two helper operations,Split andJoin. With the new operations, the implementation of weight-balanced trees can be more efficient and highly-parallelizable.^[10]^[11]

Join: The functionJoin is on two weight-balanced treest₁ andt₂ and a keyk and will return a tree containing all elements int₁,t₂ as well ask. It requiresk to be greater than all keys int₁ and smaller than all keys int₂. If the two trees have the balanced weight,Join simply create a new node with left subtreet₁, rootk and right subtreet₂. Suppose thatt₁ has heavier weight thant₂ (the other case is symmetric).Join follows the right spine oft₁ until a nodec which is balanced witht₂. At this point a new node with left childc, root k and right childt₂ is created to replace c. The new node may invalidate the weight-balanced invariant. This can be fixed with a single or a double rotation assuming $\alpha <1-{\frac {1}{\sqrt {2}}}$
Split: To split a weight-balanced tree into two smaller trees, those smaller than keyx, and those larger than keyx, first draw a path from the root by insertingx into the tree. After this insertion, all values less thanx will be found on the left of the path, and all values greater thanx will be found on the right. By applyingJoin, all the subtrees on the left side are merged bottom-up using keys on the path as intermediate nodes from bottom to top to form the left tree, and the right part is symmetric. For some applications,Split also returns a Boolean value denoting ifx appears in the tree. The cost ofSplit is $O(\log n)$ , order of the height of the tree. This algorithm actually has nothing to do with any special properties of a weight-balanced tree, and thus is generic to other balancing schemes such asAVL trees.

The join algorithm is as follows:

function joinRightWB(T_L, k, T_R)    (l, k', c) = expose(T_L)if balance(|T_L|, |T_R|)return Node(T_L, k, T_R)else         T' = joinRightWB(c, k, T_R)        (l', k', r') = expose(T')if (balance(|l|,|T'|))return Node(l, k', T')else if (balance(|l|,|l'|) and balance(|l|+|l'|,|r'|))return rotateLeft(Node(l, k', T'))elsereturn rotateLeft(Node(l, k', rotateRight(T'))function joinLeftWB(T_L, k, T_R)    /* symmetric to joinRightWB */function join(T_L, k, T_R)if (heavy(T_L, T_R))return joinRightWB(T_L, k, T_R)if (heavy(T_R, T_L))return joinLeftWB(T_L, k, T_R)    Node(T_L, k, T_R)

Here balance $(x,y)$ means two weightsT andT are balanced. expose(v)=(l, k, r) means to extract a tree nodeT's left childT, the key of the nodeT and the right childT. Node(l, k, r) means to create a node of left childT, keyT and right childT.

The split algorithm is as follows:

function split(T, k)if (T = nil)return (nil, false, nil)    (L, (m, c), R) = expose(T)if (k = m)return (L, true, R)if (k < m)        (L', b, R') = split(L, k)return (L', b, join(R', m, R))if (k > m)        (L', b, R') = split(R, k)return (join(L, m, L'), b, R))

The union of two weight-balanced treest₁ andt₂ representing setsA andB, is a weight-balanced treet that representsA ∪B. The following recursive function computes this union:

function union(t₁, t₂):if t₁ = nil:return t₂if t₂ = nil:return t₁    t_<, t_> ← split t₂ on t₁.rootreturn join(union(left(t₁), t_<), t₁.root, union(right(t₁), t_>))

Here,Split is presumed to return two trees: one holding the keys less than its input key, the other holding the greater keys. (The algorithm isnon-destructive, but an in-place destructive version exists as well.)

The algorithm for intersection or difference is similar, but requires theJoin2 helper routine that is the same asJoin but without the middle key. Based on the new functions for union, intersection or difference, either one key or multiple keys can be inserted to or deleted from the weight-balanced tree. SinceSplit andUnion callJoin but do not deal with the balancing criteria of weight-balanced trees directly, such an implementation is usually called thejoin-based algorithms.

The complexity of each of union, intersection and difference is $O\left(m\log \left({n \over m}+1\right)\right)$ for two weight-balanced trees of sizesT and $n(\geq m)$ . This complexity is optimal in terms of the number of comparisons. More importantly, since the recursive calls to union, intersection or difference are independent of each other, they can be executedin parallel with aparallel depth $O(\log m\log n)$ .^[10] When $m=1$ , the join-based implementation has the same computationaldirected acyclic graph (DAG) as single-element insertion and deletion if the root of the larger tree is used to split the smaller tree.

Notes

[edit]

^This is the definition used by Nievergelt and Reingold. Adams uses the size as the weight directly,^[6] which complicates analysis of his variant and has led to bugs in major implementations.^[4]

References

[edit]

^Tsakalidis, A. K. (1984). "Maintaining order in a generalized linked list".Acta Informatica.21:101–112.doi:10.1007/BF00289142.S2CID 26127563.
^Nievergelt, J.; Reingold, E. M. (1973). "Binary Search Trees of Bounded Balance".SIAM Journal on Computing.2:33–43.doi:10.1137/0202005.
^ This article incorporatespublic domain material fromPaul E. Black."BB(α) tree".Dictionary of Algorithms and Data Structures.NIST.
^^a ^b ^cHirai, Y.; Yamamoto, K. (2011)."Balancing weight-balanced trees"(PDF).Journal of Functional Programming.21 (3): 287.doi:10.1017/S0956796811000104.
^Roura, Salvador (2001).A new method for balancing binary search trees.ICALP. Lecture Notes in Computer Science. Vol. 2076. pp. 469–480.doi:10.1007/3-540-48224-5_39.ISBN 978-3-540-42287-7.
^^a ^bAdams, Stephen (1993)."Functional Pearls: Efficient sets—a balancing act".Journal of Functional Programming.3 (4):553–561.doi:10.1017/S0956796800000885.
^^a ^b ^cBrass, Peter (2008).Advanced Data Structures. Cambridge University Press. pp. 61–71.
^Blum, Norbert; Mehlhorn, Kurt (1980)."On the average number of rebalancing operations in weight-balanced trees"(PDF).Theoretical Computer Science.11 (3):303–320.doi:10.1016/0304-3975(80)90018-3.
^Cho, Seonghun; Sahni, Sartaj (2000)."A New Weight Balanced Binary Search Tree".International Journal of Foundations of Computer Science.11 (3):485–513.CiteSeerX 10.1.1.36.3888.doi:10.1142/S0129054100000296.
^^a ^bBlelloch, Guy E.; Ferizovic, Daniel; Sun, Yihan (2016), "Just Join for Parallel Ordered Sets",Symposium on Parallel Algorithms and Architectures, Proc. of 28th ACM Symp. Parallel Algorithms and Architectures (SPAA 2016), ACM, pp. 253–264,arXiv:1602.02120,doi:10.1145/2935764.2935768,ISBN 978-1-4503-4210-0,S2CID 2897793.
^Adams, Stephen (1992),Implementing sets efficiently in a functional language,CiteSeerX 10.1.1.501.8427.

v t e Tree data structures
Search trees (dynamic sets, associative arrays)	2–3 2–3–4 AA (a,b) AVL B K-Dimensional B+ B* B^x Binary search Optimal Self-balancing Dancing HTree Interval Order statistic Palindrome (Left-leaning) Red–black Scapegoat Splay T Treap UB Weight-balanced
Heaps	Binary Binomial Brodal d-ary Fibonacci Leftist Pairing Skew binomial Skew van Emde Boas Weak
Tries	Ctrie C-trie (compressed ADT) Hash Radix Suffix Ternary search X-fast Y-fast
Spatial data partitioning trees	Ball BK BSP Cartesian Hilbert R k-d (implicitk-d) M Metric MVP Octree PH Priority R Quad R R+ R* Segment VP X
Other trees	Cover Exponential Fenwick Finger Fractal index Fusion Hash calendar iDistance K-ary Left-child right-sibling Link/cut Log-structured merge Merkle PQ Range SPQR Top

Retrieved from "https://en.wikipedia.org/w/index.php?title=Weight-balanced_tree&oldid=1298426436"

Category:

Search trees

Hidden categories:

[8]ページ先頭