Movatterモバイル変換


[0]ホーム

URL:


Jump to content
WikipediaThe Free Encyclopedia
Search

Garsia–Wachs algorithm

From Wikipedia, the free encyclopedia

TheGarsia–Wachs algorithm is an efficient method for computers to constructoptimal binary search trees andalphabetic Huffman codes, inlinearithmic time. It is named afterAdriano Garsia andMichelle L. Wachs.

Problem description

[edit]

The input to the problem, for an integern{\displaystyle n}, consists of a sequence ofn+1{\displaystyle n+1} non-negative weightsw0,w1,,wn{\displaystyle w_{0},w_{1},\dots ,w_{n}}. The output is a rootedbinary tree withn{\displaystyle n} internal nodes, each having exactly two children. Such a tree has exactlyn+1{\displaystyle n+1} leaf nodes, which can be identified (in the order given by the binary tree) with then+1{\displaystyle n+1} input weights.The goal of the problem is to find a tree, among all of the possible trees withn{\displaystyle n} internal nodes, that minimizes the weighted sum of theexternal path lengths. These path lengths are the numbers of steps from the root to each leaf. They are multiplied by the weight of the leaf and then summed to give the quality of the overall tree.[1]

This problem can be interpreted as a problem of constructing abinary search treeforn{\displaystyle n} ordered keys, with the assumption that the tree will be used only to search for values that are not already in the tree. In this case, then{\displaystyle n} keys partition the space of search values inton+1{\displaystyle n+1} intervals, and the weight of one of these intervals can be taken as the probability of searching for a value that lands in that interval. The weighted sum of external path lengths controls theexpected time for searching the tree.[1]

Alternatively, the output of the problem can be used as aHuffman code, a method for encodingn+1{\displaystyle n+1} given values unambiguously by using variable-length sequences ofbinary values. In this interpretation, the code for a value is given by the sequence of left and right steps from a parent to the child on the path from the root to a leaf in the tree (e.g. with 0 for left and 1 for right). Unlike standard Huffman codes, the ones constructed in this way arealphabetical, meaning that the sorted order of these binary codes is the same as the input ordering of the values. If the weight of a value is its frequency in a message to be encoded, then the output of the Garsia–Wachs algorithm is the alphabetical Huffman code thatcompresses the message to the shortest possible length.[1]

Algorithm

[edit]
The binary tree constructed in the first phase of the algorithm by finding and merging out-of-order triples in the input sequence (left) and the output of the algorithm, a correctly-ordered binary tree whose leaves are at the same levels as the other tree

Overall, the algorithm consists of three phases:[1]

  1. Build a binary tree having the values as leaves but possibly in the wrong order.
  2. Compute each leaf's distance from the root in the resulting tree.
  3. Build another binary tree with the leaves at the same distances but in the correct order.

The first phase of the algorithm is easier to describe if the input is augmented with twosentinel values,{\displaystyle \infty } (or any sufficiently large finite value) at the start and end of the sequence.[2]

The first phase maintains a forest of trees, initially a single-node tree for each non-sentinel input weight, which will eventually become the binary tree that it constructs. Each tree is associated with a value, the sum of the weights of its leavesmakes a tree node for each non-sentinel input weight. The algorithm maintains a sequence of these values, with the two sentinel values at each end. The initial sequence is just the order in which the leaf weights were given as input.It then repeatedly performs the following steps, each of which reduces the length of the input sequence, until there is only one tree containing all the leaves:[1]

To implement this phase efficiently, the algorithm can maintain its current sequence of values in anyself-balancing binary search tree structure. Such a structure allows the removal ofx{\displaystyle x} andy{\displaystyle y}, and the reinsertion of their new parent, in logarithmic time. In each step, the weights up toy{\displaystyle y} in the even positions of the array form a decreasing sequence, and the weights in the odd positions form another decreasing sequence. Therefore, the position to reinsertx+y{\displaystyle x+y} may be found in logarithmic time by using the balanced tree to perform twobinary searches, one for each of these two decreasing sequences. The search for the first position for whichxz{\displaystyle x\leq z} can be performed in linear total time by using asequential search that begins at thez{\displaystyle z} from the previous triple.[1]

It is nontrivial to prove that, in the third phase of the algorithm, another tree with the same distances exists and that this tree provides the optimal solution to the problem. But assuming this to be true, the second and third phases of the algorithm are straightforward to implement in linear time. Therefore, the total time for the algorithm, on an input of lengthn{\displaystyle n}, isO(nlogn){\displaystyle O(n\log n)}.

History

[edit]

The Garsia–Wachs algorithm is named afterAdriano Garsia andMichelle L. Wachs, who published it in 1977.[1][3] Their algorithm simplified an earlier method ofT. C. Hu andAlan Tucker,[1][4] and (although it is different in internal details) it ends up making the same comparisons in the same order as the Hu–Tucker algorithm.[5] The original proof of correctness of the Garsia–Wachs algorithm was complicated, and was later simplified byKingston (1988)[1][2]andKarpinski, Larmore & Rytter (1997).[6]

References

[edit]
  1. ^abcdefghiKnuth, Donald E. (1998), "Algorithm G (Garsia–Wachs algorithm for optimum binary trees)",The Art of Computer Programming, Vol. 3: Sorting and Searching (2nd ed.), Addison–Wesley, pp. 451–453. See also History and bibliography, pp. 453–454.
  2. ^abKingston, Jeffrey H. (1988), "A new proof of the Garsia–Wachs algorithm",Journal of Algorithms,9 (1):129–136,doi:10.1016/0196-6774(88)90009-0,MR 0925602
  3. ^Garsia, Adriano M.;Wachs, Michelle L. (1977), "A new algorithm for minimum cost binary trees",SIAM Journal on Computing,6 (4):622–642,doi:10.1137/0206045,MR 0520738
  4. ^Hu, T. C.;Tucker, A. C. (1971), "Optimal computer search trees and variable-length alphabetical codes",SIAM Journal on Applied Mathematics,21 (4):514–532,doi:10.1137/0121057,MR 0304063
  5. ^Mehlhorn, Kurt; Tsagarakis, Marcos (1979), "On the isomorphism of two algorithms: Hu/Tucker and Garsia/Wachs",Les arbres en algèbre et en programmation (4ème Colloq., Lille, 1979), Univ. Lille I, Lille, pp. 159–172,MR 0554347
  6. ^Karpinski, Marek;Larmore, Lawrence L.;Rytter, Wojciech (1997), "Correctness of constructing optimal alphabetic trees revisited",Theoretical Computer Science,180 (1–2):309–324,doi:10.1016/S0304-3975(96)00296-4,MR 1453872

Further reading

[edit]
  • Filliatre, Jean-Christophe (2008), "A functional implementation of the Garsia–Wachs algorithm (functional pearl)",Proceedings of the 2008 ACM SIGPLAN Workshop on ML (ML '08), New York, NY, USA: Association for Computing Machinery, pp. 91–96,doi:10.1145/1411304.1411317,ISBN 978-1-60558-062-3,S2CID 1541092
Retrieved from "https://en.wikipedia.org/w/index.php?title=Garsia–Wachs_algorithm&oldid=1187704784"
Categories:

[8]ページ先頭

©2009-2025 Movatter.jp