Range tree | ||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Type | tree | |||||||||||||||||
Invented | 1979 | |||||||||||||||||
Invented by | Jon Louis Bentley | |||||||||||||||||
|
Incomputer science, arange tree is anordered treedata structure to hold a list of points. It allows all points within a given range to bereported efficiently, and is typically used in two or higher dimensions. Range trees were introduced byJon Louis Bentley in 1979.[1] Similar data structures were discovered independently by Lueker,[2] Lee and Wong,[3] and Willard.[4]The range tree is an alternative to thek-d tree. Compared tok-d trees, range trees offer faster query times of (inBig O notation) but worse storage of, wheren is the number of points stored in the tree,d is the dimension of each point andk is the number of points reported by a given query.
In 1990,Bernard Chazelle improved this to query time and space complexity.[5][6]
A range tree on a set of 1-dimensional points is a balancedbinary search tree on those points. The points stored in the tree are stored in the leaves of the tree; each internal node stores the largest value of its left subtree.A range tree on a set of points ind-dimensions is arecursively defined multi-levelbinary search tree. Each level of the data structure is a binary search tree on one of thed-dimensions.The first level is a binary search tree on the first of thed-coordinates. Each vertexv of this tree contains an associated structure that is a (d−1)-dimensional range tree on the last (d−1)-coordinates of the points stored in the subtree ofv.
A 1-dimensional range tree on a set ofn points is a binary search tree, which can be constructed in time. Range trees in higher dimensions are constructed recursively by constructing a balanced binary search tree on the first coordinate of the points, and then, for each vertexv in this tree, constructing a (d−1)-dimensional range tree on the points contained in the subtree ofv. Constructing a range tree this way would require time.
This construction time can be improved for 2-dimensional range trees to.[7]LetS be a set ofn 2-dimensional points. IfS contains only one point, return a leaf containing that point. Otherwise, construct the associated structure ofS, a 1-dimensional range tree on they-coordinates of the points inS. Letxm be the medianx-coordinate of the points. LetSL be the set of points withx-coordinate less than or equal toxm and letSR be the set of points withx-coordinate greater thanxm. Recursively constructvL, a 2-dimensional range tree onSL, andvR, a 2-dimensional range tree onSR. Create a vertexv with left-childvL and right-childvR.If we sort the points by theiry-coordinates at the start of the algorithm, and maintain this ordering when splitting the points by theirx-coordinate, we can construct the associated structures of each subtree in linear time.This reduces the time to construct a 2-dimensional range tree to, and also reduces the time to construct ad-dimensional range tree to.
Arange query on a range tree reports the set of points that lie inside a given interval. To report the points that lie in the interval [x1,x2], we start by searching forx1 andx2. At some vertex in the tree, the search paths tox1 andx2 will diverge. Letvsplit be the last vertex that these two search paths have in common. For every vertexv in the search path fromvsplit tox1, if the value stored atv is greater thanx1, report every point in the right-subtree ofv. Ifv is a leaf, report the value stored atv if it is inside the query interval. Similarly, reporting all of the points stored in the left-subtrees of the vertices with values less thanx2 along the search path fromvsplit tox2, and report the leaf of this path if it lies within the query interval.
Since the range tree is a balanced binary tree, the search paths tox1 andx2 have length. Reporting all of the points stored in the subtree of a vertex can be done in linear time using anytree traversal algorithm. It follows that the time to perform a range query is, wherek is the number of points in the query interval.
Range queries ind-dimensions are similar. Instead of reporting all of the points stored in the subtrees of the search paths, perform a (d−1)-dimensional range query on the associated structure of each subtree. Eventually, a 1-dimensional range query will be performed and the correct points will be reported. Since ad-dimensional query consists of (d−1)-dimensional range queries, it follows that the time required to perform ad-dimensional range query is, wherek is the number of points in the query interval. This can be reduced to using a variant offractional cascading.[2][4][7]