Flashsort is a distribution sorting algorithm showing linear computational complexity O(n) for uniformly distributed data sets and relatively little additional memory requirement. The original work was published in 1998 by Karl-Dietrich Neubert.[1]
Flashsort is an efficient in-place implementation of histogram sort, itself a type of bucket sort. It assigns each of the n input elements to one of m buckets, efficiently rearranges the input to place the buckets in the correct order, then sorts each bucket. The original algorithm sorts an input array A as follows:

1. Using a first pass over the input (or a priori knowledge), find the minimum and maximum sort keys.
2. Linearly divide the range [Amin, Amax] into m buckets.
3. Make one pass over the input, counting the number of elements Ai which fall into each bucket. (Neubert calls the determination of the bucket of each element the classification of the element.)
4. Convert the counts of elements in each bucket to a prefix sum, so that Lb is the number of elements Ai in bucket b or less.
5. Rearrange the input so all elements of each bucket b are stored in positions Ai where Lb−1 < i ≤ Lb.
6. Sort each bucket using insertion sort.
Steps 1–3 and 6 are common to any bucket sort, and can be improved using techniques generic to bucket sorts. In particular, the goal is for the buckets to be of approximately equal size (n/m elements each),[1] with the ideal being division into m quantiles. While the basic algorithm is a linear interpolation sort, if the input distribution is known to be non-uniform, a non-linear division will more closely approximate this ideal. Likewise, the final sort can use any of a number of techniques, including a recursive flash sort.
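In Python (the language used for all sketches below), steps 1–4 might be sketched as follows. The names bucket_of and count_and_prefix are illustrative, the keys are assumed to be numeric with Amin < Amax, and the interpolation formula is a 0-based form of Neubert's:

```python
def bucket_of(x, a_min, a_max, m):
    # Linear interpolation: map the key range [a_min, a_max] onto buckets 0..m-1.
    return int((m - 1) * (x - a_min) / (a_max - a_min))

def count_and_prefix(A, m):
    # Steps 1-4: find the key range, count each bucket's elements, and convert
    # the counts to prefix sums, so that L[b] elements fall in bucket b or below.
    a_min, a_max = min(A), max(A)
    counts = [0] * m
    for x in A:
        counts[bucket_of(x, a_min, a_max, m)] += 1
    L = counts[:]
    for b in range(1, m):
        L[b] += L[b - 1]
    return a_min, a_max, L
```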
What distinguishes flash sort is step 5: an efficientO(n) in-place algorithm for collecting the elements of each bucket together in the correct relative order using onlym words of additional memory.
The Flashsort rearrangement phase operates in cycles. Elements start out "unclassified", then are moved to the correct bucket and considered "classified". The basic procedure is to choose an unclassified element, find its correct bucket, exchange it with an unclassified element there (which must exist, because we counted the size of each bucket ahead of time), mark it as classified, and then repeat with the just-exchanged unclassified element. Eventually, the element is exchanged with itself and the cycle ends.
The details are easy to understand using two (word-sized) variables per bucket. The clever part is the elimination of one of those variables, allowing twice as many buckets to be used and therefore half as much time spent on the final O(n²) sorting.
To understand it with two variables per bucket, assume there are two arrays of m additional words: Kb is the (fixed) upper limit of bucket b (and K0 = 0), while Lb is a (movable) index into bucket b, so Kb−1 ≤ Lb ≤ Kb.
We maintain the loop invariant that each bucket is divided by Lb into an unclassified prefix (Ai for Kb−1 < i ≤ Lb have yet to be moved to their target buckets) and a classified suffix (Ai for Lb < i ≤ Kb are all in the correct bucket and will not be moved again). Initially Lb = Kb and all elements are unclassified. As sorting proceeds, the Lb are decremented until Lb = Kb−1 for all b and all elements are classified into the correct bucket.
Each round begins by finding the first incompletely classified bucket c (which has Kc−1 < Lc) and taking the first unclassified element in that bucket, Ai where i = Kc−1 + 1. (Neubert calls this the "cycle leader".) Copy Ai to a temporary variable t and repeat:

1. Compute the bucket b which t belongs in.
2. Store t in ALb, the last unclassified position of bucket b, fetching the displaced value into t, and decrement Lb.
3. If the position just filled was the cycle leader's position i, the cycle is complete; otherwise, continue from step 1 with the new value of t.
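A minimal Python sketch of these rounds, using bucket_of from the earlier sketch (in 0-based indexing, bucket b occupies A[K[b−1]:K[b]], its unclassified prefix is A[start[b]:L[b]], and elements are placed at index L[b]−1):

```python
def rearrange_two_arrays(A, K, L, a_min, a_max, m):
    # K[b] is the fixed (exclusive) upper bound of bucket b; L starts as a copy
    # of K and is decremented as elements are classified.
    start = [0] + K[:-1]              # fixed lower bounds of the buckets
    for c in range(m):
        while L[c] > start[c]:        # bucket c still has unclassified elements
            i = start[c]              # its first element is the cycle leader
            t = A[i]
            while True:
                b = bucket_of(t, a_min, a_max, m)
                L[b] -= 1             # top unclassified slot of bucket b
                t, A[L[b]] = A[L[b]], t   # place t there; pick up the displaced value
                if L[b] == i:         # the cycle has closed on the leader's slot
                    break
```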
When implemented with two variables per bucket in this way, the choice of each round's starting point i is in fact arbitrary; any unclassified element may be used as a cycle leader. The only requirement is that the cycle leaders can be found efficiently.
Although the preceding description uses K to find the cycle leaders, it is in fact possible to do without it, allowing the entire m-word array to be eliminated. (After the distribution is complete, the bucket boundaries can be found in L.)
Suppose that we have classified all elements up to i−1, and are considering Ai as a potential new cycle leader. It is easy to compute its target bucket b. By the loop invariant, it is classified if Lb < i ≤ Kb, and unclassified if i is outside that range. The first inequality is easy to test, but the second appears to require the value Kb.
It turns out that the induction hypothesis that all elements up to i−1 are classified implies that i ≤ Kb, so it is not necessary to test the second inequality.
Consider the bucket c which position i falls into; that is, Kc−1 < i ≤ Kc. By the induction hypothesis, all elements below i are completely classified; this includes every bucket up to c−1, since Kc−1 < i. In other words, no elements which belong in those buckets remain in the rest of the array, so it is not possible that b < c.
The only remaining case is b ≥ c, which implies Kb ≥ Kc ≥ i. Q.E.D.
Incorporating this, the flashsort distribution algorithm begins with L as described above and i = 1. Then proceed:[1][2]

1. If i > n, the distribution is complete.
2. Compute the bucket b which Ai belongs in.
3. If Lb < i, then Ai is already classified; increment i and return to step 1.
4. Otherwise, exchange Ai with ALb (the last unclassified position of bucket b), decrement Lb, and return to step 2.
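Translated to 0-based indexing, where the classified test Lb < i becomes L[b] ≤ i, a Python sketch of this distribution might be:

```python
def flash_distribute(A, L, a_min, a_max, m):
    # Step 5 using only the m-word array L of prefix sums.
    i, n = 0, len(A)
    while i < n:
        b = bucket_of(A[i], a_min, a_max, m)
        if L[b] <= i:
            i += 1                # A[i] is already in its bucket's classified suffix
        else:
            L[b] -= 1             # top unclassified slot of bucket b
            A[i], A[L[b]] = A[L[b]], A[i]   # classify A[i]; the displaced value lands at i
```

Note that no explicit cycle-leader variable is needed: position i itself acts as the cycle leader, being repeatedly refilled until the element at i tests as classified and i can advance.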
While saving memory, Flashsort has the disadvantage that it recomputes the bucket for many already-classified elements. This is already done twice per element (once during the bucket-counting phase and a second time when moving each element), but searching for the first unclassified element requires a third computation for most elements. This could be expensive if buckets are assigned using a more complex formula than simple linear interpolation. A variant reduces the number of computations from almost 3n to at most 2n + m − 1 by taking the last unclassified element in an unfinished bucket as the cycle leader, as in the sketch below.
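The following Python sketch is one way such a variant can be realized under the same 0-based conventions (bucket_of as before; the b < c test implements the completion check described next):

```python
def flash_distribute_variant(A, L, a_min, a_max, m):
    # Variant of step 5: use the LAST unclassified element of the current
    # bucket c as cycle leader, so each moved element's bucket is computed
    # only once during distribution.
    for c in range(m):
        while L[c] > 0:
            j = L[c] - 1              # last unclassified slot of bucket c
            b = bucket_of(A[j], a_min, a_max, m)
            if b < c:                 # j lies below bucket c's lower bound, so
                break                 # bucket c is complete (A[j] is classified)
            if b == c:
                L[c] -= 1             # already in its own bucket: classify in place
            else:                     # b > c: follow the displacement cycle
                t = A[j]              # leaves a hole at j
                while b != c:
                    L[b] -= 1
                    t, A[L[b]] = A[L[b]], t       # place t; pick up the displaced value
                    b = bucket_of(t, a_min, a_max, m)
                A[j] = t              # an element of bucket c closes the cycle
                L[c] -= 1
```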
Most elements have their buckets computed only twice, except for the final element in each bucket, which is used to detect the completion of the following bucket. A small further reduction can be achieved by maintaining a count of unclassified elements and stopping when it reaches zero.
The only extra memory requirements are the auxiliary vector L for storing bucket bounds and the constant number of other variables used. Further, each element is moved (via a temporary buffer, so two move operations) only once. However, this memory efficiency comes with the disadvantage that the array is accessed randomly, so cannot take advantage of a data cache smaller than the whole array.
As with all bucket sorts, performance depends critically on the balance of the buckets. In the ideal case of a balanced data set, each bucket will be approximately the same size. If the number m of buckets is linear in the input size n, each bucket has a constant size, so sorting a single bucket with an O(n²) algorithm like insertion sort has complexity O(1²) = O(1). The running time of the final insertion sorts is therefore m ⋅ O(1) = O(m) = O(n).
Choosing a value for m, the number of buckets, trades off time spent classifying elements (high m) and time spent in the final insertion sort step (low m). For example, if m is chosen proportional to √n, then each bucket holds about √n elements, and the running time of the final insertion sorts is m ⋅ O((√n)²) = O(n^(3/2)).
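The arithmetic is the same in both cases: with m buckets of roughly n/m elements each, the final insertion sorts cost

```latex
m \cdot O\!\left( \left( \frac{n}{m} \right)^{2} \right) = O\!\left( \frac{n^{2}}{m} \right)
% m \propto n gives O(n); m \propto \sqrt{n} gives O(n^{3/2});
% a constant m degenerates to O(n^2).
```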
In the worst-case scenarios where almost all the elements are in a few buckets, the complexity of the algorithm is limited by the performance of the final bucket-sorting method, so degrades to O(n²). Variations of the algorithm improve worst-case performance by using better-performing sorts such as quicksort or recursive flashsort on buckets which exceed a certain size limit.[2][3]
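As a sketch of the bucket-limit idea (using the final contents of L from the distribution above, whose entries have been decremented to each bucket's lower bound; the size limit here is an arbitrary illustrative choice):

```python
def sort_each_bucket(A, L, m, limit=32):
    # Step 6 with a guard: insertion-sort small buckets, but hand oversized
    # buckets to a better worst-case sort.
    def insertion_sort(lo, hi):
        for i in range(lo + 1, hi):
            x, j = A[i], i - 1
            while j >= lo and A[j] > x:
                A[j + 1] = A[j]
                j -= 1
            A[j + 1] = x

    for b in range(m):
        lo = L[b]
        hi = L[b + 1] if b + 1 < m else len(A)
        if hi - lo > limit:
            A[lo:hi] = sorted(A[lo:hi])   # stand-in for quicksort or recursive flashsort
        else:
            insertion_sort(lo, hi)
```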
For m = 0.1n with uniformly distributed random data, flashsort is faster than heapsort for all n and faster than quicksort for n > 80. It becomes about twice as fast as quicksort at n = 10000.[1] Note that these measurements were taken in the late 1990s, when memory hierarchies were much less dependent on caching.
Due to the in situ permutation that flashsort performs in its classification process, flashsort is not stable. If stability is required, it is possible to use a second array so elements can be classified sequentially. However, in this case, the algorithm will require O(n) additional memory.
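A sketch of that stable classification pass, again with bucket_of and the prefix sums L from the counting sketch (out and nxt are illustrative names):

```python
def stable_distribute(A, L, a_min, a_max, m):
    # Scan left to right and append each element at its bucket's next free slot
    # in a second array, so equal keys keep their relative order.
    out = [None] * len(A)
    nxt = [0] + L[:-1]                # next free slot: each bucket's lower bound
    for x in A:
        b = bucket_of(x, a_min, a_max, m)
        out[nxt[b]] = x
        nxt[b] += 1
    return out
```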