Movatterモバイル変換


[0]ホーム

URL:


Jump to content
WikipediaThe Free Encyclopedia
Search

Ziggurat algorithm

From Wikipedia, the free encyclopedia
Algorithm for pseudo-random number sampling

Theziggurat algorithm is analgorithm forpseudo-random number sampling. Belonging to the class ofrejection sampling algorithms, it relies on an underlying source of uniformly-distributed random numbers, typically from apseudo-random number generator, as well as precomputed tables. The algorithm is used to generate values from amonotonically decreasingprobability distribution. It can also be applied tosymmetricunimodal distributions, such as thenormal distribution, by choosing a value from one half of the distribution and then randomly choosing which half the value is considered to have been drawn from. It was developed byGeorge Marsaglia and others in the 1960s.

A typical value produced by the algorithm only requires the generation of one random floating-point value and one random table index, followed by one table lookup, one multiply operation and one comparison. Sometimes (2.5% of the time, in the case of a normal orexponential distribution when using typical table sizes)[citation needed] more computations are required. Nevertheless, the algorithm is computationally much faster[citation needed] than the two most commonly used methods of generating normally distributed random numbers, theMarsaglia polar method and theBox–Muller transform, which require at least one logarithm and onesquare root calculation for each pair of generated values. However, since the ziggurat algorithm is more complex to implement it is best used when large quantities of random numbers are required.

The termziggurat algorithm dates from Marsaglia's paper with Wai Wan Tsang in 2000; it is so named because it is conceptually based on covering the probability distribution with rectangular segments stacked in decreasing order of size, resulting in a figure that resembles aziggurat.

The Ziggurat algorithm used to generate sample values with anormal distribution. (Only positive values are shown for simplicity.) The pink dots are initially uniform-distributed random numbers. The desired distribution function is first segmented into equal areas "A". One layeri is selected at random by the uniform source at the left. Then a random value from the top source is multiplied by the width of the chosen layer, and the result isx tested to see which region of the layer it falls into with 3 possible outcomes: 1) (left, solid black region) the sample is clearly under the curve and may be output immediately, 2) (right, vertically striped region) the sample value may lie under the curve, and must be tested further. In that case, a randomy value within the chosen layer is generated and compared tof(x). If less, the point is under the curve and the valuex is output. If not, (the third case), the chosen pointx is rejected and the algorithm is restarted from the beginning.

Theory of operation

[edit]

The ziggurat algorithm is a rejection sampling algorithm; it randomly generates a point in a distribution slightly larger than the desired distribution, then tests whether the generated point is inside the desired distribution. If not, it tries again. Given a random point underneath a probability density curve, itsx coordinate is a random number with the desired distribution.

The distribution the ziggurat algorithm chooses from is made up ofn equal-area regions;n1{\displaystyle n-1} rectangles that cover the bulk of the desired distribution, on top of a non-rectangular base that includes the tail of the distribution.

Given a monotone decreasing probability density functionf(x){\displaystyle f(x)}, defined for allx0{\displaystyle x\geq 0} , the base of the ziggurat is defined as all points inside the distribution and belowy1=f(x1){\displaystyle y_{1}=f(x_{1})}. This consists of a rectangular region from(0,0){\displaystyle (0,0)} to(x1,y1){\displaystyle (x_{1},y_{1})}, and the (typically infinite) tail of the distribution, wherexx1(and yy1){\displaystyle x\geq x_{1}({\text{and }}y\leq y_{1})}.

This layer (call it layer 0) has areaA. On top of this, add a rectangular layer of widthx1{\displaystyle x_{1}} and heightA/x1{\displaystyle A/x_{1}}, so it also has areaA{\displaystyle A}. The top of this layer is at heighty2=y1+A/x1{\displaystyle y_{2}=y_{1}+A/x_{1}} and intersects the density function at a point(x2,y2){\displaystyle (x_{2},y_{2})}, wherey2=f(x2){\displaystyle y_{2}=f(x_{2})}. This layer includes every point in the density function betweeny1{\displaystyle y_{1}} andy2{\displaystyle y_{2}}, but (unlike the base layer) also includes points such as(x1,y2){\displaystyle (x_{1},y_{2})} which are not in the desired distribution.

Further layers are then stacked on top. To use a precomputed table of sizen{\displaystyle n} (n = 256 is typical), one choosesx1{\displaystyle x_{1}}such thatxn=0{\displaystyle x_{n}=0}, meaning that the top box, layern1{\displaystyle n-1}, reaches the distribution's peak at(0,f(0)){\displaystyle (0,f(0))} exactly.

Layeri{\displaystyle i} extends vertically in range[yi,yi+1]{\displaystyle [y_{i},y_{i+1}]}, and can be divided into two regions horizontally: the (generally larger) portion in range[0,xi+1]{\displaystyle [0,x_{i+1}]} which is entirely contained within the desired distribution, and the (small) portion in range[xi+1,x]{\displaystyle [x_{i+1},x]}, which is only partially contained.

Ignoring for a moment the problem of layer 0, and given uniform random variablesU0{\displaystyle U_{0}} andU1[0,1){\displaystyle U_{1}\in [0,1)}, the ziggurat algorithm can be described as:

  1. Choose a random layer0i<n{\displaystyle 0\leq i<n}.
  2. Letx=U0xi{\displaystyle x=U_{0}x_{i}}.
  3. Ifx<xi+1{\displaystyle x<x_{i+1}}, returnx{\displaystyle x}.
  4. Lety=yi+U1(yi+1yi){\displaystyle y=y_{i}+U_{1}(y_{i+1}-y_{i})}.
  5. Computef(x){\displaystyle f(x)}. Ify<f(x){\displaystyle y<f(x)}, returnx{\displaystyle x}.
  6. Otherwise, choose new random numbers and go back to step 1.

Step 1 amounts to choosing a low-resolutiony coordinate. Step 3 tests if thex coordinate is clearly within the desired density function without knowing more about the y coordinate. If it is not, step 4 chooses a high-resolution y coordinate, and step 5 does the rejection test.

With closely spaced layers, the algorithm terminates at step 3 a very large fraction of the time. For the top layern1{\displaystyle n-1}, however, this test always fails, becausexn=0{\displaystyle x_{n}=0}.

Layer 0 can also be divided into a central region and an edge, but the edge is an infinite tail. To use the same algorithm to check if the point is in the central region, generate a fictitiousx0=A/y1{\displaystyle x_{0}=A/y_{1}}. This will generate points withx<x1{\displaystyle x<x_{1}} with the correct frequency, and in the rare case that layer 0 is selected andxx1{\displaystyle x\geq x_{1}}, use a special fallback algorithm to select a point at random from the tail. Because the fallback algorithm is used less than one time in a thousand, speed is not essential.

Thus, the full ziggurat algorithm for one-sided distributions is:

  1. Choose a random layer0i<n{\displaystyle 0\leq i<n}.
  2. Letx=U0xi{\displaystyle x=U_{0}x_{i}}.
  3. Ifx<xi+1{\displaystyle x<x_{i+1}}, returnx{\displaystyle x}.
  4. Ifi=0{\displaystyle i=0}, generate a point from the tail using the fallback algorithm.
  5. Lety=yi+U1(yi+1yi){\displaystyle y=y_{i}+U_{1}(y_{i+1}-y_{i})}.
  6. Computef(x){\displaystyle f(x)}. Ify<f(x){\displaystyle y<f(x)}, returnx{\displaystyle x}.
  7. Otherwise, choose new random numbers and go back to step 1.

For a two-sided distribution, the result must be negated 50% of the time. This can often be done conveniently by choosingU0[1,1]{\displaystyle U_{0}\in [-1,1]} and, in step 3, testing ifx∣<xi+1{\displaystyle \mid x\mid <x_{i+1}}.

Fallback algorithms for the tail

[edit]

Because the ziggurat algorithm only generatesmost outputs very rapidly, and requires a fallback algorithm wheneverx>x1{\displaystyle x>x_{1}} it is always more complex than a more direct implementation. The specific fallback algorithm depends on the distribution.

For an exponential distribution, the tail looks just like the body of the distribution. One way is to fall back to the most elementary algorithmE = −ln(U1) and letx = x1 − ln(U1). Another is to call the ziggurat algorithmrecursively and addx1 to the result.

For a normal distribution, Marsaglia suggests a compact algorithm:

  1. Letx = −ln(U1)/x1.
  2. Lety = −ln(U2).
  3. If 2y >x2, returnx + x1.
  4. Otherwise, go back to step 1.

Sincex1 ≈ 3.5 for typical table sizes, the test in step 3 is almost always successful. Since −ln(U1) is an exponentially distributed variate, an implementation of the exponential distribution may be used.

Optimizations

[edit]

The algorithm can be performed efficiently with precomputed tables ofxi andyi =f(xi), but there are some modifications to make it even faster:

  • Nothing in the ziggurat algorithm depends on the probability distribution function being normalized (integral under the curve equal to 1), removingnormalizing constants can speed up the computation off(x).
  • Most uniform random number generators are based on integer random number generators which return an integer in the range [0,  232 − 1]. A table of 2−32xi lets you use such numbers directly forU0.
  • When computing two-sided distributions using a two-sidedU0 as described earlier, the random integer can be interpreted as a signed number in the range [−231, 231 − 1], and a scale factor of 2−31 can be used.
  • Rather than comparingU0xi toxi +1 in step 3, it is possible to precomputexi +1/xi and compareU0 with that directly. IfU0 is an integer random number generator, these limits may be premultiplied by 232 (or 231, as appropriate) so an integer comparison can be used.
  • With the above two changes, the table of unmodifiedxi values is no longer needed and may be deleted.
  • When generatingIEEE 754 single-precisionfloating point values, which only have a 24-bit mantissa (including the implicit leading 1), the least-significant bits of a 32-bit integer random number are not used. These bits may be used to select the layer number. (See the references below for a detailed discussion of this.)
  • The first three steps may be put into aninline function, which can call an out-of-line implementation of the less frequently needed steps.

Generating the tables

[edit]

It is possible to store the entire table precomputed, or just include the valuesn,y1,A, and an implementation off −1(y) in thesource code, and compute the remaining values when initializing the random number generator.

As previously described, you can findxi =f −1(yi) andyi +1yi + A/xi. Repeatn − 1 times for the layers of the ziggurat. At the end, you should haveyn = f(0). There will be someround-off error, but it is a usefulsanity test to see that it is acceptably small.

When actually filling in the table values, just assume thatxn = 0 andyn = f(0), and accept the slight difference in layern − 1's area as rounding error.

Findingx1 andA

[edit]

Given an initial (guess at)x1, you need a way to compute the areat of the tail for whichx >x1. For the exponential distribution, this is justex1, while for the normal distribution, assuming you are using the unnormalizedf(x) =ex2/2, this isπ/2erfc(x/2). For more awkward distributions,numerical integration may be required.

With this in hand, fromx1, you can findy1 =f(x1), the areat in the tail, and the area of the base layerA =x1y1 + t.

Then compute the seriesyi andxi as above. Ifyi >f(0) for anyi <n, then the initial estimatex1 was too low, leading to too large an areaA. Ifyn <f(0), then the initial estimatex1 was too high.

Given this, use aroot-finding algorithm (such as thebisection method) to find the valuex1 which producesyn−1 as close tof(0) as possible. Alternatively, look for the value which makes the area of the topmost layer,xn−1(f(0) − yn−1), as close to the desired valueA as possible. This saves one evaluation off −1(x) and is actually the condition of greatest interest.

McFarland's variation

[edit]

Christopher D. McFarland has proposed a further-optimized version.[1] This applies three algorithmic changes, at the expense of slightly larger tables.

First, the common case considers only the rectangular portions, from (0,yi −1) to (xiyi) The odd-shaped regions to the right of these (mostly almost triangular, plus the tail) are handled separately. This simplifies and speeds up the algorithm'sfast path.

Second, the exact area of the odd-shaped regions is used; they arenot rounded up to include the entire rectangle to (xi −1yi). This increases the probability that the fast path will be used.

One major consequence of this is that the number of layers is slightly less thann. Even though the area of the odd-shaped portions is taken exactly, the total adds up to more than one layer's worth. The area per layer is adjusted so that the number of rectangular layers is an integer. If the initial 0 ≤ i < n exceeds the number of rectangular layers, phase 2 proceeds.

If the value sought lies in any of the odd-shaped regions, thealias method is used to choose one, based on its true area. This is a small amount of additional work, and requires additional alias tables, but chooses one of the layers' right-hand sides.

The chosen odd-shaped region is rejection sampled, but if a sample is rejected, the algorithm doesnot return to the beginning. The true area of each odd-shaped region was used to choose a layer, so the rejection-sampling loop stays in that layer until a point is chosen.

Third, the almost-triangular shape of most odd-shaped portions is exploited, although this must be divided into three cases depending on thesecond derivative of the probability distribution function in the selected layer.

If the function is convex (as the exponential distribution is everywhere, and the normal distribution is for |x| > 1), then the function is strictly contained within the lower triangle. Two unit uniform deviatesU1 andU2 are chosen, and before they are scaled to the rectangle enclosing the odd-shaped region, their sum is tested. IfU1 + U2 > 1, the point is in the upper triangle and can be reflected to (1−U1, 1−U2). Then, ifU1 + U2 < 1−ε, for some suitable toleranceε, the point is definitely below the curve and can immediately be accepted. Only for points very close to the diagonal is it necessary to compute the distribution functionf(x) to perform an exact rejection test. (The toleranceε should in theory depend on the layer, but a single maximum value can be used on all layers with little loss.)

If the function is concave (as the normal distribution is for |x| < 1), it includes a small portion of the upper triangle so reflection is impossible, but points whose normalized coordinates satisfyU1 +U2 ≤ 1 can be immediately accepted, and points for whichU1 +U2 > 1+ε can be immediately rejected.

In the one layer which straddles |x| = 1, the normal distribution has aninflection point, and the exact rejection test must be applied if 1−ε <U1 + U2 < 1+ε.

The tail is handled as in the original Ziggurat algorithm, and can be thought of as a fourth case for the shape of the odd-shaped region to the right.

References

[edit]
  1. ^McFarland, Christopher D. (24 June 2015)."A modified ziggurat algorithm for generating exponentially and normally distributed pseudorandom numbers".Journal of Statistical Computation and Simulation.86 (7):1281–1294.arXiv:1403.6870.doi:10.1080/00949655.2015.1060234.PMC 4812161.PMID 27041780. Note that theBitbucket repository mentioned in the paper is no longer available and the code is now athttps://github.com/cd-mcfarland/fast_prng
Retrieved from "https://en.wikipedia.org/w/index.php?title=Ziggurat_algorithm&oldid=1322516018"
Categories:
Hidden categories:

[8]ページ先頭

©2009-2025 Movatter.jp