Movatterモバイル変換

[0]ホーム

Jump to content

Floating-point error mitigation

Add links

From Wikipedia, the free encyclopedia

(Redirected fromFloating point error mitigation)

Strategies to make sure approximate calculations stay close to accurate

Floating-point error mitigation is the minimization of errors caused by the fact that real numbers cannot, in general, be accurately represented in a fixed space. By definition,floating-point error cannot be eliminated, and, at best, can only be managed.

Huberto M. Sierra noted in his 1956 patent "Floating Decimal Point Arithmetic Control Means for Calculator":^[1]

Thus under some conditions, the major portion of the significant data digits may lie beyond the capacity of the registers. Therefore, the result obtained may have little meaning if not totally erroneous.

TheZ1, developed byKonrad Zuse in 1936, was the first computer withfloating-point arithmetic and was thus susceptible to floating-point error. Early computers, however, with operation times measured in milliseconds, could not solve large, complex problems^[2] and thus were seldom plagued with floating-point error. Today, however, withsupercomputer system performance measured inpetaflops, floating-point error is a major concern for computational problem solvers.

The following sections describe the strengths and weaknesses of various means of mitigating floating-point error.

Numerical error analysis

[edit]

Though not the primary focus ofnumerical analysis,^[3]^[4]^: 5numerical error analysis exists for the analysis and minimization of floating-point rounding error.

Monte Carlo arithmetic

[edit]

Error analysis byMonte Carlo arithmetic is accomplished by repeatedly injecting small errors into an algorithm's data values and determining the relative effect on the results.

Extension of precision

[edit]

Extension of precision is using of larger representations of real values than the one initially considered. TheIEEE 754 standard defines precision as the number of digits available to represent real numbers. A programming language can includesingle precision (32 bits),double precision (64 bits), andquadruple precision (128 bits). While extension of precision makes the effects of error less likely or less important, the true accuracy of the results is still unknown.

Variable-length arithmetic

[edit]

Variable length arithmetic represents numbers as a string of digits of a variable's length limited only by the memory available. Variable-length arithmetic operations are considerably slower than fixed-length format floating-point instructions. When high performance is not a requirement, but high precision is, variable length arithmetic can prove useful, though the actual accuracy of the result may not be known.

Use of the error term of a floating-point operation

[edit]

The floating-point algorithm known asTwoSum^[5] or2Sum, due to Knuth and Møller, and its simpler, but restricted versionFastTwoSum orFast2Sum (3 operations instead of 6), allow one to get the (exact) error term of a floating-point addition rounded to nearest. One can also obtain the (exact) error term of a floating-point multiplication rounded to nearest in 2 operations with afused multiply–add (FMA), or 17 operations if the FMA is not available (with an algorithm due to Dekker). These error terms can be used in algorithms in order to improve the accuracy of the final result, e.g. withfloating-point expansions orcompensated algorithms.

Operations giving the result of a floating-point addition or multiplication rounded to nearest with its error term (but slightly differing from algorithms mentioned above) have been standardized and recommended in the IEEE 754-2019 standard.

Choice of a different radix

[edit]

Changing theradix, in particular from binary to decimal, can help to reduce the error and better control the rounding in some applications, such asfinancial applications.

Interval arithmetic

[edit]

Interval arithmetic is a mathematical technique used to put bounds onrounding errors andmeasurement errors inmathematical computation. Values are intervals, which can be represented in various ways, such as:^[6]

inf-sup: a lower bound and an upper bound on the true value;
mid-rad: an approximation and an error bound (calledmidpoint andradius of the interval);
triplex: an approximation, a lower bound and an upper bound on the error.

"Instead of using a single floating-point number as approximation for the value of a real variable in the mathematical model under investigation, interval arithmetic acknowledges limited precision by associating with the variable a set of reals as possible values. For ease of storage and computation, these sets are restricted to intervals."^[7]

The evaluation of interval arithmetic expression may provide a large range of values,^[7] and may seriously overestimate the true error boundaries.^[8]^: 8

Gustafson's unums

[edit]

Unums ("Universal Numbers") are an extension of variable length arithmetic proposed byJohn Gustafson.^[9] Unums have variable length fields for the exponent andsignificand lengths and error information is carried in a single bit, the ubit, representing possible error in the least significant bit of the significand (ULP).^[9]^: 4

The efficacy of unums is questioned byWilliam Kahan.^[8]

Bounded floating point

[edit]

Bounded floating point is a method proposed and patented by Alan Jorgensen.^[10] The data structure includes the standardIEEE 754 data structure and interpretation, as well as information about the error between the true real value represented and the value stored by the floating point representation.^[11]

Bounded floating point has been criticized as being derivative of Gustafson's work on unums and interval arithmetic.^[10]^[12]

References

[edit]

^"Floating decimal point arithmetic control means for calculator: United States Patent 3037701".FreePatentsOnline.com. 1962-06-05. Retrieved2022-01-21.
^"History of Computer Development & Generation of Computer".WikiEducator. September 2014. Retrieved2018-02-17.
^Trefethen, Lloyd N. (1992)."The Definition of Numerical Analysis"(PDF). SIAM. Retrieved2018-02-16.
^Higham, Nicholas John (2002).Accuracy and Stability of Numerical Algorithms (2 ed.).Society for Industrial and Applied Mathematics (SIAM).ISBN 978-0-89871-521-7.
^Richard Shewchuk, Jonathan (October 1997)."Adaptive Precision Floating-Point Arithmetic and Fast Robust Geometric Predicates"(PDF).Discrete & Computational Geometry.18 (3):305–363.doi:10.1007/PL00009321.S2CID 189937041. Retrieved2022-11-14.
^"IEEE Standard for Interval Arithmetic".IEEE STD 1788-2015:1–97. 2015-06-30.doi:10.1109/IEEESTD.2015.7140721.ISBN 978-0-7381-9720-3.
^^a ^bHickey, T.; Ju, Q.; van Emden, M.H. (September 2001)."Interval Arithmetic: from Principles to Implementation"(PDF).Journal of the ACM.48 (5):1038–1068.CiteSeerX 10.1.1.43.8001.doi:10.1145/502102.502106.S2CID 15105694. Retrieved2018-02-16.
^^a ^bKahan, William (July 2016)."A Critique of John L. Gustafson's THE END of ERROR — Unum Computation and his A Radical Approach to Computation with Real Numbers"(PDF). Retrieved2018-02-17.
^^a ^bGustafson, John Leroy (2016-02-04) [2015-02-05].The End of Error: Unum Computing.Chapman & Hall /CRC Computational Science. Vol. 24 (2nd corrected printing, 1st ed.).CRC Press.ISBN 978-1-4822-3986-7. Retrieved2016-05-30.[1][2]
^^a ^bTrader, Tiffany (2018-01-17)."Inventor Claims to Have Solved Floating Point Error Problem".HPCwire. Retrieved2022-03-01.
^US patent 11023230B2, Jorgensen, Alan A., "Apparatus for Calculating and Retaining a Bound on Error during Floating Point Operations and Methods Thereof", issued 2021-06-01
^"Has the Decades-Old Floating Point Error Problem been Solved?".insideHPC. 2018-01-17. Retrieved2022-03-01.

Retrieved from "https://en.wikipedia.org/w/index.php?title=Floating-point_error_mitigation&oldid=1307795463"

Categories:

Hidden categories:

[8]ページ先頭