
Python Enhancement Proposals

PEP 485 – A Function for testing approximate equality

Author:
Christopher Barker <PythonCHB at gmail.com>
Status:
Final
Type:
Standards Track
Created:
20-Jan-2015
Python-Version:
3.5
Post-History:

Resolution:
Python-Dev message


Abstract

This PEP proposes the addition of an isclose() function to the standard library math module that determines whether one value is approximately equal or “close” to another value.

Rationale

Floating point values contain limited precision, which makes them unable to exactly represent some values and allows errors to accumulate with repeated computation. As a result, it is common advice to use an equality comparison only in very specific situations. Often an inequality comparison fits the bill, but there are times (often in testing) where the programmer wants to determine whether a computed value is “close” to an expected value, without requiring them to be exactly equal. This is common enough, particularly in testing, and not always obvious how to do it, that it would be a useful addition to the standard library.

Existing Implementations

The standard library includes the unittest.TestCase.assertAlmostEqual method, but it:

  • Is buried in the unittest.TestCase class
  • Is an assertion, so you can’t (easily) use it as a general test at the command line, etc.
  • Is an absolute difference test. Often the measure of difference requires, particularly for floating point numbers, a relative error, i.e. “Are these two values within x% of each other?”, rather than an absolute error, particularly when the magnitude of the values is unknown a priori.

The numpy package has the allclose() and isclose() functions, but they are only available with numpy.

The statistics package tests include an implementation, used for its unit tests.

One can also find discussion and sample implementations on Stack Overflow and other help sites.

Many other non-Python systems provide such a test, including the Boost C++ library and the APL language [4].

These existing implementations indicate that this is a common need and not trivial to write oneself, making it a candidate for the standard library.

Proposed Implementation

NOTE: this PEP is the result of extended discussions on the python-ideas list [1].

The new function will go into the math module, and have the following signature:

isclose(a, b, rel_tol=1e-9, abs_tol=0.0)

a and b: are the two values to be tested for relative closeness.

rel_tol: is the relative tolerance – it is the amount of error allowed, relative to the larger absolute value of a or b. For example, to set a tolerance of 5%, pass rel_tol=0.05. The default tolerance is 1e-9, which assures that the two values are the same within about 9 decimal digits. rel_tol must be greater than 0.0.

abs_tol: is a minimum absolute tolerance level – useful forcomparisons near zero.

Modulo error checking, etc., the function will return the result of:

abs(a-b) <= max( rel_tol * max(abs(a), abs(b)), abs_tol )

The name, isclose, is selected for consistency with the existing isnan and isinf.
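
For illustration, here is a minimal pure-Python sketch of the proposed test, following the formula above (argument checking and the special handling of non-finite values described in the next section are omitted):

def isclose(a, b, rel_tol=1e-9, abs_tol=0.0):
    # Sketch only: the real implementation also validates the tolerances
    # and handles NaN/inf explicitly (see the next section).
    return abs(a - b) <= max(rel_tol * max(abs(a), abs(b)), abs_tol)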

Handling of non-finite numbers

The IEEE 754 special values of NaN, inf, and -inf will be handled according to IEEE rules. Specifically, NaN is not considered close to any other value, including NaN. inf and -inf are only considered close to themselves.
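
For example, the expected behaviour would be:

>>> from math import isclose
>>> isclose(float('nan'), float('nan'))   # NaN is not close to anything, even itself
False
>>> isclose(float('inf'), float('inf'))   # inf is close only to itself
True
>>> isclose(float('inf'), 1e308)
False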

Non-float types

The primary use-case is expected to be floating point numbers. However, users may want to compare other numeric types similarly. In theory, it should work for any type that supports abs(), multiplication, comparisons, and subtraction. However, the implementation in the math module is written in C, and thus cannot (easily) use Python’s duck typing. Rather, the values passed into the function will be converted to the float type before the calculation is performed. Passing in types (or values) that cannot be converted to floats will raise an appropriate Exception (TypeError, ValueError, or OverflowError).

The code will be tested to accommodate at least some values of these types:

  • Decimal
  • int
  • Fraction
  • complex: For complex, a companion function will be added to the cmath module. In cmath.isclose(), the tolerances are specified as floats, and the absolute values of the complex values will be used for scaling and comparison. If a complex tolerance is passed in, the absolute value will be used as the tolerance.

NOTE: it may make sense to add a Decimal.isclose() that works properly and completely with the decimal type, but that is not included as part of this PEP.
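
As an illustration of the intended cmath behaviour (a sketch; the complex values are scaled by their absolute values as described above):

>>> import cmath
>>> cmath.isclose(1+1j, 1+1.000000001j)   # difference ~1e-9, well within rel_tol * |a| (~1.4e-9)
True
>>> cmath.isclose(1+1j, 1.001+1j)         # difference 1e-3, far larger than rel_tol * |a|
False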

Behavior near zero

Relative comparison is problematic if either value is zero. By definition, no value is small relative to zero. And computationally, if either value is zero, the difference is the absolute value of the other value, and the computed absolute tolerance will be rel_tol times that value. When rel_tol is less than one, the difference will never be less than the tolerance.

However, while mathematically correct, there are many use cases where a user will need to know if a computed value is “close” to zero. This calls for an absolute tolerance test. If the user needs to call this function inside a loop or comprehension, where some, but not all, of the expected values may be zero, it is important that both a relative tolerance and absolute tolerance can be tested for with a single function with a single set of parameters.

There is a similar issue if the two values to be compared straddle zero: if a is approximately equal to -b, then a and b will never be computed as “close”.

To handle this case, an optional parameter, abs_tol, can be used to set a minimum tolerance used in the case of very small or zero computed relative tolerance. That is, the values will always be considered close if the difference between them is less than abs_tol.
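
For example (a sketch of the expected behaviour, assuming the function lives in the math module as proposed):

>>> from math import isclose
>>> isclose(1e-10, 0.0)                  # False: no non-zero value is relatively close to zero
False
>>> isclose(1e-10, 0.0, abs_tol=1e-9)    # True: within the absolute tolerance
True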

The default absolute tolerance value is set to zero because there is no value that is appropriate for the general case. It is impossible to know an appropriate value without knowing the likely values expected for a given use case. If all the values tested are on order of one, then a value of about 1e-9 might be appropriate, but that would be far too large if expected values are on order of 1e-9 or smaller.

Any non-zero default might result in users’ tests passing totally inappropriately. If, on the other hand, a test against zero fails the first time with defaults, the user will be prompted to select an appropriate value for the problem at hand in order to get the test to pass.

NOTE: the author of this PEP has resolved to go back over many of his tests that use the numpy allclose() function, which provides a default absolute tolerance, and make sure that the default value is appropriate.

If the user sets the rel_tol parameter to 0.0, then only the absolute tolerance will affect the result. While not the goal of the function, it does allow it to be used as a purely absolute tolerance check as well.
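
For example:

>>> isclose(5.0, 5.005, rel_tol=0.0, abs_tol=0.01)   # purely absolute check: |a-b| <= 0.01
True
>>> isclose(5.0, 5.02, rel_tol=0.0, abs_tol=0.01)
False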

Implementation

A sample implementation in Python is available (as of Jan 22, 2015) on GitHub:

https://github.com/PythonCHB/close_pep/blob/master/is_close.py

This implementation has a flag that lets the user select which relative tolerance test to apply – this PEP does not suggest that that be retained, but rather that the weak test be selected.

There are also drafts of this PEP and test code, etc. there:

https://github.com/PythonCHB/close_pep

Relative Difference

There are essentially two ways to think about how close two numbers are to each other:

Absolute difference: simply abs(a-b)

Relative difference: abs(a-b) / scale_factor [2].

The absolute difference is trivial enough that this proposal focuses on the relative difference.

Usually, the scale factor is some function of the values under consideration, for instance:

  1. The absolute value of one of the input values
  2. The maximum absolute value of the two
  3. The minimum absolute value of the two
  4. The absolute value of the arithmetic mean of the two

These lead to the following possibilities for determining if two values, a and b, are close to each other.

  1. abs(a-b) <= tol*abs(a)
  2. abs(a-b) <= tol*max(abs(a), abs(b))
  3. abs(a-b) <= tol*min(abs(a), abs(b))
  4. abs(a-b) <= tol*abs(a+b)/2

NOTE: (2) and (3) can also be written as:

  1. (abs(a-b) <= abs(tol*a)) or (abs(a-b) <= abs(tol*b))
  2. (abs(a-b) <= abs(tol*a)) and (abs(a-b) <= abs(tol*b))

(Boost refers to these as the “weak” and “strong” formulations [3].) These can be a tiny bit more computationally efficient, and thus are used in the example code.
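
For reference, the two formulations can be sketched in Python as follows (the function names are illustrative only, not part of the proposal):

def is_close_weak(a, b, tol):
    # Boost "weak": equivalent to scaling by the larger value (test (2) above)
    return (abs(a - b) <= abs(tol * a)) or (abs(a - b) <= abs(tol * b))

def is_close_strong(a, b, tol):
    # Boost "strong": equivalent to scaling by the smaller value (test (3) above)
    return (abs(a - b) <= abs(tol * a)) and (abs(a - b) <= abs(tol * b))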

Each of these formulations can lead to slightly different results. However, if the tolerance value is small, the differences are quite small – in fact, often less than the available floating point precision.

How much difference does it make?

When selecting a method to determine closeness, one might want to know how much of a difference it could make to use one test or the other – i.e. how many values are there (or what range of values) that will pass one test, but not the other.

The largest difference is between options (2) and (3), where the allowable absolute difference is scaled by either the larger or smaller of the values.

Define delta to be the difference between the allowable absolute tolerance defined by the larger value and that defined by the smaller value. That is, the amount that the two input values need to be different in order to get a different result from the two tests. tol is the relative tolerance value.

Assume that a is the larger value and that both a and b are positive, to make the analysis a bit easier. delta is therefore:

delta = tol * (a-b)

or:

delta / tol = (a-b)

The largest absolute difference that would pass the test, (a-b), equals the tolerance times the larger value:

(a-b) = tol * a

Substituting into the expression for delta:

delta / tol = tol * a

so:

delta = tol**2 * a

For example, for a = 10, b = 9, tol = 0.1 (10%):

maximum tolerance tol*a == 0.1 * 10 == 1.0

minimum tolerance tol*b == 0.1 * 9.0 == 0.9

delta = (1.0 - 0.9) = 0.1, or tol**2 * a = 0.1**2 * 10 = 0.1

The absolute difference between the maximum and minimum tolerance tests in this case could be substantial. However, the primary use case for the proposed function is testing the results of computations. In that case, a relative tolerance of much smaller magnitude is likely to be selected.

For example, a relative tolerance of 1e-8 is about half the precision available in a Python float. In that case, the difference between the two tests is 1e-8**2 * a, or 1e-16 * a, which is close to the limit of precision of a Python float. If the relative tolerance is set to the proposed default of 1e-9 (or smaller), the difference between the two tests will be lost to the limits of precision of floating point. That is, each of the four methods will yield exactly the same results for all values of a and b.

In addition, in common use, tolerances are defined to 1 significant figure – that is, 1e-9 is specifying about 9 decimal digits of accuracy. So the difference between the various possible tests is well below the precision to which the tolerance is specified.

Symmetry

A relative comparison can be either symmetric or non-symmetric. For a symmetric algorithm:

isclose(a, b) is always the same as isclose(b, a)

If a relative closeness test uses only one of the values (such as (1) above), then the result is asymmetric, i.e. isclose(a, b) is not necessarily the same as isclose(b, a).

Which approach is most appropriate depends on what question is being asked. If the question is: “are these two numbers close to each other?”, there is no obvious ordering, and a symmetric test is most appropriate.

However, if the question is: “Is the computed value within x% of this known value?”, then it is appropriate to scale the tolerance to the known value, and an asymmetric test is most appropriate.

From the previous section, it is clear that either approach would yield the same or similar results in the common use cases. In that case, the goal of this proposal is to provide a function that is least likely to produce surprising results.

The symmetric approach provides an appealing consistency – it mirrors the symmetry of equality, and is less likely to confuse people. A symmetric test also relieves the user of the need to think about the order in which to set the arguments. It was also pointed out that there may be some cases where the order of evaluation may not be well defined, for instance in the case of comparing a set of values all against each other.

There may be cases when a user does need to know that a value is within a particular range of a known value. In that case, it is easy enough to simply write the test directly:

if a - b <= tol * a:

(assuming a > b in this case). There is little need to provide a function for this particular case.

This proposal uses a symmetric test.

Which symmetric test?

There are three symmetric tests considered:

The case that uses the arithmetic mean of the two values requires that the values either be added together before dividing by 2, which could result in extra overflow to inf for very large numbers, or that each value be divided by two before being added together, which could result in underflow to zero for very small numbers. This effect would only occur at the very limit of float values, but it was decided there was no benefit to the method worth reducing the range of functionality or adding the complexity of checking values to determine the order of computation.

This leaves the Boost “weak” test (2), which uses the larger value to scale the tolerance, and the Boost “strong” test (3), which uses the smaller of the values to scale the tolerance. For small tolerances they yield the same result, but this proposal uses the Boost “weak” test: it is symmetric and provides a more useful result for very large tolerances.

Large Tolerances

The most common use case is expected to be small tolerances – on order of the default 1e-9. However, there may be use cases where a user wants to know if two fairly disparate values are within a particular range of each other: “is a within 200% (rel_tol = 2.0) of b?” In this case, the strong test would never indicate that two values are within that range of each other if one of them is zero. The weak case, however, would use the larger (non-zero) value for the test, and thus return True if one value is zero. For example: is 0 within 200% of 10? 200% of ten is 20, so the range within 200% of ten is -10 to +30. Zero falls within that range, so it will return True.
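
Working through that example with the two formulations (a sketch using the expressions above):

abs(0 - 10) <= 2.0 * max(abs(0), abs(10))   # weak test: 10 <= 20 -> True
abs(0 - 10) <= 2.0 * min(abs(0), abs(10))   # strong test: 10 <= 0 -> False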

Defaults

Default values are required for the relative and absolute tolerance.

Relative Tolerance Default

The relative tolerance required for two values to be considered “close” is entirely use-case dependent. Nevertheless, the relative tolerance needs to be greater than 1e-16 (approximate precision of a Python float). The value of 1e-9 was selected because it is the largest relative tolerance for which the various possible methods will yield the same result, and it is also about half of the precision available to a Python float. In the general case, a good numerical algorithm is not expected to lose more than about half of the available digits of accuracy, and if a much larger tolerance is acceptable, the user should be considering the proper value in that case. Thus 1e-9 is expected to “just work” for many cases.

Absolute tolerance default

The absolute tolerance value will be used primarily for comparing to zero. The absolute tolerance required to determine if a value is “close” to zero is entirely use-case dependent. There are also essentially no bounds to the useful range – expected values could conceivably be anywhere within the limits of a Python float. Thus a default of 0.0 is selected.

If, for a given use case, a user needs to compare to zero, the test will be guaranteed to fail the first time, and the user can select an appropriate value.

It was suggested that comparing to zero is, in fact, a common use case (evidence suggests that the numpy functions are often used with zero). In this case, it would be desirable to have a “useful” default. Values around 1e-8 were suggested, being about half of floating point precision for values of around 1.

However, to quote The Zen: “In the face of ambiguity, refuse the temptation to guess.” Guessing that users will most often be concerned with values close to 1.0 would lead to spurious passing tests when used with smaller values – this is potentially more damaging than requiring the user to thoughtfully select an appropriate value.

Expected Uses

The primary expected use case is various forms of testing – “are the results computed near what I expect as a result?” This sort of test may or may not be part of a formal unit testing suite. Such testing could be used one-off at the command line, in an IPython notebook, as part of doctests, or as simple asserts in an if __name__ == "__main__" block.

It would also be an appropriate function to use for the termination criteria for a simple iterative solution to an implicit function:

guess = something
while True:
    new_guess = implicit_function(guess, *args)
    if isclose(new_guess, guess):
        break
    guess = new_guess

Inappropriate uses

One use case for floating point comparison is testing the accuracy of a numerical algorithm. However, in this case, the numerical analyst ideally would be doing careful error propagation analysis, and should understand exactly what to test for. It is also likely that ULP (Unit in the Last Place) comparison may be called for. While this function may prove useful in such situations, it is not intended to be used in that way without careful consideration.

Other Approaches

unittest.TestCase.assertAlmostEqual

(https://docs.python.org/3/library/unittest.html#unittest.TestCase.assertAlmostEqual)

Tests that values are approximately (or not approximately) equal by computing the difference, rounding to the given number of decimal places (default 7), and comparing to zero.

This method is purely an absolute tolerance test, and does not address the need for a relative tolerance test.
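
To illustrate the absolute-only behaviour (using the documented default of places=7, i.e. the check round(a - b, 7) == 0):

round(1e9 - (1e9 + 0.1), 7) == 0    # False: fails, although the relative error is only 1e-10
round(1e-9 - 2e-9, 7) == 0          # True: passes, although the values differ by a factor of two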

numpy isclose()

http://docs.scipy.org/doc/numpy-dev/reference/generated/numpy.isclose.html

The numpy package provides the vectorized functions isclose() and allclose(), for similar use cases as this proposal:

isclose(a, b, rtol=1e-05, atol=1e-08, equal_nan=False)

Returns a boolean array where two arrays are element-wise equal within a tolerance.

The tolerance values are positive, typically very small numbers. The relative difference (rtol * abs(b)) and the absolute difference atol are added together to compare against the absolute difference between a and b.

In this approach, the absolute and relative tolerances are added together, rather than combined with the or (max) method used in this proposal. This is computationally simpler, and if the relative tolerance is larger than the absolute tolerance, then the addition will have essentially no effect. However, if the absolute and relative tolerances are of similar magnitude, then the allowed difference will be about twice as large as expected.
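
Side by side, the two combinations look like this (rtol and atol are numpy's parameter names; rel_tol and abs_tol are this proposal's):

abs(a - b) <= atol + rtol * abs(b)                          # numpy: tolerances added
abs(a - b) <= max(rel_tol * max(abs(a), abs(b)), abs_tol)   # this proposal: the larger of the two

For example, with rtol == atol == 1e-8 and b == 1.0, the numpy test allows a difference of up to 2e-8, roughly twice what either tolerance alone would allow.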

This makes the function harder to understand, with no computationaladvantage in this context.

Even more critically, if the values passed in are small compared to the absolute tolerance, then the relative tolerance will be completely swamped, perhaps unexpectedly.

This is why, in this proposal, the absolute tolerance defaults to zero – the user will be required to choose a value appropriate for the values at hand.

Boost floating-point comparison

The Boost project ([3]) provides a floating point comparison function. It is a symmetric approach, with both “weak” (larger of the two relative errors) and “strong” (smaller of the two relative errors) options. This proposal uses the Boost “weak” approach. There is no need to complicate the API by providing the option to select different methods when the results will be similar in most cases, and the user is unlikely to know which to select in any case.

Alternate Proposals

A Recipe

The primary alternate proposal was to not provide a standard library function at all, but rather to provide a recipe for users to refer to. This would have the advantage that the recipe could provide and explain the various options, and let the user select the one that is most appropriate. However, that would require anyone needing such a test to, at the very least, copy the function into their code base, and select the comparison method to use.

zero_tol

One possibility was to provide a zero tolerance parameter, rather than the absolute tolerance parameter. This would be an absolute tolerance that would only be applied in the case of one of the arguments being exactly zero. This would have the advantage of retaining the full relative tolerance behavior for all non-zero values, while allowing tests against zero to work. However, it would also result in the potentially surprising result that a small value could be “close” to zero, but not “close” to an even smaller value. e.g., 1e-10 is “close” to zero, but not “close” to 1e-11.

No absolute tolerance

Given the issues with comparing to zero, another possibility would have been to only provide a relative tolerance, and let comparison to zero fail. In this case, the user would need to do a simple absolute test: abs(val) < zero_tol in the case where the comparison involved zero.

However, this would not allow the same call to be used for a sequence of values, such as in a loop or comprehension, making the function far less useful. It is noted that the default abs_tol=0.0 achieves the same effect if the default is not overridden.

Other tests

The other tests considered are all discussed in the Relative Difference section above.

References

[1]
Python-ideas list discussion threads

https://mail.python.org/pipermail/python-ideas/2015-January/030947.html

https://mail.python.org/pipermail/python-ideas/2015-January/031124.html

https://mail.python.org/pipermail/python-ideas/2015-January/031313.html

[2]
Wikipedia page on relative difference

http://en.wikipedia.org/wiki/Relative_change_and_difference

[3]
Boost project floating-point comparison algorithms

http://www.boost.org/doc/libs/1_35_0/libs/test/doc/components/test_tools/floating_point_comparison.html

[4]
1976. R. H. Lathwell. APL comparison tolerance. Proceedings of the eighth international conference on APL, pages 255-258.

http://dl.acm.org/citation.cfm?doid=800114.803685

Copyright

This document has been placed in the public domain.


Source: https://github.com/python/peps/blob/main/peps/pep-0485.rst

Last modified: 2025-02-01 08:59:27 GMT

