NumPy 1.17.0 Release Notes#
This NumPy release contains a number of new features that should substantially improve its performance and usefulness, see Highlights below for a summary. The Python versions supported are 3.5-3.7, note that Python 2.7 has been dropped. Python 3.8b2 should work with the released source packages, but there are no future guarantees.
Downstream developers should use Cython >= 0.29.11 for Python 3.8 support and OpenBLAS >= 3.7 (not currently out) to avoid problems on the Skylake architecture. The NumPy wheels on PyPI are built from the OpenBLAS development branch in order to avoid those problems.
Highlights#
- A new extensible random module along with four selectable random number generators and improved seeding designed for use in parallel processes has been added. The currently available bit generators are MT19937, PCG64, Philox, and SFC64. See below under New Features.
- NumPy's FFT implementation was changed from fftpack to pocketfft, resulting in faster, more accurate transforms and better handling of datasets of prime length. See below under Improvements.
- New radix sort and timsort sorting methods. It is currently not possible to choose which will be used. They are hardwired to the datatype and used when either stable or mergesort is passed as the method. See below under Improvements.
- Overriding numpy functions is now possible by default, see __array_function__ below.
New functions#
numpy.errstate is now also a function decorator
Deprecations#
numpy.polynomial functions warn when passed float in place of int#
Previously functions in this module would accept float values provided they were integral (1.0, 2.0, etc). For consistency with the rest of numpy, doing so is now deprecated, and in future will raise a TypeError.
Similarly, passing a float like 0.5 in place of an integer will now raise a TypeError instead of the previous ValueError.
Deprecate numpy.distutils.exec_command and temp_file_name#
The internal use of these functions has been refactored and there are better alternatives. Replace exec_command with subprocess.Popen and temp_file_name with tempfile.mkstemp.
Writeable flag of C-API wrapped arrays#
When an array is created from the C-API to wrap a pointer to data, the only indication we have of the read-write nature of the data is the writeable flag set during creation. It is dangerous to force the flag to writeable. In the future it will not be possible to switch the writeable flag to True from python. This deprecation should not affect many users since arrays created in such a manner are very rare in practice and only available through the NumPy C-API.
numpy.nonzero should no longer be called on 0d arrays#
The behavior of numpy.nonzero on 0d arrays was surprising, making uses of it almost always incorrect. If the old behavior was intended, it can be preserved without a warning by using nonzero(atleast_1d(arr)) instead of nonzero(arr). In a future release, it is most likely this will raise a ValueError.
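A minimal sketch of the recommended replacement (atleast_1d promotes the 0d array to one dimension before the index lookup):

```python
import numpy as np

arr = np.array(5)  # a 0d array
# wrap with atleast_1d to keep the old behavior without a warning
idx = np.nonzero(np.atleast_1d(arr))
print(idx)  # (array([0]),)
```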
Writing to the result ofnumpy.broadcast_arrays will warn#
Commonly numpy.broadcast_arrays returns a writeable array with internal overlap, making it unsafe to write to. A future version will set the writeable flag to False, and require users to manually set it to True if they are sure that is what they want to do. Now writing to it will emit a deprecation warning with instructions to set the writeable flag True. Note that if one were to inspect the flag before setting it, one would find it would already be True. Explicitly setting it, though, as one will need to do in future versions, clears an internal flag that is used to produce the deprecation warning. To help alleviate confusion, an additional FutureWarning will be emitted when accessing the writeable flag state to clarify the contradiction.
Note that for the C-side buffer protocol such an array will return a readonly buffer immediately unless a writable buffer is requested. If a writeable buffer is requested a warning will be given. When using cython, the const qualifier should be used with such arrays to avoid the warning (e.g. cdef const double[::1] view).
Future Changes#
Shape-1 fields in dtypes won’t be collapsed to scalars in a future version#
Currently, a field specified as [(name, dtype, 1)] or "1type" is interpreted as a scalar field (i.e., the same as [(name, dtype)] or [(name, dtype, ())]). This now raises a FutureWarning; in a future version, it will be interpreted as a shape-(1,) field, i.e. the same as [(name, dtype, (1,))] or "(1,)type" (consistently with [(name, dtype, n)] / "ntype" with n > 1, which is already equivalent to [(name, dtype, (n,))] / "(n,)type").
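A quick sketch of the distinction; spelling the shape as an explicit tuple is unambiguous under both the old and new rules:

```python
import numpy as np

# explicit shape-(1,) field: same meaning before and after the change, no warning
dt = np.dtype([('a', np.float64, (1,))])
print(dt['a'].shape)  # (1,)

# scalar field, spelled without a shape
dt0 = np.dtype([('a', np.float64)])
print(dt0['a'].shape)  # ()
```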
Compatibility notes#
float16 subnormal rounding#
Casting from a different floating point precision to float16 used incorrect rounding in some edge cases. This means in rare cases, subnormal results will now be rounded up instead of down, changing the last bit (ULP) of the result.
Signed zero when using divmod#
Starting in version 1.12.0, numpy incorrectly returned a negatively signed zero when using the divmod and floor_divide functions when the result was zero. For example:
>>> np.zeros(10) // 1
array([-0., -0., -0., -0., -0., -0., -0., -0., -0., -0.])
With this release, the result is correctly returned as a positively signed zero:
>>> np.zeros(10) // 1
array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0.])
MaskedArray.mask now returns a view of the mask, not the mask itself#
Returning the mask itself was unsafe, as it could be reshaped in place which would violate expectations of the masked array code. The behavior of mask is now consistent with data, which also returns a view.
The underlying mask can still be accessed with ._mask if it is needed. Tests that contain assert x.mask is not y.mask or similar will need to be updated.
Do not lookup __buffer__ attribute in numpy.frombuffer#
Looking up the __buffer__ attribute in numpy.frombuffer was undocumented and non-functional. This code was removed. If needed, use frombuffer(memoryview(obj), ...) instead.
out is buffered for memory overlaps in take, choose, put#
If the out argument to these functions is provided and has memory overlap withthe other arguments, it is now buffered to avoid order-dependent behavior.
Unpickling while loading requires explicit opt-in#
The functions load and lib.format.read_array take an allow_pickle keyword which now defaults to False in response to CVE-2019-6446.
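Plain numeric arrays are unaffected by the new default, since they need no pickling; a small sketch:

```python
import io
import numpy as np

buf = io.BytesIO()
np.save(buf, np.arange(3))
buf.seek(0)
data = np.load(buf)  # fine under the new allow_pickle=False default
print(data)  # [0 1 2]
# object arrays, by contrast, now need an explicit np.load(..., allow_pickle=True)
```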
Potential changes to the random stream in old random module#
Due to bugs in the application of log to random floating point numbers, the stream may change when sampling from beta, binomial, laplace, logistic, logseries or multinomial if a 0 is generated in the underlying MT19937 random stream. There is a 1 in \(10^{53}\) chance of this occurring, so the probability that the stream changes for any given seed is extremely small. If a 0 is encountered in the underlying generator, then the incorrect value produced (either numpy.inf or numpy.nan) is now dropped.
i0 now always returns a result with the same shape as the input#
Previously, the output was squeezed, such that, e.g., input with just a single element would lead to an array scalar being returned, and inputs with shapes such as (10, 1) would yield results that would not broadcast against the input.
Note that we generally recommend the SciPy implementation over the numpy one:it is a proper ufunc written in C, and more than an order of magnitude faster.
can_cast no longer assumes all unsafe casting is allowed#
Previously, can_cast returned True for almost all inputs for casting='unsafe', even for cases where casting was not possible, such as from a structured dtype to a regular one. This has been fixed, making it more consistent with actual casting using, e.g., the .astype method.
ndarray.flags.writeable can be switched to true slightly more often#
In rare cases, it was not possible to switch an array from not writeable to writeable, although a base array is writeable. This can happen if an intermediate ndarray.base object is writeable. Previously, only the deepest base object was considered for this decision. However, in rare cases this object does not have the necessary information. In that case switching to writeable was never allowed. This has now been fixed.
C API changes#
dimension or stride input arguments are now passed by npy_intp const*#
Previously these function arguments were declared as the more strict npy_intp*, which prevented the caller passing constant data. This change is backwards compatible, but now allows code like:
npy_intp const fixed_dims[] = {1, 2, 3};  // no longer complains that the const-qualifier is discarded
npy_intp size = PyArray_MultiplyList(fixed_dims, 3);
New Features#
New extensible numpy.random module with selectable random number generators#
A new extensible numpy.random module along with four selectable random number generators and improved seeding designed for use in parallel processes has been added. The currently available Bit Generators are MT19937, PCG64, Philox, and SFC64. PCG64 is the new default while MT19937 is retained for backwards compatibility. Note that the legacy random module is unchanged and is now frozen; your current results will not change. More information is available in the API change description and in the top-level view documentation.
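A minimal sketch of the new interface; default_rng returns a Generator backed by the new default PCG64 bit generator:

```python
import numpy as np

rng = np.random.default_rng(12345)  # seedable, reproducible Generator
sample = rng.random(3)              # uniform floats in [0, 1)
normals = rng.standard_normal(2)    # standard normal variates
```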
libFLAME#
Support for building NumPy with the libFLAME linear algebra package as the LAPACK implementation, see libFLAME for details.
User-defined BLAS detection order#
distutils now uses an environment variable, comma-separated and case insensitive, to determine the detection order for BLAS libraries. By default NPY_BLAS_ORDER=mkl,blis,openblas,atlas,accelerate,blas. However, to force the use of OpenBLAS simply do:
NPY_BLAS_ORDER=openblas python setup.py build
which forces the use of OpenBLAS. This may be helpful for users who have an MKL installation but wish to try out different implementations.
User-defined LAPACK detection order#
numpy.distutils now uses an environment variable, comma-separated and case insensitive, to determine the detection order for LAPACK libraries. By default NPY_LAPACK_ORDER=mkl,openblas,flame,atlas,accelerate,lapack. However, to force the use of OpenBLAS simply do:
NPY_LAPACK_ORDER=openblas python setup.py build
which forces the use of OpenBLAS. This may be helpful for users who have an MKL installation but wish to try out different implementations.
ufunc.reduce and related functions now accept a where mask#
ufunc.reduce, sum, prod, min, max all now accept a where keyword argument, which can be used to tell which elements to include in the reduction. For reductions that do not have an identity, it is necessary to also pass in an initial value (e.g., initial=np.inf for min). For instance, the equivalent of nansum would be np.sum(a, where=~np.isnan(a)).
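The nansum equivalence from the text, plus a reduction without an identity, can be sketched as:

```python
import numpy as np

a = np.array([1.0, np.nan, 2.0])

# nansum equivalent: only include the non-nan elements
total = np.sum(a, where=~np.isnan(a))                      # 3.0

# min has no identity, so an initial value is required
smallest = np.min(a, where=~np.isnan(a), initial=np.inf)   # 1.0
print(total, smallest)
```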
Timsort and radix sort have replaced mergesort for stable sorting#
Both radix sort and timsort have been implemented and are now used in place of mergesort. Due to the need to maintain backward compatibility, the sorting kind options "stable" and "mergesort" have been made aliases of each other, with the actual sort implementation depending on the array type. Radix sort is used for small integer types of 16 bits or less and timsort for the remaining types. Timsort features improved performance on already or nearly sorted data, performs like mergesort on random data, and requires \(O(n/2)\) working space. Details of the timsort algorithm can be found at CPython listsort.txt.
packbits and unpackbits accept an order keyword#
The order keyword defaults to big, and will order the bits accordingly. For order='big', 3 will become [0, 0, 0, 0, 0, 0, 1, 1], and [1, 1, 0, 0, 0, 0, 0, 0] for order='little'.
unpackbits now accepts a count parameter#
count allows subsetting the number of bits that will be unpacked up-front, rather than reshaping and subsetting later, making the packbits operation invertible, and the unpacking less wasteful. Counts larger than the number of available bits add zero padding. Negative counts trim bits off the end instead of counting from the beginning. None counts implement the existing behavior of unpacking everything.
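The default bit order and the new count parameter can be sketched together:

```python
import numpy as np

a = np.array([3], dtype=np.uint8)
bits = np.unpackbits(a)             # [0 0 0 0 0 0 1 1] (big-endian bit order by default)
first4 = np.unpackbits(a, count=4)  # [0 0 0 0]         (only the first four bits)
trimmed = np.unpackbits(a, count=-2)  # [0 0 0 0 0 0]   (trim two bits off the end)
print(bits, first4, trimmed)
```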
linalg.svd and linalg.pinv can be faster on hermitian inputs#
These functions now accept a hermitian argument, matching the one added to linalg.matrix_rank in 1.14.0.
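A small sketch; the matrix below is real and symmetric, hence hermitian, so the flag applies:

```python
import numpy as np

a = np.array([[2.0, 1.0],
              [1.0, 2.0]])  # symmetric, hence hermitian
p = np.linalg.pinv(a, hermitian=True)
print(np.allclose(a @ p @ a, a))  # True (pseudo-inverse property holds)
```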
divmod operation is now supported for two timedelta64 operands#
The divmod operator now handles two timedelta64 operands, with type signature mm->qm.
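The mm->qm signature means the quotient is an integer and the remainder a timedelta64; for example:

```python
import numpy as np

q, r = divmod(np.timedelta64(7, 's'), np.timedelta64(3, 's'))
print(q)  # 2 (integer quotient)
print(r)  # 1 seconds (timedelta64 remainder)
```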
fromfile now takes an offset argument#
This function now takes an offset keyword argument for binary files, which specifies the offset (in bytes) from the file's current position. Defaults to 0.
New mode “empty” for pad#
This mode pads an array to a desired shape without initializing the new entries.
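A minimal sketch; the padded entries are uninitialized, so only the shape and the original values are meaningful:

```python
import numpy as np

a = np.pad(np.arange(4), (1, 1), mode='empty')
print(a.shape)  # (6,)
print(a[1:5])   # [0 1 2 3] -- the original data; a[0] and a[5] hold arbitrary values
```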
empty_like and related functions now accept a shape argument#
empty_like, full_like, ones_like and zeros_like now accept a shape keyword argument, which can be used to create a new array as the prototype, overriding its shape as well. This is particularly useful when combined with the __array_function__ protocol, allowing the creation of new arbitrary-shape arrays from NumPy-like libraries when such an array is used as the prototype.
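A minimal sketch: the dtype comes from the prototype, the shape from the keyword:

```python
import numpy as np

proto = np.ones((2, 3))               # prototype: float64, shape (2, 3)
b = np.zeros_like(proto, shape=(4,))  # same dtype, different shape
print(b.shape, b.dtype)               # (4,) float64
```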
Floating point scalars implement as_integer_ratio to match the builtin float#
This returns a (numerator, denominator) pair, which can be used to construct a fractions.Fraction.
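For example:

```python
import numpy as np
from fractions import Fraction

num, den = np.float64(0.25).as_integer_ratio()
print(num, den)             # 1 4
print(Fraction(num, den))   # 1/4
```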
Structured dtype objects can be indexed with multiple field names#
arr.dtype[['a', 'b']] now returns a dtype that is equivalent to arr[['a', 'b']].dtype, for consistency with arr.dtype['a'] == arr['a'].dtype.
Like the dtype of structured arrays indexed with a list of fields, this dtype has the same itemsize as the original, but only keeps a subset of the fields.
This means that arr[['a', 'b']] and arr.view(arr.dtype[['a', 'b']]) are equivalent.
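A small sketch of the itemsize-preserving behavior:

```python
import numpy as np

dt = np.dtype([('a', 'i4'), ('b', 'f8'), ('c', 'u1')])
sub = dt[['a', 'c']]
print(sub.names)                    # ('a', 'c')
print(sub.itemsize == dt.itemsize)  # True -- same layout, subset of fields
```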
.npy files support unicode field names#
A new format version of 3.0 has been introduced, which enables structured typeswith non-latin1 field names. This is used automatically when needed.
Improvements#
Array comparison assertions include maximum differences#
Error messages from array comparison tests such as testing.assert_allclose now include “max absolute difference” and “max relative difference,” in addition to the previous “mismatch” percentage. This information makes it easier to update absolute and relative error tolerances.
Replacement of the fftpack based fft module by the pocketfft library#
Both implementations have the same ancestor (Fortran77 FFTPACK by Paul N. Swarztrauber), but pocketfft contains additional modifications which improve both accuracy and performance in some circumstances. For FFT lengths containing large prime factors, pocketfft uses Bluestein's algorithm, which maintains \(O(N \log N)\) run time complexity instead of deteriorating towards \(O(N^2)\) for prime lengths. Also, accuracy for real valued FFTs with near prime lengths has improved and is on par with complex valued FFTs.
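A round-trip at a prime length illustrates the point (10007 is prime, the worst case for a plain radix-based FFT):

```python
import numpy as np

x = np.random.default_rng(0).random(10007)  # prime-length input
X = np.fft.fft(x)
print(np.allclose(np.fft.ifft(X).real, x))  # True -- accurate even at prime length
```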
Further improvements to ctypes support in numpy.ctypeslib#
A new numpy.ctypeslib.as_ctypes_type function has been added, which can be used to convert a dtype into a best-guess ctypes type. Thanks to this new function, numpy.ctypeslib.as_ctypes now supports a much wider range of array types, including structures, booleans, and integers of non-native endianness.
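A small sketch of both the scalar and the structured cases:

```python
import ctypes
import numpy as np

# scalar dtype -> ctypes scalar type
scalar_ok = np.ctypeslib.as_ctypes_type(np.dtype(np.float64)) is ctypes.c_double
print(scalar_ok)  # True

# structured dtype -> a ctypes.Structure subclass
ct = np.ctypeslib.as_ctypes_type(np.dtype([('x', np.int32), ('y', np.int32)]))
print(issubclass(ct, ctypes.Structure))  # True
```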
numpy.errstate is now also a function decorator#
Currently, if you have a function like:
def foo():
    pass
and you want to wrap the whole thing in errstate, you have to rewrite it like so:
def foo():
    with np.errstate(...):
        pass
but with this change, you can do:
@np.errstate(...)
def foo():
    pass
thereby saving a level of indentation.
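A concrete sketch of the decorator form:

```python
import numpy as np

@np.errstate(divide='ignore')
def safe_log(x):
    # the divide-by-zero warning from log(0) is suppressed inside this function
    return np.log(x)

result = safe_log(np.array([0.0, 1.0]))
print(result)  # [-inf   0.]
```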
numpy.exp and numpy.log speed up for float32 implementation#
The float32 implementations of exp and log now benefit from AVX2/AVX512 instruction sets, which are detected at runtime. exp has a max ulp error of 2.52 and log has a max ulp error of 3.83.
Improve performance of numpy.pad#
The performance of the function has been improved for most cases by filling in a preallocated array with the desired padded shape instead of using concatenation.
numpy.interp handles infinities more robustly#
In some cases where interp would previously return nan, it now returns an appropriate infinity.
Pathlib support for fromfile, tofile and ndarray.dump#
fromfile, ndarray.tofile and ndarray.dump now support the pathlib.Path type for the file/fid parameter.
Specialized isnan, isinf, and isfinite ufuncs for bool and int types#
The boolean and integer types are incapable of storing nan and inf values, which allows us to provide specialized ufuncs that are up to 250x faster than the previous approach.
isfinite supports datetime64 and timedelta64 types#
Previously, isfinite raised a TypeError when used on these two types.
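For example:

```python
import numpy as np

a_fin = np.isfinite(np.datetime64('2019-07-26'))  # True
b_fin = np.isfinite(np.datetime64('NaT'))         # False -- NaT is not finite
c_fin = np.isfinite(np.timedelta64(1, 'D'))       # True
print(a_fin, b_fin, c_fin)
```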
New keywords added to nan_to_num#
nan_to_num now accepts keywords nan, posinf and neginf, allowing the user to define the value to replace the nan, positive and negative np.inf values respectively.
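For example:

```python
import numpy as np

a = np.array([np.nan, np.inf, -np.inf])
cleaned = np.nan_to_num(a, nan=0.0, posinf=1e6, neginf=-1e6)
print(cleaned)  # [ 0.e+00  1.e+06 -1.e+06]
```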
MemoryErrors caused by allocating overly large arrays are more descriptive#
Often the cause of a MemoryError is incorrect broadcasting, which results in avery large and incorrect shape. The message of the error now includes thisshape to help diagnose the cause of failure.
floor, ceil, and trunc now respect builtin magic methods#
These ufuncs now call the __floor__, __ceil__, and __trunc__ methods when called on object arrays, making them compatible with decimal.Decimal and fractions.Fraction objects.
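A small sketch with Fraction (each element is handled via its __floor__ method):

```python
import numpy as np
from fractions import Fraction

a = np.array([Fraction(7, 2), Fraction(-7, 2)], dtype=object)
floors = np.floor(a)
print(floors)  # [3 -4] -- dispatched to Fraction.__floor__
```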
quantile now works on fractions.Fraction and decimal.Decimal objects#
In general, this handles object arrays more gracefully, and avoids floating-point operations if exact arithmetic types are used.
Support of object arrays in matmul#
It is now possible to use matmul (or the @ operator) with object arrays. For instance, it is now possible to do:
import numpy as np
from fractions import Fraction

a = np.array([[Fraction(1, 2), Fraction(1, 3)],
              [Fraction(1, 3), Fraction(1, 2)]])
b = a @ a
Changes#
median and percentile family of functions no longer warn about nan#
numpy.median, numpy.percentile, and numpy.quantile used to emit a RuntimeWarning when encountering a nan. Since they return the nan value, the warning is redundant and has been removed.
timedelta64 % 0 behavior adjusted to return NaT#
The modulus operation with two np.timedelta64 operands now returns NaT in the case of division by zero, rather than returning zero.
NumPy functions now always support overrides with__array_function__#
NumPy now always checks the __array_function__ method to implement overrides of NumPy functions on non-NumPy arrays, as described in NEP 18. The feature was available for testing with NumPy 1.16 if appropriate environment variables are set, but is now always enabled.
lib.recfunctions.structured_to_unstructured does not squeeze single-field views#
Previously structured_to_unstructured(arr[['a']]) would produce a squeezed result inconsistent with structured_to_unstructured(arr[['a', 'b']]). This was accidental. The old behavior can be retained with structured_to_unstructured(arr[['a']]).squeeze(axis=-1) or, far more simply, arr['a'].
clip now uses a ufunc under the hood#
This means that registering clip functions for custom dtypes in C via descr->f->fastclip is deprecated - they should use the ufunc registration mechanism instead, attaching to the np.core.umath.clip ufunc.
It also means that clip accepts where and casting arguments, and can be overridden with __array_ufunc__.
A consequence of this change is that some behaviors of the old clip have been deprecated:
- Passing nan to mean “do not clip” as one or both bounds. This didn't work in all cases anyway, and can be better handled by passing infinities of the appropriate sign.
- Using “unsafe” casting by default when an out argument is passed. Using casting="unsafe" explicitly will silence this warning.
Additionally, there are some corner cases with behavior changes:
- Padding max < min has changed to be more consistent across dtypes, but should not be relied upon.
- Scalar min and max take part in promotion rules like they do in all other ufuncs.
__array_interface__ offset now works as documented#
The interface may use an offset value that was previously mistakenly ignored.
Pickle protocol in savez set to 3 for force_zip64 flag#
savez was not using the force_zip64 flag, which limited the size of the archive to 2GB. But using the flag requires us to use pickle protocol 3 to write object arrays. The protocol used was bumped to 3, meaning the archive will be unreadable by Python 2.
Structured arrays indexed with non-existent fields raise KeyError not ValueError#
arr['bad_field'] on a structured type raises KeyError, for consistency with dict['bad_field'].