
Python Enhancement Proposals

PEP 410 – Use decimal.Decimal type for timestamps

Author:
Victor Stinner <vstinner at python.org>
Status:
Rejected
Type:
Standards Track
Created:
01-Feb-2012
Python-Version:
3.3
Resolution:
Python-Dev message

Rejection Notice

This PEP is rejected. See https://mail.python.org/pipermail/python-dev/2012-February/116837.html.

Abstract

Decimal becomes the official type for high-resolution timestamps to make Python support new functions using a nanosecond resolution without loss of precision.

Rationale

Python 2.3 introduced float timestamps to support sub-second resolutions. os.stat() has used float timestamps by default since Python 2.5. Python 3.3 introduced functions supporting nanosecond resolutions:

  • os module: futimens(), utimensat()
  • time module: clock_gettime(), clock_getres(), monotonic(), wallclock()

os.stat() reads nanosecond timestamps but returns timestamps as float.

The Python float type uses the binary64 format of the IEEE 754 standard. With a resolution of one nanosecond (10^-9), float timestamps lose precision for values bigger than 2^24 seconds (194 days: 1970-07-14 for an Epoch timestamp).
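The loss can be observed directly. The following is a minimal sketch, not part of the PEP: the sample timestamp and the variable names are made up for illustration, and math.ulp() assumes Python 3.9 or later.

```python
import math
from decimal import Decimal

# Hypothetical sample: a recent Epoch timestamp with a nanosecond part.
sec, nsec = 1_700_000_000, 123_456_789

# binary64 cannot keep the nanosecond digits at this magnitude.
as_float = sec + nsec * 1e-9
recovered_nsec = round((as_float - sec) * 1e9)

# Decimal keeps every digit of the same value.
as_decimal = Decimal(sec) + Decimal(nsec).scaleb(-9)

# Gap between two adjacent floats near this timestamp, in seconds.
float_step = math.ulp(float(sec))

print(recovered_nsec == nsec)  # False
print(as_decimal)              # 1700000000.123456789
print(float_step > 1e-9)       # True
```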

Nanosecond resolution is required to set the exact modification time on filesystems supporting nanosecond timestamps (e.g. ext4, btrfs, NTFS, …). It also helps to compare modification times to check whether a file is newer than another file. Use cases: copy the modification time of a file using shutil.copystat(), create a TAR archive with the tarfile module, manage a mailbox with the mailbox module, etc.

An arbitrary resolution is preferred over a fixed resolution (like nanosecond) to not have to change the API when a better resolution is required. For example, the NTP protocol uses fractions of 1/2^32 second (approximately 2.3 × 10^-10 second), whereas the NTP protocol version 4 uses fractions of 1/2^64 second (5.4 × 10^-20 second).

Note

With a resolution of 1 microsecond (10^-6), float timestamps lose precision for values bigger than 2^33 seconds (272 years: 2242-03-16 for an Epoch timestamp). With a resolution of 100 nanoseconds (10^-7, resolution used on Windows), float timestamps lose precision for values bigger than 2^29 seconds (17 years: 1987-01-05 for an Epoch timestamp).
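The dates quoted for these thresholds can be rechecked with a short calculation. This is an illustrative sketch only; the helper name threshold_date() is hypothetical and not part of the PEP.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def threshold_date(power_of_two):
    """Return the UTC date reached 2**power_of_two seconds after the Epoch."""
    return (EPOCH + timedelta(seconds=2 ** power_of_two)).date()

for power in (24, 29, 33):
    print(power, threshold_date(power))
# 24 1970-07-14
# 29 1987-01-05
# 33 2242-03-16
```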

Specification

Add decimal.Decimal as a new type for timestamps. Decimal supports any timestamp resolution, supports arithmetic operations and is comparable. It is possible to coerce a Decimal to float, even if the conversion may lose precision. The clock resolution can also be stored in a Decimal object.
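These properties can be sketched with plain Decimal objects. The values below are hypothetical, and no function in this example takes the proposed timestamp argument.

```python
from decimal import Decimal

# Two hypothetical timestamps one nanosecond apart.
t1 = Decimal("1328054400.000000001")
t2 = Decimal("1328054400.000000002")

# A clock resolution can be stored in the same type.
resolution = Decimal("1E-9")

print(t2 > t1)                  # True: timestamps are comparable
print(t2 - t1 == resolution)    # True: arithmetic is exact
print(float(t1) == float(t2))   # True: coercion to float loses the difference
```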

Add an optional timestamp argument to:

  • os module: fstat(), fstatat(), lstat(), stat() (st_atime, st_ctime and st_mtime fields of the stat structure), sched_rr_get_interval(), times(), wait3() and wait4()
  • resource module: ru_utime and ru_stime fields of getrusage()
  • signal module: getitimer(), setitimer()
  • time module: clock(), clock_gettime(), clock_getres(), monotonic(), time() and wallclock()

The timestamp argument value can be float or Decimal; float is still the default for backward compatibility. The following functions support Decimal as input:

  • datetime module: date.fromtimestamp(), datetime.fromtimestamp() and datetime.utcfromtimestamp()
  • os module: futimes(), futimesat(), lutimes(), utime()
  • select module: epoll.poll(), kqueue.control(), select()
  • signal module: setitimer(), sigtimedwait()
  • time module: ctime(), gmtime(), localtime(), sleep()

The os.stat_float_times() function is deprecated: use an explicit cast using int() instead.

Note

The decimal module is implemented in Python and is slower than float, but there is a new C implementation which is almost ready for inclusion in CPython.

Backwards Compatibility

The default timestamp type (float) is unchanged, so there is no impact on backward compatibility or on performance. The new timestamp type, decimal.Decimal, is only returned when requested explicitly.

Objection: clocks accuracy

Computer clocks and operating systems are inaccurate and fail to provide nanosecond accuracy in practice. A nanosecond is what it takes to execute a couple of CPU instructions. Even on a real-time operating system, a nanosecond-precise measurement is already obsolete when it starts being processed by the higher-level application. A single cache miss in the CPU will make the precision worthless.

Note

Linux actually is able to measure time with nanosecond precision, even though it is not able to keep its clock synchronized to UTC with nanosecond accuracy.
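The resolution a platform reports for its clocks can be inspected as follows. This is a sketch only: time.get_clock_info() is not part of this PEP (it is available in Python 3.3 and later), the printed values are platform-dependent, and a reported resolution says nothing about accuracy against UTC.

```python
import time

# Print the underlying implementation and the reported resolution (in seconds)
# of the system clock and the monotonic clock.
for name in ("time", "monotonic"):
    info = time.get_clock_info(name)
    print(name, info.implementation, info.resolution)
```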

Alternatives: Timestamp types

To support timestamps with an arbitrary or nanosecond resolution, the following types have been considered:

  • decimal.Decimal
  • number of nanoseconds
  • 128-bits float
  • datetime.datetime
  • datetime.timedelta
  • tuple of integers
  • timespec structure

Criteria:

  • Doing arithmetic on timestamps must be possible
  • Timestamps must be comparable
  • An arbitrary resolution, or at least a resolution of one nanosecond without losing precision
  • It should be possible to coerce the new timestamp to float for backward compatibility

A resolution of one nanosecond is enough to support all current C functions.

The best resolution used by operating systems is one nanosecond. In practice, most clock accuracy is closer to microseconds than nanoseconds. So it sounds reasonable to use a fixed resolution of one nanosecond.

Number of nanoseconds (int)

A nanosecond resolution is enough for all current C functions and so a timestamp can simply be a number of nanoseconds, an integer, not a float.

The number of nanoseconds format has been rejected because it would require adding new specialized functions for this format: it is not possible to differentiate a number of nanoseconds from a number of seconds just by checking the object type.

128-bits float

Add a new IEEE 754-2008 quad-precision binary float type. The IEEE 754-2008 quad precision float has 1 sign bit, 15 bits of exponent and 112 bits of mantissa. 128-bits float is supported by GCC (4.3), Clang and ICC compilers.

Python must be portable and so cannot rely on a type only available on some platforms. For example, Visual C++ 2008 doesn’t support 128-bits float, whereas it is used to build the official Windows executables. Another example: GCC 4.3 does not support __float128 in 32-bit mode on x86 (but GCC 4.4 does).

There is also a license issue: GCC uses the MPFR library for 128-bits float, a library distributed under the GNU LGPL license. This license is not compatible with the Python license.

Note

The x87 floating-point unit of Intel CPUs supports 80-bit floats. This format is not supported by the SSE instruction set, which is now preferred over x87, especially on x86_64. Other CPU vendors don’t support 80-bit floats.

datetime.datetime

The datetime.datetime type is the natural choice for a timestamp because it is clear that this type contains a timestamp, whereas int, float and Decimal are raw numbers. It is an absolute timestamp and so is well defined. It gives direct access to the year, month, day, hours, minutes and seconds. It has methods related to time, such as methods to format the timestamp as a string (e.g. datetime.datetime.strftime).

The major issue is that, except for os.stat(), time.time() and time.clock_gettime(time.CLOCK_REALTIME), all time functions have an unspecified starting point and no timezone information, and so cannot be converted to datetime.datetime.

datetime.datetime also has issues with timezones. For example, a datetime object without a timezone (unaware) and a datetime with a timezone (aware) cannot be compared. There is also an ordering issue with daylight saving time (DST) in the duplicate hour of switching from DST to normal time.
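The comparison problem can be reproduced in a few lines. This is a minimal sketch with made-up dates; "naive" is the term the datetime documentation uses for what this section calls unaware.

```python
from datetime import datetime, timezone

naive = datetime(2012, 2, 1, 12, 0, 0)                        # no timezone
aware = datetime(2012, 2, 1, 12, 0, 0, tzinfo=timezone.utc)   # UTC timezone

comparison_error = None
try:
    naive < aware
except TypeError as exc:
    # Ordering a naive and an aware datetime is refused.
    comparison_error = str(exc)

print(comparison_error)
```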

datetime.datetime has been rejected because it cannot be used for functions using an unspecified starting point like os.times() or time.clock().

For time.time() and time.clock_gettime(time.CLOCK_REALTIME): it is already possible to get the current time as a datetime.datetime object using:

datetime.datetime.now(datetime.timezone.utc)

For os.stat(), it is simple to create a datetime.datetime object from a decimal.Decimal timestamp in the UTC timezone:

datetime.datetime.fromtimestamp(value, datetime.timezone.utc)

Note

datetime.datetime only supports microsecond resolution, but can be enhanced to support nanosecond.

datetime.timedelta

datetime.timedelta is the natural choice for a relative timestamp because it is clear that this type contains a timestamp, whereas int, float and Decimal are raw numbers. It can be used with datetime.datetime to get an absolute timestamp when the starting point is known.

datetime.timedelta has been rejected because it cannot be coerced to float and has a fixed resolution. One new standard timestamp type is enough, and Decimal is preferred over datetime.timedelta. Converting a datetime.timedelta to float requires an explicit call to the datetime.timedelta.total_seconds() method.
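Both limitations are easy to observe. The following sketch uses an arbitrary 1.5 second interval chosen for illustration.

```python
from datetime import timedelta

delta = timedelta(seconds=1, microseconds=500_000)

try:
    float(delta)
    coerced = True
except TypeError:
    # timedelta does not implement coercion to float.
    coerced = False

print(coerced)                # False
print(delta.total_seconds())  # 1.5
print(timedelta.resolution)   # 0:00:00.000001 (fixed at one microsecond)
```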

Note

datetime.timedelta only supports microsecond resolution, but can be enhanced to support nanosecond.

Tuple of integers

To expose C functions in Python, a tuple of integers is the natural choice to store a timestamp because the C language uses structures with integer fields (e.g. timeval and timespec structures). Using only integers avoids the loss of precision (Python supports integers of arbitrary length). Creating and parsing a tuple of integers is simple and fast.

Depending on the exact format of the tuple, the precision can be arbitrary or fixed. The precision can be chosen so that the loss of precision is smaller than an arbitrary limit such as one nanosecond.

Different formats have been proposed:

  • A: (numerator, denominator)
    • value = numerator / denominator
    • resolution = 1 / denominator
    • denominator > 0
  • B: (seconds, numerator, denominator)
    • value = seconds + numerator / denominator
    • resolution = 1 / denominator
    • 0 <= numerator < denominator
    • denominator > 0
  • C: (intpart, floatpart, base, exponent)
    • value = intpart + floatpart / base^exponent
    • resolution = 1 / base^exponent
    • 0 <= floatpart < base^exponent
    • base > 0
    • exponent >= 0
  • D: (intpart, floatpart, exponent)
    • value = intpart + floatpart / 10^exponent
    • resolution = 1 / 10^exponent
    • 0 <= floatpart < 10^exponent
    • exponent >= 0
  • E: (sec, nsec)
    • value = sec + nsec × 10^-9
    • resolution = 10^-9 (nanosecond)
    • 0 <= nsec < 10^9

All formats support an arbitrary resolution, except format (E).

The format (D) may not be able to store the exact value (it may lose precision) if the clock frequency is arbitrary and cannot be expressed as a power of 10. The format (C) has a similar issue, but in such a case, it is possible to use base=frequency and exponent=1.

The formats (C), (D) and (E) allow optimization for conversion to float if the base is 2 and to decimal.Decimal if the base is 10.

The format (A) is a simple fraction. It supports arbitrary precision, is simple (only two fields), only requires a simple division to get the floating point value, and is already used by float.as_integer_ratio().
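As an illustration, formats (A), (B) and (E) can be turned into exact values with a few helpers. This is a sketch under stated assumptions: the function names are hypothetical, and fractions.Fraction is used here only as a convenient exact type; it is not one of the types considered in this PEP.

```python
from fractions import Fraction

def from_format_a(numerator, denominator):
    """Format (A): value = numerator / denominator."""
    return Fraction(numerator, denominator)

def from_format_b(seconds, numerator, denominator):
    """Format (B): value = seconds + numerator / denominator."""
    return seconds + Fraction(numerator, denominator)

def from_format_e(sec, nsec):
    """Format (E): value = sec + nsec * 10^-9."""
    return sec + Fraction(nsec, 10 ** 9)

# A clock ticking 3 times per second: one tick is exactly 1/3 second,
# which no power of 10 can express (the issue noted for format (D)).
print(from_format_a(1, 3))            # 1/3
print(from_format_b(5, 1, 3))         # 16/3
print(from_format_e(1, 500_000_000))  # 3/2
print((0.75).as_integer_ratio())      # (3, 4)
```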

To simplify the implementation (especially the C implementation to avoid integer overflow), a numerator bigger than the denominator can be accepted. The tuple may be normalized later.

Tuples of integers have been rejected because they don’t support arithmetic operations.

Note

On Windows, the QueryPerformanceCounter() clock uses the frequency of the processor, which is an arbitrary number and so may not be a power of 2 or 10. The frequency can be read using QueryPerformanceFrequency().

timespec structure

timespec is the C structure used to store a timestamp with a nanosecond resolution. Python can use a type with the same structure: (seconds, nanoseconds). For convenience, arithmetic operations on timespec are supported.

Example of an incomplete timespec type supporting addition, subtraction and coercion to float:

class timespec(tuple):
    def __new__(cls, sec, nsec):
        if not isinstance(sec, int):
            raise TypeError
        if not isinstance(nsec, int):
            raise TypeError
        # Normalize so that 0 <= nsec < 10**9, carrying the excess into sec.
        asec, nsec = divmod(nsec, 10 ** 9)
        sec += asec
        obj = tuple.__new__(cls, (sec, nsec))
        obj.sec = sec
        obj.nsec = nsec
        return obj

    def __float__(self):
        # Coercion to float may lose precision.
        return self.sec + self.nsec * 1e-9

    def total_nanoseconds(self):
        return self.sec * 10 ** 9 + self.nsec

    def __add__(self, other):
        if not isinstance(other, timespec):
            raise TypeError
        ns_sum = self.total_nanoseconds() + other.total_nanoseconds()
        return timespec(*divmod(ns_sum, 10 ** 9))

    def __sub__(self, other):
        if not isinstance(other, timespec):
            raise TypeError
        ns_diff = self.total_nanoseconds() - other.total_nanoseconds()
        return timespec(*divmod(ns_diff, 10 ** 9))

    def __str__(self):
        if self.sec < 0 and self.nsec:
            # Negative value with a fractional part: format its magnitude.
            sec = abs(1 + self.sec)
            nsec = 10 ** 9 - self.nsec
            return '-%i.%09u' % (sec, nsec)
        else:
            return '%i.%09u' % (self.sec, self.nsec)

    def __repr__(self):
        return '<timespec(%s, %s)>' % (self.sec, self.nsec)

The timespec type is similar to the format (E) of tuples of integers, except that it supports arithmetic and coercion to float.

The timespec type was rejected because it only supports nanosecond resolution and requires implementing each arithmetic operation, whereas the Decimal type is already implemented and well tested.

Alternatives: API design

Add a string argument to specify the return type

Add a string argument to functions returning timestamps, for example: time.time(format="datetime"). A string is more extensible than a type: it is possible to request a format that has no type, like a tuple of integers.

This API was rejected because it would require importing modules implicitly to instantiate objects (e.g. importing datetime to create datetime.datetime). Importing a module may raise an exception and may be slow; such behaviour is unexpected and surprising.

Add a global flag to change the timestamp type

A global flag like os.stat_decimal_times(), similar to os.stat_float_times(), can be added to set the timestamp type globally.

A global flag may cause issues with libraries and applications expecting float instead of Decimal. Decimal is not fully compatible with float: float+Decimal raises a TypeError, for example. The os.stat_float_times() case is different because an int can be coerced to float and int+float gives float.
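The incompatibility is simple to demonstrate. The values below are arbitrary and serve only as an illustration.

```python
from decimal import Decimal

try:
    1.5 + Decimal("1.5")
    mixed_ok = True
except TypeError:
    # float and Decimal cannot be mixed in arithmetic.
    mixed_ok = False

print(mixed_ok)        # False
print(type(1 + 1.5))   # <class 'float'>: int is coerced to float
```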

Add a protocol to create a timestamp

Instead of hard coding how timestamps are created, a new protocol can be added to create a timestamp from a fraction.

For example, time.time(timestamp=type) would call the class method type.__fromfraction__(numerator, denominator) to create a timestamp object of the specified type. If the type doesn’t support the protocol, a fallback is used: type(numerator) / type(denominator).
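The dispatch described above could look roughly like the following. This is a hypothetical sketch of a protocol that was never adopted: make_timestamp() and RationalTimestamp are invented names, and only __fromfraction__ and the fallback expression come from the text.

```python
from decimal import Decimal
from fractions import Fraction

def make_timestamp(numerator, denominator, timestamp=float):
    """Build a timestamp of the requested type from a fraction (hypothetical)."""
    factory = getattr(timestamp, "__fromfraction__", None)
    if factory is not None:
        # The type implements the proposed protocol.
        return factory(numerator, denominator)
    # Fallback: type(numerator) / type(denominator)
    return timestamp(numerator) / timestamp(denominator)

class RationalTimestamp(Fraction):
    """Hypothetical type implementing the proposed protocol."""
    @classmethod
    def __fromfraction__(cls, numerator, denominator):
        return cls(numerator, denominator)

print(make_timestamp(3, 2))                     # 1.5
print(make_timestamp(1, 10 ** 9, Decimal))      # 1E-9
print(make_timestamp(1, 3, RationalTimestamp))  # 1/3
```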

A variant is to use a “converter” callback to create a timestamp. Example creating a float timestamp:

def timestamp_to_float(numerator, denominator):
    return float(numerator) / float(denominator)

Common converters can be provided by time, datetime and other modules, or maybe a specific “hires” module. Users can define their own converters.

Such a protocol has a limitation: the timestamp structure has to be decided once and cannot be changed later. For example, adding a timezone or the absolute start of the timestamp would break the API.

The protocol proposal was considered excessive given the requirements, but the specific syntax proposed (time.time(timestamp=type)) allows it to be introduced later if compelling use cases are discovered.

Note

Other formats may be used instead of a fraction: see the tuple of integers section for example.

Add new fields to os.stat

To get the creation, modification and access time of a file with a nanosecond resolution, three fields can be added to the os.stat() structure.

The new fields can be timestamps with nanosecond resolution (e.g. Decimal) or the nanosecond part of each timestamp (int).

If the new fields are timestamps with nanosecond resolution, populating the extra fields would be time-consuming. Any call to os.stat() would be slower, even if os.stat() is only called to check if a file exists. A parameter can be added to os.stat() to make these fields optional; the structure would then have a variable number of fields.

If the new fields only contain the fractional part (nanoseconds), os.stat() would be efficient. These fields would always be present and so set to zero if the operating system does not support sub-second resolution. Splitting a timestamp in two parts, seconds and nanoseconds, is similar to the timespec type and tuple of integers, and so has the same drawbacks.

Adding new fields to the os.stat() structure does not solve the nanosecond issue in other modules (e.g. the time module).

Add a boolean argument

Because we only need one new type (Decimal), a simple boolean flag can be added. Example: time.time(decimal=True) or time.time(hires=True).

Such a flag would require a hidden import, which is considered bad practice.

The boolean argument API was rejected because it is not “pythonic”. Changing the return type with a parameter value is preferred over a boolean parameter (a flag).

Add new functions

Add new functions for each type, examples:

  • time.clock_decimal()
  • time.time_decimal()
  • os.stat_decimal()
  • os.stat_timespec()
  • etc.

Adding a new function for each function creating timestamps duplicates a lot of code and would be a pain to maintain.

Add a new hires module

Add a new module called “hires” with the same API as the time module, except that it would return timestamps with high resolution, e.g. decimal.Decimal. Adding a new module avoids linking low-level modules like time or os to the decimal module.

This idea was rejected because it requires duplicating most of the code of the time module, would be a pain to maintain, and timestamps are used in modules other than the time module. Examples: signal.sigtimedwait(), select.select(), resource.getrusage(), os.stat(), etc. Duplicating the code of each module is not acceptable.

Links

Other languages:

  • Ruby (1.9.3), the Time class supports picosecond (10^-12)
  • .NET framework, DateTime type: number of 100-nanosecond intervals that have elapsed since 12:00:00 midnight, January 1, 0001. DateTime.Ticks uses a signed 64-bit integer.
  • Java (1.5), System.nanoTime(): wallclock with an unspecified starting point as a number of nanoseconds, using a signed 64-bit integer (long).
  • Perl, Time::HiRes module: uses float, so it has the same loss of precision issue with nanosecond resolution as Python float timestamps

Copyright

This document has been placed in the public domain.


Source: https://github.com/python/peps/blob/main/peps/pep-0410.rst

Last modified: 2025-02-01 08:59:27 GMT

