Python Enhancement Proposals

PEP 757 – C API to import-export Python integers

Author:
Sergey B Kirpichev <skirpichev at gmail.com>, Victor Stinner <vstinner at python.org>
Discussions-To:
Discourse thread
Status:
Final
Type:
Standards Track
Created:
13-Sep-2024
Python-Version:
3.14
Post-History:
14-Sep-2024
Resolution:
08-Dec-2024

Important

This PEP is a historical document. The up-to-date, canonical documentation can now be found at the Export API and the PyLongWriter API.

See PEP 1 for how to propose changes.

Abstract

Add a new C API to import and export Python integers (int objects), in particular the PyLongWriter_Create() and PyLong_Export() functions.

Rationale

Projects such as gmpy2, SAGE and Python-FLINT either access Python “internals” directly (the PyLongObject structure) or use an inefficient temporary format (hex strings for Python-FLINT) to import and export Python int objects. The Python int implementation changed in Python 3.12 to add a tag and “compact values”.

In the 3.13 alpha 1 release, the private undocumented _PyLong_New() function was removed, even though these projects use it to import Python integers. The private function was restored in 3.13 alpha 2.

A public efficient abstraction is needed to interface Python with these projects without exposing implementation details. It would allow Python to change its internals without breaking these projects. For example, the gmpy2 implementation recently had to change for CPython 3.9 and again for CPython 3.12.

Specification

Layout API

Data needed by GMP-like import-export functions.

struct PyLongLayout
Layout of an array of “digits” (“limbs” in the GMP terminology), used to represent the absolute value of arbitrary precision integers.

Use PyLong_GetNativeLayout() to get the native layout of Python int objects, used internally for integers with “big enough” absolute value.

See also sys.int_info, which exposes similar information to Python.

uint8_t bits_per_digit
Bits per digit. For example, a 15 bit digit means that bits 0-14 contain meaningful information.
uint8_t digit_size
Digit size in bytes. For example, a 15 bit digit will require at least 2 bytes.
int8_t digits_order
Digits order:
  • 1 for most significant digit first
  • -1 for least significant digit first
int8_t digit_endianness
Digit endianness:
  • 1 for most significant byte first (big endian)
  • -1 for least significant byte first (little endian)
const PyLongLayout *PyLong_GetNativeLayout(void)
Get the native layout of Python int objects.

See the PyLongLayout structure.

The function must not be called before Python initialization nor after Python finalization. The returned layout is valid until Python is finalized. The layout is the same for all Python sub-interpreters and so it can be cached.
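
As a rough illustration of how these fields relate, the sketch below derives two quantities that GMP-style import/export calls typically need: the number of unused high “nail” bits per digit and the largest value a digit may hold. DemoLayout, demo_nail_bits() and demo_digit_max() are hypothetical names that only mirror the PyLongLayout members so that the example compiles without Python.h. The 30-bit/4-byte and 15-bit/2-byte figures mentioned afterwards are sample data, not values this sketch guarantees for any given build.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in mirroring the members of PyLongLayout. */
typedef struct {
    uint8_t bits_per_digit;
    uint8_t digit_size;
    int8_t digits_order;
    int8_t digit_endianness;
} DemoLayout;

/* Unused high bits in each stored digit ("nails" in GMP terminology). */
static unsigned
demo_nail_bits(const DemoLayout *layout)
{
    assert(layout->bits_per_digit <= layout->digit_size * 8);
    return layout->digit_size * 8u - layout->bits_per_digit;
}

/* Largest value a single digit may hold: (1 << bits_per_digit) - 1. */
static uint64_t
demo_digit_max(const DemoLayout *layout)
{
    assert(layout->bits_per_digit < 64);
    return (UINT64_C(1) << layout->bits_per_digit) - 1;
}
```

With a sample layout of 30 bits stored in 4 bytes, demo_nail_bits() gives 2 and demo_digit_max() gives 0x3FFFFFFF; with 15 bits stored in 2 bytes, they give 1 and 0x7FFF.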

Export API

struct PyLongExport
Export of a Python int object.

There are two cases:
  • If digits is NULL, only use the value member.
  • If digits is not NULL, use the negative, ndigits and digits members.

int64_t value
The native integer value of the exported int object. Only valid if digits is NULL.
uint8_t negative
1 if the number is negative, 0 otherwise. Only valid if digits is not NULL.
Py_ssize_t ndigits
Number of digits in the digits array. Only valid if digits is not NULL.
const void *digits
Read-only array of unsigned digits. Can be NULL.

If PyLongExport.digits is not NULL, a private field of the PyLongExport structure stores a strong reference to the Python int object to make sure that the structure remains valid until PyLong_FreeExport() is called.

int PyLong_Export(PyObject *obj, PyLongExport *export_long)
Export a Python int object.

export_long must point to a PyLongExport structure allocated by the caller. It must not be NULL.

On success, fill in *export_long and return 0. On error, set an exception and return -1.

PyLong_FreeExport() must be called when the export is no longer needed.

CPython implementation detail: This function always succeeds if obj is a Python int object or a subclass.

On CPython 3.14, no memory copy is needed in PyLong_Export(); it’s just a thin wrapper to expose the Python int internal digits array.

void PyLong_FreeExport(PyLongExport *export_long)
Release the export export_long created by PyLong_Export().

CPython implementation detail: Calling PyLong_FreeExport() is optional if export_long->digits is NULL.
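
To make the digits case more concrete, the following sketch shows what a consumer might do with an exported digits array whose value happens to fit in 64 bits. It is standalone C rather than a call into the real API: demo_digits_to_int64() is a hypothetical helper, and it assumes digits stored in uint32_t, least significant digit first, in native byte order. Real code would take those parameters from PyLong_GetNativeLayout() instead of hard-coding them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of the C API): combine `ndigits` digits,
   least significant digit first, each holding `bits_per_digit` meaningful
   bits, into a signed 64-bit value.
   Returns 0 on success, -1 if the value does not fit in int64_t. */
static int
demo_digits_to_int64(const uint32_t *digits, size_t ndigits,
                     unsigned bits_per_digit, int negative, int64_t *out)
{
    assert(bits_per_digit >= 1 && bits_per_digit <= 32);
    uint64_t magnitude = 0;

    /* Walk from the most significant digit down to the least significant. */
    for (size_t i = ndigits; i-- > 0; ) {
        if (magnitude >> (64 - bits_per_digit)) {
            return -1;  /* the next shift would drop high bits */
        }
        magnitude = (magnitude << bits_per_digit) | digits[i];
    }

    /* Apply the sign, which is exported separately from the digits. */
    if (negative) {
        if (magnitude > (uint64_t)INT64_MAX + 1) {
            return -1;
        }
        *out = (magnitude == (uint64_t)INT64_MAX + 1)
                   ? INT64_MIN
                   : -(int64_t)magnitude;
    }
    else {
        if (magnitude > (uint64_t)INT64_MAX) {
            return -1;
        }
        *out = (int64_t)magnitude;
    }
    return 0;
}
```

The sign is applied last because the digits array only represents the absolute value, which matches the separate negative member of the structure.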

Import API

The PyLongWriter API can be used to import an integer.

struct PyLongWriter
A Python int writer instance.

The instance must be destroyed by PyLongWriter_Finish() or PyLongWriter_Discard().

PyLongWriter *PyLongWriter_Create(int negative, Py_ssize_t ndigits, void **digits)
Create a PyLongWriter.

On success, allocate *digits and return a writer. On error, set an exception and return NULL.

negative is 1 if the number is negative, or 0 otherwise.

ndigits is the number of digits in the digits array. It must be greater than 0.

digits must not be NULL.

After a successful call to this function, the caller should fill in the array of digits digits and then call PyLongWriter_Finish() to get a Python int. The layout of digits is described by PyLong_GetNativeLayout().

Digits must be in the range [0; (1 << bits_per_digit) - 1] (where bits_per_digit is the number of bits per digit). Any unused most significant digits must be set to 0.

Alternately, call PyLongWriter_Discard() to destroy the writer instance without creating an int object.

On CPython 3.14, the PyLongWriter_Create() implementation is a thin wrapper to the private _PyLong_New() function.
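
The two constraints on the digits array (each digit within range, unused most significant digits zeroed) can be sketched in standalone C as follows. demo_fill_digits() is a hypothetical helper, not part of the proposed API; it assumes digits stored in uint32_t, least significant digit first, and writes into a caller-provided array where real code would write into the buffer returned through the digits argument of PyLongWriter_Create().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: split `magnitude` into `ndigits` digits of
   `bits_per_digit` bits each, least significant digit first.
   Every digit stays in [0; (1 << bits_per_digit) - 1], and digits above
   the highest non-zero one are written as 0.
   Returns 0 on success, -1 if `ndigits` digits cannot hold the value. */
static int
demo_fill_digits(uint64_t magnitude, unsigned bits_per_digit,
                 uint32_t *digits, size_t ndigits)
{
    assert(bits_per_digit >= 1 && bits_per_digit <= 32);
    const uint64_t mask = (UINT64_C(1) << bits_per_digit) - 1;

    for (size_t i = 0; i < ndigits; i++) {
        digits[i] = (uint32_t)(magnitude & mask);  /* low bits only */
        magnitude >>= bits_per_digit;
    }
    /* Anything left over means the array was too short. */
    return magnitude == 0 ? 0 : -1;
}
```

Once the value is exhausted, the loop keeps storing 0, so an array that is larger than strictly necessary still satisfies the requirement on unused most significant digits.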

PyObject *PyLongWriter_Finish(PyLongWriter *writer)
Finish a PyLongWriter created by PyLongWriter_Create().

On success, return a Python int object. On error, set an exception and return NULL.

The function takes care of normalizing the digits and converts the object to a compact integer if needed.

The writer instance and the digits array are invalid after the call.

void PyLongWriter_Discard(PyLongWriter *writer)
Discard a PyLongWriter created by PyLongWriter_Create().

writer must not be NULL.

The writer instance and the digits array are invalid after the call.

Optimize import for small integers

The proposed import API is efficient for large integers. Compared to accessing Python internals directly, it can have a significant performance overhead on small integers.

For small integers of a few digits (for example, 1 or 2 digits), existing APIs can be used, such as PyLong_FromLong(), PyLong_FromUInt64() or PyLong_FromNativeBytes().

Implementation

Benchmarks

Code:

    /* Query parameters of Python's internal representation of integers. */
    const PyLongLayout *layout = PyLong_GetNativeLayout();

    size_t int_digit_size = layout->digit_size;
    int int_digits_order = layout->digits_order;
    size_t int_bits_per_digit = layout->bits_per_digit;
    size_t int_nails = int_digit_size*8 - int_bits_per_digit;
    int int_endianness = layout->digit_endianness;

Export: PyLong_Export() with gmpy2

Code:

    static int
    mpz_set_PyLong(mpz_t z, PyObject *obj)
    {
        static PyLongExport long_export;

        if (PyLong_Export(obj, &long_export) < 0) {
            return -1;
        }

        if (long_export.digits) {
            mpz_import(z, long_export.ndigits, int_digits_order, int_digit_size,
                       int_endianness, int_nails, long_export.digits);
            if (long_export.negative) {
                mpz_neg(z, z);
            }
            PyLong_FreeExport(&long_export);
        }
        else {
            const int64_t value = long_export.value;

            if (LONG_MIN <= value && value <= LONG_MAX) {
                mpz_set_si(z, value);
            }
            else {
                mpz_import(z, 1, -1, sizeof(int64_t), 0, 0, &value);
                if (value < 0) {
                    mpz_t tmp;
                    mpz_init(tmp);
                    mpz_ui_pow_ui(tmp, 2, 64);
                    mpz_sub(z, z, tmp);
                    mpz_clear(tmp);
                }
            }
        }
        return 0;
    }

Reference code: mpz_set_PyLong() in the gmpy2 master for commit 9177648.

Benchmark:

    import pyperf
    from gmpy2 import mpz

    runner = pyperf.Runner()
    runner.bench_func('1<<7', mpz, 1 << 7)
    runner.bench_func('1<<38', mpz, 1 << 38)
    runner.bench_func('1<<300', mpz, 1 << 300)
    runner.bench_func('1<<3000', mpz, 1 << 3000)

Results on Linux Fedora 40 with CPU isolation, Python built in release mode:

    Benchmark        ref        pep757
    1<<7             91.3 ns    89.9 ns: 1.02x faster
    1<<38            120 ns     94.9 ns: 1.27x faster
    1<<300           196 ns     203 ns: 1.04x slower
    1<<3000          939 ns     945 ns: 1.01x slower
    Geometric mean   (ref)      1.05x faster

Import: PyLongWriter_Create() with gmpy2

Code:

    static PyObject *
    GMPy_PyLong_From_MPZ(MPZ_Object *obj, CTXT_Object *context)
    {
        if (mpz_fits_slong_p(obj->z)) {
            return PyLong_FromLong(mpz_get_si(obj->z));
        }

        size_t size = (mpz_sizeinbase(obj->z, 2) +
                       int_bits_per_digit - 1) / int_bits_per_digit;
        void *digits;
        PyLongWriter *writer = PyLongWriter_Create(mpz_sgn(obj->z) < 0, size,
                                                   &digits);
        if (writer == NULL) {
            return NULL;
        }

        mpz_export(digits, NULL, int_digits_order, int_digit_size,
                   int_endianness, int_nails, obj->z);

        return PyLongWriter_Finish(writer);
    }

Reference code: GMPy_PyLong_From_MPZ() in the gmpy2 master for commit 9177648.
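
The size computation in the code above is a ceiling division: the bit length of the value divided by the number of bits per digit, rounded up. A minimal standalone sketch of just that step is given below; demo_digits_needed() is a hypothetical name, and the bit length would come from mpz_sizeinbase(z, 2) in the gmpy2 code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: number of digits of `bits_per_digit` bits needed
   to store a value that is `nbits` bits long (ceiling division). */
static size_t
demo_digits_needed(size_t nbits, size_t bits_per_digit)
{
    assert(bits_per_digit > 0);
    return (nbits + bits_per_digit - 1) / bits_per_digit;
}
```

For instance, 1<<300 is 301 bits long, so with 30-bit digits this gives 11 digits, the last of which carries a single meaningful bit.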

Benchmark:

    import pyperf
    from gmpy2 import mpz

    runner = pyperf.Runner()
    runner.bench_func('1<<7', int, mpz(1 << 7))
    runner.bench_func('1<<38', int, mpz(1 << 38))
    runner.bench_func('1<<300', int, mpz(1 << 300))
    runner.bench_func('1<<3000', int, mpz(1 << 3000))

Results on Linux Fedora 40 with CPU isolation, Python built in release mode:

    Benchmark        ref        pep757
    1<<7             56.7 ns    56.2 ns: 1.01x faster
    1<<300           191 ns     213 ns: 1.12x slower
    Geometric mean   (ref)      1.03x slower

Benchmark hidden because not significant (2): 1<<38, 1<<3000.

Backwards Compatibility

There is no impact on backward compatibility; only new APIs are added.

Rejected Ideas

Support arbitrary layout

It would be convenient to support arbitrary layout to import-export Python integers.

For example, it was proposed to add a layout parameter to PyLongWriter_Create() and a layout member to the PyLongExport structure.

The problem is that it’s more complex to implement and not really needed. What’s strictly needed is only an API to import-export using the Python “native” layout.

If later there are use cases for arbitrary layouts, new APIs can be added.

Don’t add PyLong_GetNativeLayout() function

Currently, most of the information required for int import/export is already available via PyLong_GetInfo() (and sys.int_info). More could be added (such as the order of digits); this interface does not pose any constraints on the future evolution of PyLongObject.

The problem is that PyLong_GetInfo() returns a Python object (a named tuple), not a convenient C structure, which might steer people away from it and towards e.g. the current semi-private macros like PyLong_SHIFT and PyLong_BASE.

Provide mpz_import/export-like API instead

Another approach to import/export data from int objects would be the following: C extensions provide contiguous buffers, and CPython exports the absolute value of an integer into them (or imports it from them).

API example:

    struct PyLongLayout {
        uint8_t bits_per_digit;
        uint8_t digit_size;
        int8_t digits_order;
    };

    size_t PyLong_GetDigitsNeeded(PyLongObject *obj, PyLongLayout layout);
    int PyLong_Export(PyLongObject *obj, PyLongLayout layout, void *buffer);
    PyLongObject *PyLong_Import(PyLongLayout layout, void *buffer);

This might work for GMP, as it has the mpz_limbs_read() and mpz_limbs_write() functions, which can provide the required access to the internals of mpz_t. Other libraries may require using temporary buffers and then mpz_import/export-like functions on their side.

The major drawback of this approach is that it is much more complex on the CPython side (i.e. the actual conversion between different layouts). For example, the implementation of PyLong_FromNativeBytes() and PyLong_AsNativeBytes() (which together provide a restricted version of the required API) in CPython took ~500 LOC (cf. ~100 LOC in the current implementation).

Drop value field from the export API

With this suggestion, only one export type would exist (an array of “digits”). If such a view is not available for a given integer, it would either be emulated by the export functions, or PyLong_Export() would return an error. In both cases, it is assumed that users will use other C-API functions to get “small enough” integers (i.e. those that fit into some machine integer type), like PyLong_AsLongAndOverflow(). PyLong_Export() would be inefficient (or just fail) in this case.

An example:

    static int
    mpz_set_PyLong(mpz_t z, PyObject *obj)
    {
        int overflow;
    #if SIZEOF_LONG == 8
        long value = PyLong_AsLongAndOverflow(obj, &overflow);
    #else
        /* Windows has 32-bit long, so use 64-bit long long instead */
        long long value = PyLong_AsLongLongAndOverflow(obj, &overflow);
    #endif
        Py_BUILD_ASSERT(sizeof(value) == sizeof(int64_t));

        if (!overflow) {
            if (LONG_MIN <= value && value <= LONG_MAX) {
                mpz_set_si(z, (long)value);
            }
            else {
                mpz_import(z, 1, -1, sizeof(int64_t), 0, 0, &value);
                if (value < 0) {
                    mpz_t tmp;
                    mpz_init(tmp);
                    mpz_ui_pow_ui(tmp, 2, 64);
                    mpz_sub(z, z, tmp);
                    mpz_clear(tmp);
                }
            }
        }
        else {
            static PyLongExport long_export;

            if (PyLong_Export(obj, &long_export) < 0) {
                return -1;
            }
            mpz_import(z, long_export.ndigits, int_digits_order, int_digit_size,
                       int_endianness, int_nails, long_export.digits);
            if (long_export.negative) {
                mpz_neg(z, z);
            }
            PyLong_FreeExport(&long_export);
        }
        return 0;
    }

This might look like a simplification from the API designer’s point of view, but it would be less convenient for end users. They would have to follow Python development, benchmark different variants for exporting small integers (is it obvious why the case above was chosen instead of PyLong_AsInt64()?), and maybe support different code paths for various CPython versions or across different Python implementations.

Discussions

Copyright

This document is placed in the public domain or under the CC0-1.0-Universal license, whichever is more permissive.


Source: https://github.com/python/peps/blob/main/peps/pep-0757.rst

Last modified: 2024-12-16 07:23:59 GMT

