Movatterモバイル変換


[0]ホーム

URL:


homepage

Issue25402

This issue trackerhas been migrated toGitHub, and is currentlyread-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title:More accurate estimation of the number of digits in int to decimal string conversion
Type:behaviorStage:resolved
Components:Interpreter CoreVersions:Python 3.6
process
Status:closedResolution:fixed
Dependencies:Superseder:
Assigned To: mark.dickinsonNosy List: mark.dickinson, python-dev, serhiy.storchaka, vstinner
Priority:normalKeywords:patch

Created on2015-10-14 11:42 byserhiy.storchaka, last changed2022-04-11 14:58 byadmin. This issue is nowclosed.

Files
File nameUploadedDescriptionEdit
long_to_decimal_string_number_of_digits.patchserhiy.storchaka,2015-10-14 11:42review
estimate_decimalbase_digits.pyvstinner,2015-10-15 08:34
Messages (5)
msg252985 -(view)Author: Serhiy Storchaka (serhiy.storchaka)*(Python committer)Date: 2015-10-14 11:42
Int to decimal string conversion (function long_to_decimal_string_internal() atObjects/longobject.c:1583) has a limitation. On 32-bit platform you can't convert integers larger than 2**2**31 (10**646456993). Proposed patch removes this limitation [*].It also decreases memory requirements for intermediate buffer on 10%. The size of intermediate buffer (in digits) depends on the size of the integer. Unpatched:For 15-bit digits: size*15/4/3 = size*1.25For 30-bit digits: size*30/9/3 = size*1.11Patched:For 15-bit digits: size*15/4/3.3 = size*1.14For 30-bit digits: size*30/9/3.3 = size*1.01[*] Converting such large integers to decimal string can be not finished for reasonable time, because it has quadratic complexity. On my netbook the estimated time of calculating str(2**2**31) is 5 years. But this is different issue.
msg253036 -(view)Author: STINNER Victor (vstinner)*(Python committer)Date: 2015-10-15 08:34
Currently, the code uses Py_ABS(Py_SIZE(x))*PyLong_SHIFT to estimate the upper-bound of the number of bits of the number x. It's a raw estimation, the difference can be up to 29 bits. We may try to compute the exact number of bits, x.bit_length().Python 3.5 estimate the number of "decimalbase" (10**9) digits using:def decimalbase_digits1(x):    bits = size(x) * PyLong_SHIFT    return 1 + bits // (3 * _PyLong_DECIMAL_SHIFT)I wrote a test to compute how many digits are overallocated (and unused): 552961 for this function. I'm not sure that "1+" is needed, since 3.0 is already lower than log2(10) (3.32...). If we compute the exact number of bits using the Python 3.5 function, it's a little bit better:def decimalbase_digits2(x):    bits = x.bit_length()    return 1 + bits // (3 * _PyLong_DECIMAL_SHIFT)=> 546250 digits (1% less). You propose a better estimation:def decimalbase_digits3(x):    digits = size(x)    d = (33 * _PyLong_DECIMAL_SHIFT) // (10 * PyLong_SHIFT - 33 * _PyLong_DECIMAL_SHIFT)    return 1 + digits + digits // dWith your estimation, only 504243 are overallocated (9% less than Python 3.5 function). But why only using 2 digits for log2(10) estimation? We can more digits:def decimalbase_digits4(x):    bits = size(x) * PyLong_SHIFT    return bits * 10000 // (33219 * _PyLong_DECIMAL_SHIFT)=> 491908 digits (11% less)According to my tests, the best function uses the number of bits and the better estimation of log2(10):def new_decimalbase_digits5(x):    bits = x.bit_length()    return bits * 10000 // (33219 * _PyLong_DECIMAL_SHIFT)=> 483424 digits (13% less)See attached for my tests.
msg253037 -(view)Author: Serhiy Storchaka (serhiy.storchaka)*(Python committer)Date: 2015-10-15 09:40
> But why only using 2 digits for log2(10) estimation?Because the difference between 3 and 3.3 is 10%, and the difference between 3.3 and exact log2(10) is only 1%. Yes, we can use more digits, but the effect of any additional digit is decreased in geometric progression.If use simple and fast formula that avoids integer multiplication overflow "digits + digits // d", the effect of additional precision is virtually vanished.>>> PyLong_SHIFT, _PyLong_DECIMAL_SHIFT = 15, 4>>> (3 * _PyLong_DECIMAL_SHIFT) // (1 * PyLong_SHIFT - 3 * _PyLong_DECIMAL_SHIFT)4>>> (33 * _PyLong_DECIMAL_SHIFT) // (10 * PyLong_SHIFT - 33 * _PyLong_DECIMAL_SHIFT)7>>> (33219 * _PyLong_DECIMAL_SHIFT) // (10000 * PyLong_SHIFT - 33219 * _PyLong_DECIMAL_SHIFT)7>>> PyLong_SHIFT, _PyLong_DECIMAL_SHIFT = 30, 9>>> (3 * _PyLong_DECIMAL_SHIFT) // (1 * PyLong_SHIFT - 3 * _PyLong_DECIMAL_SHIFT)9>>> (33 * _PyLong_DECIMAL_SHIFT) // (10 * PyLong_SHIFT - 33 * _PyLong_DECIMAL_SHIFT)99>>> (332 * _PyLong_DECIMAL_SHIFT) // (100 * PyLong_SHIFT - 332 * _PyLong_DECIMAL_SHIFT)249>>> (3321 * _PyLong_DECIMAL_SHIFT) // (1000 * PyLong_SHIFT - 3321 * _PyLong_DECIMAL_SHIFT)269>>> (33219 * _PyLong_DECIMAL_SHIFT) // (10000 * PyLong_SHIFT - 33219 * _PyLong_DECIMAL_SHIFT)290>>> (332192 * _PyLong_DECIMAL_SHIFT) // (100000 * PyLong_SHIFT - 332192 * _PyLong_DECIMAL_SHIFT)291>>> (332192809 * _PyLong_DECIMAL_SHIFT) // (100000000 * PyLong_SHIFT - 332192809 * _PyLong_DECIMAL_SHIFT)291The best approximation with minimal denominator is 485/146.>>> PyLong_SHIFT, _PyLong_DECIMAL_SHIFT = 15, 4>>> (485 * _PyLong_DECIMAL_SHIFT) // (146 * PyLong_SHIFT - 485 * _PyLong_DECIMAL_SHIFT)7>>> PyLong_SHIFT, _PyLong_DECIMAL_SHIFT = 30, 9>>> (485 * _PyLong_DECIMAL_SHIFT) // (146 * PyLong_SHIFT - 485 * _PyLong_DECIMAL_SHIFT)291
msg273869 -(view)Author: Roundup Robot (python-dev)(Python triager)Date: 2016-08-29 16:26
New changeset1902e1d79e25 by Mark Dickinson in branch 'default':Issue#25402: in int-to-decimal-string conversion, reduce intermediate storage requirements and relax restriction on converting large integers. Patch by Serhiy Storchaka.https://hg.python.org/cpython/rev/1902e1d79e25
msg273870 -(view)Author: Mark Dickinson (mark.dickinson)*(Python committer)Date: 2016-08-29 16:30
Patch and analysis LGTM. Thanks!
History
DateUserActionArgs
2022-04-11 14:58:22adminsetgithub: 69588
2016-08-29 16:30:48mark.dickinsonsetstatus: open -> closed
messages: +msg273870

assignee:mark.dickinson
resolution: fixed
stage: patch review -> resolved
2016-08-29 16:26:58python-devsetnosy: +python-dev
messages: +msg273869
2015-10-15 11:04:24r.david.murraysettitle: Accurater estimation of the number of digits in int to decimal string conversion -> More accurate estimation of the number of digits in int to decimal string conversion
2015-10-15 09:40:01serhiy.storchakasetmessages: +msg253037
2015-10-15 08:34:15vstinnersetfiles: +estimate_decimalbase_digits.py

messages: +msg253036
2015-10-15 07:29:37vstinnersetnosy: +vstinner
2015-10-14 11:42:43serhiy.storchakacreate
Supported byThe Python Software Foundation,
Powered byRoundup
Copyright © 1990-2022,Python Software Foundation
Legal Statements

[8]ページ先頭

©2009-2026 Movatter.jp