Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

gh-127937: convert decimal module to use import API for ints (PEP 757)#127925

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Merged
vstinner merged 35 commits intopython:mainfromskirpichev:long_export-decimal
Jan 24, 2025

Conversation

skirpichev
Copy link
Member

@skirpichevskirpichev commentedDec 13, 2024
edited
Loading

Benchmarkrefpatch
int(Decimal(1<<7))648 ns474 ns: 1.37x faster
int(Decimal(1<<38))740 ns501 ns: 1.48x faster
int(Decimal(1<<300))2.06 us2.02 us: 1.02x faster
int(Decimal(1<<3000))115 us115 us: 1.00x faster
Geometric mean(ref)1.20x faster
>>> sys.int_info[:2](30, 4)
# bench_Decimal-to-int.pyimportpyperffromdecimalimportDecimalvalues= ['1<<7','1<<38','1<<300','1<<3000']runner=pyperf.Runner()forvinvalues:d=Decimal(eval(v))bn='int(Decimal('+v+'))'runner.bench_func(bn,int,d)

@picnixz
Copy link
Member

hide _PyLong_FromDigits()? it's not used outside of the longobject.c anymore

Let's not hide this. Maybe someone is using it (it was removed then restored IIRC).

news

Not needed I think, unless you want to indicate the performance gain (it's always nice to know that something is faster). I did report the improvements offnmatch.translate, so I think you can report those improvements as well.

skirpichevand others added2 commitsDecember 14, 2024 03:40
Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>
n = (mpd_sizeinbase(x, 2) + bpd - 1) / bpd;
PyLongWriter *writer = PyLongWriter_Create(mpd_isnegative(x), n,
(void**)&ob_digit);
/* mpd_sizeinbase can overestimate size by 1 digit, set it to zero. */
Copy link
MemberAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

BTW, this looks as a bug in the mpdecimal. C.f. the GNU GMP, the mpz_sizeinbase docs says: "If base is a power of 2, the result is always exact".

@skirpichev
Copy link
MemberAuthor

Let's not hide this. Maybe someone is using it (it was removed then restored IIRC).

I've updated the pr descriptions with my research. So far, I've found just one use case.

At least, I think we should deprecate (not soft) this. This apparently affects not so much projects and there is now a public alternative.@picnixz, what do you think?

@skirpichevskirpichev marked this pull request as ready for reviewDecember 14, 2024 01:05
@picnixz
Copy link
Member

At least, I think we should deprecate (not soft) this

I would be fine with deprecating it, saying which alternative to use, so that we can simply remove it in some later versions. I think Victor was the one who removed and restored it so we should ask him as well.

@picnixz
Copy link
Member

should dec_from_long() be modified here? (To use the PyLong_Export API.) I would prefer to do this in a separate PR.

If you prefer doing it in a follow-up PR because you fear it would be too hard to review, then it's better. If the change is minimal, we can do it this one (I didn't check the code to change)

@skirpichev
Copy link
MemberAuthor

If the change is minimal, we can do it this one

You can estimate them looking on the gmpy2 pr (referenced in the PEP):aleaxit/gmpy#495 In principle, I don't think that this will complicate review to much. On another hand, changes looks logically independent. I would rather include here deprecation.

@picnixz
Copy link
Member

picnixz commentedDec 14, 2024
edited
Loading

  • Let's changedec_from_long in another PR since the changes are independent (sorry it's 3 AM here and I don't have much energy).
  • For deprecating_PyLong_FromDigits, maybe it's better to make a separate PR so that we have a dedicated NEWS entry and re-use the issue that actually removed the private API (and not the issue that reverted the removal). WDYT? (we would also be able to changePyLong_Copy accordingly)
skirpichev reacted with thumbs up emoji

@skirpichev

This comment was marked as outdated.

@skirpichevskirpichev marked this pull request as draftDecember 14, 2024 05:07
@skirpichevskirpichev changed the titlegh-102471: convert decimal module to use PyLongWriter API (PEP 757)gh-102471: convert decimal module to use import/export API for ints (PEP 757)Dec 14, 2024
@skirpichevskirpichev marked this pull request as ready for reviewDecember 14, 2024 07:10
* cleanup: forgotten PyLongWriter_Discard, pylong variable* clarify news
Copy link
Member

@vstinnervstinner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

LGTM.

mpd_qexport_*() functions used here with assumption, that no resizing
occur, i.e. len was obtained by a call to mpd_sizeinbase.

IMO it's a reasonable trade-off and an acceptable risk.

@serhiy-storchaka
Copy link
Member

It is not guaranteed, and there is no way to enforce that resize does not occur inmpd_qexport_*() functions.

How to estimate the risk? If Python has undefined behavior in one of billion cases, is it acceptable risk?

Co-authored-by: Victor Stinner <vstinner@python.org>
@skirpichev
Copy link
MemberAuthor

skirpichev commentedJan 8, 2025
edited
Loading

It is not guaranteed, and there is no way to enforce that resize does not occur in mpd_qexport_*() functions.

@serhiy-storchaka, we have a confirmation from the library author, that this expectation is correct, unless libm is broken. I guess it's not just one place where we depend on quality of system libraries.

Or do you believe that mpd_sizeinbase() can underestimate size with correct log10? If so, it's a bug. Lets just fix one. Here is the function (IIRC it's same in latest upstream version):

size_t
mpd_sizeinbase(constmpd_t*a,uint32_tbase)
{
doublex;
size_tdigits;
doubleupper_bound;
assert(mpd_isinteger(a));
assert(base >=2);
if (mpd_iszero(a)) {
return1;
}
digits=a->digits+a->exp;
#ifdefCONFIG_64
/* ceil(2711437152599294 / log10(2)) + 4 == 2**53 */
if (digits>2711437152599294ULL) {
returnSIZE_MAX;
}
upper_bound= (double)((1ULL<<53)-1);
#else
upper_bound= (double)(SIZE_MAX-1);
#endif
x= (double)digits /log10(base);
return (x>upper_bound) ?SIZE_MAX : (size_t)x+1;
}

Edit:
In fact, we need much simpler case, asbase is a power of 2. So, we wantndigits * log2(10)/shift. This should be a correct bound:

(size_t)(3.321928094887363*((ndigits + shift - 1)/shift))

Forshift=30 andndigits ~1<<53 (upper_bound for typical case) - it will overestimate size in just 1 digit.

@vstinner
Copy link
Member

@picnixz: Would you mind to review the latest PR version? It changed a lot since last month.

Copy link
Member

@picnixzpicnixz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

A few final comments on English wording and some variables. Otherwise, LGTM. Sorry Victor, the ping got under my radar.

Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>
@vstinner
Copy link
Member

@serhiy-storchaka: Are you ok with this change? Or are you worried about the mpd_sizeinbase() issue?

@skirpichev
Copy link
MemberAuthor

I would prefer not do this, but to make some progress we could introduce a temporary buffer.@vstinner?

@skirpichev
Copy link
MemberAuthor

Ok, with a buffer (latest change) I got something like this:

Benchmarkrefpatch
int(Decimal(1<<7))637 ns480 ns: 1.33x faster
int(Decimal(1<<38))735 ns514 ns: 1.43x faster
int(Decimal(1<<300))2.06 us2.12 us: 1.03x slower
Geometric mean(ref)1.16x faster

Benchmark hidden because not significant (1): int(Decimal(1<<3000))

I'm not sure if the third case is a real speed regression or just a noise. But I think it can be acceptable, as this clear Serhiy concerns, that blocked the pr before. Hence, I push this.

@vstinner
@picnixz

Copy link
Member

@serhiy-storchakaserhiy-storchaka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

LGTM. Later we can optimize it more. For very large values the total time is dominated by non-linear conversion time, so additional allocation and copying does not matter.

skirpichev reacted with hooray emoji
@picnixz
Copy link
Member

LGTM (I've reviewed the two new commits and it's fine).

Copy link
Member

@vstinnervstinner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

LGTM

@vstinnervstinnerenabled auto-merge (squash)January 24, 2025 10:59
@vstinnervstinner merged commit3d8fc8b intopython:mainJan 24, 2025
41 checks passed
@skirpichevskirpichev deleted the long_export-decimal branchJanuary 24, 2025 11:10
@vstinner
Copy link
Member

Merged, thanks@skirpichev for the tedious change!

@skirpichev
Copy link
MemberAuthor

Thanks for reviews. I'll open an issue to track possible improvement (no temporary buffer).

Sign up for freeto join this conversation on GitHub. Already have an account?Sign in to comment
Reviewers

@picnixzpicnixzpicnixz approved these changes

@vstinnervstinnervstinner approved these changes

@serhiy-storchakaserhiy-storchakaserhiy-storchaka approved these changes

Assignees
No one assigned
Labels
None yet
Projects
None yet
Milestone
No milestone
Development

Successfully merging this pull request may close these issues.

4 participants
@skirpichev@picnixz@vstinner@serhiy-storchaka

[8]ページ先頭

©2009-2025 Movatter.jp