Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

gh-133703: dict: fix calculate_log2_keysize()#133809

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Merged
methane merged 3 commits intopython:mainfrommethane:fix-dict-keysize
May 11, 2025

Conversation

methane
Copy link
Member

@methanemethane commentedMay 10, 2025
edited by bedevere-appbot
Loading

@methanemethane added type-bugAn unexpected behavior, bug, or error needs backport to 3.13bugs and security fixes needs backport to 3.14bugs and security fixes labelsMay 10, 2025
@angela-tarantula
Copy link
Contributor

angela-tarantula commentedMay 10, 2025
edited
Loading

Good PR. Since this code is ahotspot where every instruction matters, I found a way to shave off a few more instructions. This doesn't have to block the PR, but it's something we could consider in a follow-up optimization.

_Py_bit_length() doesn't need to checkif (x != 0) in this context.

Look at the GCC/Clang path:

_Py_bit_length(unsigned longx)
{
#if (defined(__clang__)|| defined(__GNUC__))
if (x!=0) {
// __builtin_clzl() is available since GCC 3.4.
// Undefined behavior for x == 0.
return (int)sizeof(unsigned long)*8-__builtin_clzl(x);
}
else {
return0;
}
#elif defined(_MSC_VER)

We don't need to checkif (x != 0) becausex == Py_MAX(minsize, PyDict_MINSIZE) - 1 is guaranteed nonzero, so we end up paying a compare+branch there for no reason. Likewise, we don't need to checkif (_BitScanReverse(&msb, x)) in the MSVC branch:

#elif defined(_MSC_VER)
// _BitScanReverse() is documented to search 32 bits.
Py_BUILD_ASSERT(sizeof(unsigned long) <=4);
unsigned longmsb;
if (_BitScanReverse(&msb,x)) {
return (int)msb+1;
}
else {
return0;
}
#else

If the tradeoff is worth it, we can create a_Py_bit_length_nonzero() that is optimized for the guarantee thatx > 0. With your go-ahead, I could file an issue for that. I only bring this up because yousuggested earlier thatPy_MAX()'s conditional logic almost made it inferior to bitwise operations simply due to the overhead of extra instructions on arm64 and amd64.

@methane
Copy link
MemberAuthor

_Py_bit_length is inline function. compiler optimize away unnecessary 0 comparison.

@methane
Copy link
MemberAuthor

I had suggested avoiding Py_MAX because it can not be eliminated statically.

@methanemethane merged commit92337f6 intopython:mainMay 11, 2025
39 checks passed
@methanemethane deleted the fix-dict-keysize branchMay 11, 2025 05:44
@miss-islington-app
Copy link

Thanks@methane for the PR 🌮🎉.. I'm working now to backport this PR to: 3.13, 3.14.
🐍🍒⛏🤖

miss-islington pushed a commit to miss-islington/cpython that referenced this pull requestMay 11, 2025
(cherry picked from commit92337f6)Co-authored-by: Inada Naoki <songofacandy@gmail.com>
@miss-islington-app
Copy link

Sorry,@methane, I could not cleanly backport this to3.13 due to a conflict.
Please backport usingcherry_picker on command line.

cherry_picker 92337f666e8a076a68305a8d6dc8bc9c095000e9 3.13

@bedevere-app
Copy link

GH-133861 is a backport of this pull request to the3.14 branch.

@bedevere-appbedevere-appbot removed the needs backport to 3.14bugs and security fixes labelMay 11, 2025
@bedevere-app
Copy link

GH-133862 is a backport of this pull request to the3.13 branch.

@bedevere-appbedevere-appbot removed the needs backport to 3.13bugs and security fixes labelMay 11, 2025
methane added a commit that referenced this pull requestMay 11, 2025
methane added a commit that referenced this pull requestMay 11, 2025
(cherry picked from commit92337f6)Co-authored-by: Inada Naoki <songofacandy@gmail.com>
@bedevere-bot
Copy link

⚠️⚠️⚠️ Buildbot failure⚠️⚠️⚠️

Hi! The buildbotPPC64LE RHEL8 3.14 (tier-2) has failed when building commit5c9f0ae.

What do you need to do:

  1. Don't panic.
  2. Checkthe buildbot page in the devguide if you don't know what the buildbots are or how they work.
  3. Go to the page of the buildbot that failed (https://buildbot.python.org/#/builders/1688/builds/8) and take a look at the build logs.
  4. Check if the failure is related to this commit (5c9f0ae) or if it is a false positive.
  5. If the failure is related to this commit, please, reflect that on the issue and make a new Pull Request with a fix.

You can take a look at the buildbot page here:

https://buildbot.python.org/#/builders/1688/builds/8

Failed tests:

  • test.test_multiprocessing_spawn.test_processes

Failed subtests:

  • test_interrupt - test.test_multiprocessing_spawn.test_processes.WithProcessesTestProcess.test_interrupt

Summary of the results of the build (if available):

==

Click to see traceback logs
Traceback (most recent call last):  File"/home/buildbot/buildarea/3.14.cstratak-RHEL8-ppc64le/build/Lib/test/_test_multiprocessing.py", line578, intest_interruptself.assertEqual(exitcode,1)~~~~~~~~~~~~~~~~^^^^^^^^^^^^^AssertionError:-2 != 1Traceback (most recent call last):  File"<string>", line1, in<module>from multiprocessing.spawnimport spawn_main; spawn_main(tracker_fd=5,pipe_handle=9)~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^  File"/home/buildbot/buildarea/3.14.cstratak-RHEL8-ppc64le/build/Lib/multiprocessing/spawn.py", line122, inspawn_main    exitcode= _main(fd, parent_sentinel)  File"/home/buildbot/buildarea/3.14.cstratak-RHEL8-ppc64le/build/Lib/multiprocessing/spawn.py", line132, in_mainself= reduction.pickle.load(from_parent)  File"/home/buildbot/buildarea/3.14.cstratak-RHEL8-ppc64le/build/Lib/test/test_multiprocessing_spawn/test_processes.py", line2, in<module>from test._test_multiprocessingimport install_tests_in_module_dict  File"/home/buildbot/buildarea/3.14.cstratak-RHEL8-ppc64le/build/Lib/test/_test_multiprocessing.py", line2766, in<module>classSayWhenError(ValueError):passKeyboardInterruptFAILTraceback (most recent call last):  File"<string>", line1, in<module>from multiprocessing.spawnimport spawn_main; spawn_main(tracker_fd=6,pipe_handle=8)~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^  File"/home/buildbot/buildarea/3.14.cstratak-RHEL8-ppc64le/build/Lib/multiprocessing/spawn.py", line122, inspawn_main    exitcode= _main(fd, parent_sentinel)  File"/home/buildbot/buildarea/3.14.cstratak-RHEL8-ppc64le/build/Lib/multiprocessing/spawn.py", line132, in_mainself= reduction.pickle.load(from_parent)  File"/home/buildbot/buildarea/3.14.cstratak-RHEL8-ppc64le/build/Lib/test/test_multiprocessing_spawn/test_processes.py", line2, in<module>from test._test_multiprocessingimport install_tests_in_module_dict  File"<frozen importlib._bootstrap>", line1371, in_find_and_load  File"<frozen importlib._bootstrap>", line1342, in_find_and_load_unlocked  File"<frozen importlib._bootstrap>", line938, in_load_unlocked  File"<frozen importlib._bootstrap_external>", line758, inexec_module  File"<frozen importlib._bootstrap_external>", line891, inget_code  File"<frozen importlib._bootstrap_external>", line514, in_compile_bytecodeKeyboardInterruptFAIL

@angela-tarantula
Copy link
Contributor

I had suggested avoiding Py_MAX because it can not be eliminated statically.

@methane Sorry for misunderstanding, I just couldn't find the comment anymore where you said that. I think it's gone.

_Py_bit_length is inline function. compiler optimize away unnecessary 0 comparison.

You're right about most modern compilers, but not all of them. I tested the no-zero-check concept on MSVC x86, a Tier 1 platform, and it does actually shave a few assembly instructions.

Notice the conditional je is present after the bsr:

_calculate_log2_keysize PROC        mov     eax, 8        cmp     DWORD PTR _minsize$[esp-4], eax        cmovg   eax, DWORD PTR _minsize$[esp-4]        dec     eax        bsr     eax, eax        je      SHORT $LN4@calculate_        inc     eax        ret     0$LN4@calculate_:        xor     al, al        ret     0_calculate_log2_keysize ENDP

But not here:

_calculate_log2_keysize_nonzero PROC        mov     ecx, 8        cmp     DWORD PTR _minsize$[esp-4], ecx        cmovg   ecx, DWORD PTR _minsize$[esp-4]        dec     ecx        bsr     eax, ecx        inc     al        ret     0_calculate_log2_keysize_nonzero ENDP

You can see it for yourselfhere. Yeah, it's "only" a je + unused branch, but it's unexpectedly running on everydictresize on 32-bit Windows for no reason, so I thought I'd tell you.

@methane
Copy link
MemberAuthor

You're right about most modern compilers, but not all of them. I tested the no-zero-check concept on MSVC x86, a Tier 1 platform, and it does actually shave a few assembly instructions.

We don't use_Py_bit_length on MSVC. We have nonzero specialization for it already.

@methane
Copy link
MemberAuthor

You're right about most modern compilers, but not all of them. I tested the no-zero-check concept on MSVC x86, a Tier 1 platform, and it does actually shave a few assembly instructions.

We don't use_Py_bit_length on MSVC. We have nonzero specialization for it already.

Sorry, I was wrong. We use _Py_bit_length on x86-32.
But I don't feel enough attraction of it. Tier 1 doesn't mean 100% effort for optimization and dictresize is not hot as dict lookup.

@angela-tarantula
Copy link
Contributor

That's understandable.

Sign up for freeto join this conversation on GitHub. Already have an account?Sign in to comment
Reviewers

@markshannonmarkshannonAwaiting requested review from markshannonmarkshannon is a code owner

Assignees

@methanemethane

Labels
type-bugAn unexpected behavior, bug, or error
Projects
None yet
Milestone
No milestone
Development

Successfully merging this pull request may close these issues.

3 participants
@methane@angela-tarantula@bedevere-bot

[8]ページ先頭

©2009-2025 Movatter.jp