GH-100425: Timing experiment: For builtin_sum, try replacing Fast2Sum with 2Sum #100860
I see no difference either, on Linux with an AMD Zen 2 chip.
Both with and without optimizations I see no difference. System: Linux, gcc version 9.4.0 (Ubuntu 9.4.0-1ubuntu1~20.04.1), Intel(R) Core(TM) i7-4710MQ CPU @ 2.50GHz
Thank you both. It would be nice to hear from a Windows person as well.
On Windows (default PCbuild/build.bat, no PGO) the timings vary a lot on my system (Intel(R) Core(TM) i7-4710MQ CPU @ 2.50GHz, Windows 10, VS 2019). For this PR, measurements within 5 minutes: I can confirm that the minimum time for the test is roughly the same for main and this PR.
Thank you. I appreciate it. @mdickinson Given that 2Sum and Fast2Sum have the same performance in the context of builtin.sum(), do we have a non-performance reason to choose one over the other? Or should I leave the
@rhettinger Leaving as-is sounds good to me. The two should be functionally identical, so performance is just about the only thing that would justify choosing one over the other.
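For readers unfamiliar with the two error-free transformations being compared, here is a minimal Python sketch of the textbook forms. It is not the C code from this PR; the function names `fast_two_sum` and `two_sum` are chosen here for illustration. Fast2Sum needs the larger-magnitude operand first (hence a comparison before the call), while 2Sum is branch-free at the cost of three extra floating-point operations. Assuming round-to-nearest and no overflow, both return the rounded sum together with the exact rounding error.

```python
def fast_two_sum(a, b):
    """Fast2Sum (Dekker): caller must ensure abs(a) >= abs(b)."""
    s = a + b            # rounded sum
    t = b - (s - a)      # exact rounding error of a + b
    return s, t


def two_sum(a, b):
    """2Sum (Knuth): no ordering requirement, no branch."""
    s = a + b            # rounded sum
    b_virtual = s - a    # the part of s that came from b
    a_virtual = s - b_virtual
    t = (a - a_virtual) + (b - b_virtual)  # exact rounding error
    return s, t


def fast_two_sum_any_order(a, b):
    """Hypothetical wrapper: order the operands, then apply Fast2Sum."""
    if abs(a) < abs(b):
        a, b = b, a
    return fast_two_sum(a, b)


if __name__ == "__main__":
    # The low-order 1.0 is lost in the rounded sum and recovered as the error.
    print(two_sum(1e16, 1.0))
    print(fast_two_sum_any_order(1.0, 1e16))
```

Because both variants recover the same exact error term, swapping one for the other should not change any result, which is why only timing is at stake in this thread.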

On the Apple M1 Max, this change makes no difference. I get 303/304 nsec per loop before and after the edit.
Would anyone care to run this on their builds and report back the results?
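The exact benchmark command is not reproduced in this thread, so the following is only a sketch of how such a measurement could be taken with the standard `timeit` module. The helper name `time_float_sum`, the list length, and the repeat counts are all assumptions made for illustration. It reports the minimum over several repeats, which is less sensitive to system noise than the mean.

```python
import timeit


def time_float_sum(n=1000, repeat=5, number=2000):
    """Return the best observed time, in nanoseconds, for one sum(data) call.

    n, repeat and number are illustrative defaults, not values from the PR.
    """
    data = [0.1] * n  # float-only input so sum() takes its float path
    timer = timeit.Timer("sum(data)", globals={"data": data})
    best = min(timer.repeat(repeat=repeat, number=number))
    return best / number * 1e9


if __name__ == "__main__":
    print(f"{time_float_sum():.0f} nsec per loop")
```

Running the same script against a build of main and a build of this PR, on an otherwise idle machine, gives two numbers that can be compared directly.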