- Notifications
You must be signed in to change notification settings - Fork401
Optimize divisions and remainders by constants.#5167
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.
Already on GitHub?Sign in to your account
base:main
Are you sure you want to change the base?
Conversation
b0ca504 to0ef8849Compareeee96e7 to864ebaeCompare3eac5c3 to615ec6aComparesjrd commentedJul 15, 2025
@gzm0 This PR is now ready for review. It sent me through many detours before I could finish it. :p |
95a590f tobe78b6aCompare32113aa to467dfceCompare467dfce to048e31fCompare
gzm0 left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
I'm releasing a partial review to share some progress.
The overall architecture looks good, I'm mostly still trying to understand the relevant pieces of Hacker's Delight so I can cross match them with the code.
All files that do not have an explicit comment are fully reviewed at this point.
| * be available when we link a javalib < 1.20. | ||
| * | ||
| * Its user-land IR is easy enough to hard-code. However, we lose the | ||
| * handling as a RuntimeLOng intrinsic done by the optimizer. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
| * handlingasaRuntimeLOng intrinsic done by the optimizer. | |
| * handlingasaRuntimeLong intrinsic done by the optimizer. |
| implicitscope:Scope):TailRec[Tree]= { | ||
| implicitvalpos= pretrans.pos | ||
| defexpand(methodName:MethodName,targs:PreTransform*):TailRec[Tree]= { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
| defexpandLongOp(methodName:MethodName,targs:PreTransform*):TailRec[Tree]= { |
(for me, the diff renders wrong, I mean renameexpand ->expandLongOp).
| privatevalintrinsics= | ||
| Intrinsics.buildIntrinsics(config.coreSpec.esFeatures, isWasm) | ||
| privatelazyvalintegerDivisions= |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
IIUC the ctor doesn't do anything.
| privatelazyvalintegerDivisions= | |
| privatevalintegerDivisions= |
| } | ||
| valnegativeDivisor= isSigned&& int.lt(divisor, int.zero) | ||
| valabsDivisor=if (negativeDivisor) int.negate(divisor)else divisor |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
| valabsDivisor=if (negativeDivisor)int.negate(divisor)else divisor | |
| valabsDivisor= int.abs(divisor) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
No, that would be incorrect.int.abs would unconditionally treat negative numbers as such. But ifisSigned is false, "negative" numbers must be treated as positive, and therefore must not be given toabs! Which is whynegativeDivisor hasisSigned &&.
| val (isSigned, isLong)= opmatch { | ||
| caseBinaryOp.Int_/|BinaryOp.Int_%=> (true,false) | ||
| caseBinaryOp.Long_/|BinaryOp.Long_%=> (true,true) | ||
| caseBinaryOp.Int_unsigned_/|BinaryOp.Int_unsigned_%=> (false,false) | ||
| caseBinaryOp.Long_unsigned_/|BinaryOp.Long_unsigned_%=> (false,true) | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
| val (isSigned, isLong)= opmatch { | |
| caseBinaryOp.Int_/|BinaryOp.Int_%=> (true,false) | |
| caseBinaryOp.Long_/|BinaryOp.Long_%=> (true,true) | |
| caseBinaryOp.Int_unsigned_/|BinaryOp.Int_unsigned_%=> (false,false) | |
| caseBinaryOp.Long_unsigned_/|BinaryOp.Long_unsigned_%=> (false,true) | |
| } | |
| valisLong= int.bitSize==64 |
Then, with the suggestion below, you can entirely dropop from the arguments.
| for (k<-1 to63) | ||
| test(1L<< (64- k),0,0,1L<< k) | ||
| } | ||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Note to self: continue here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
The split destroyed this information. It was located below the magic tests, but above the shape tests and fuzz tests.
| test(0x2aaaaaaaaaaaaaabL,0,1,12L) | ||
| test(0xa3d70a3d70a3d70bL,1,4,25L) | ||
| test(0x20c49ba5e353f7cfL,0,4,125L) | ||
| test(0x346dc5d63886594bL,0,7,625L) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
TODO: Understand why add is here but not on HD.
| // ----------------------------------------------------------------- | ||
| // --- Testing the specific shape of trees for selected divisors --- | ||
| // ----------------------------------------------------------------- |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Could you provide some insight in the value these tests provide? It feels that the code they exercise (after unit tests of magic) is not so complicated to warrant an additional test layer before the fuzzer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
These tests notably ensures that the best path is taken for each category. For example, that we don't produce the generic path for powers of 2.
Also, I found them to be the most useful while writing the code. When something goes wrong (other than the magic number computations), these are the test that directly show you what went wrong and what to fix.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Note: Incomplete review (up to continue here comment).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Note: Incomplete review (up to continue here comment).
gzm0 commentedOct 19, 2025
Note: I've immediately re-requested a review from myself, so this PR shows up in my review list. |
048e31f to832b3dfCompare
sjrd left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Addressed comments so far, I believe.
| // ----------------------------------------------------------------- | ||
| // --- Testing the specific shape of trees for selected divisors --- | ||
| // ----------------------------------------------------------------- |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
These tests notably ensures that the best path is taken for each category. For example, that we don't produce the generic path for powers of 2.
Also, I found them to be the most useful while writing the code. When something goes wrong (other than the magic number computations), these are the test that directly show you what went wrong and what to fix.
| } | ||
| valnegativeDivisor= isSigned&& int.lt(divisor, int.zero) | ||
| valabsDivisor=if (negativeDivisor) int.negate(divisor)else divisor |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
No, that would be incorrect.int.abs would unconditionally treat negative numbers as such. But ifisSigned is false, "negative" numbers must be treated as positive, and therefore must not be given toabs! Which is whynegativeDivisor hasisSigned &&.
| test(0x2aaaaaab,0,1,12) | ||
| test(0x51eb851f,0,3,25) | ||
| test(0x10624dd3,0,3,125) | ||
| test(0x68db8bad,0,8,625) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
TBH I was surprised that they did not include them in HD. They are an integral part of the magic computation, IMO.
Following the techniques described in Hacker's Delight, Chapter 10.Benchmarking suggested that is this generally worth it, except fornon-powers of 2 for a) int operations on JS and b) long operationson Wasm.For the long divisions on JS, this yields a 6x speedup, despite thefact that the resulting code needs 16 (!) primitive intmultiplications.
832b3df to51166acCompare
Uh oh!
There was an error while loading.Please reload this page.
Following the techniques described in Hacker's Delight, Chapter 10.
Benchmarking suggested that is this generally worth it, except for non-powers of 2 for a) int operations on JS and b) long operations on Wasm.
For the long divisions on JS, this yields a 6x speedup, despite the fact that the resulting code needs 16 (!) primitive int multiplications.