WiP Redesign the encoding of Longs in JavaScript. #5205
sjrd commented Jun 28, 2025
@gzm0 This is not yet ready for a code review. However, it is a good time to get your opinion on the overall new compilation scheme and architectural changes. WDYT?
sjrd commented Jul 5, 2025
Some results from the
"Before" is #5204, and "After" is this PR. It's not dramatic, but it's a strong 10% performance improvement.
gzm0 commented Jul 9, 2025
My thoughts on this based on the PR description and a very brief look at the code:
sjrd commented Jul 9, 2025
I have considered it, and it's still in the cards. The only downside, AFAICT, is that the
These are two of the options I'm considering. We'll have to measure which is faster.
gzm0 commented Aug 31, 2025
Just a thought I had: would it make sense to build this with a different backend encoding (maybe just an object with fields `lo`, `hi`), so we can benefit from the optimizer improvements without having to wait on GCC?
sjrd commented Aug 31, 2025
Yes, I think it would make sense. Paradoxically, that's harder to do than it looks, so I haven't done it yet.
As a nice bonus, the IR checker now passes with `RuntimeLong`.
Sadly, that doesn't fix the GCC issue. It may have performance benefits, especially in multi-module outputs, as we don't need to mutate a foreign var.
Overall, we make the flat representation of longs as a pair of `(lo, hi)` *the* representation, at the Emitter level. There are no more instances of `RuntimeLong`. The emitter flattens out all the `Long`s as follows:

- A local variable of type `long` becomes two local variables of type `int`.
- A field of type `long` becomes two fields of type `int`.
- An `Array[Long]` is stored as an `Int32Array` with twice as many elements, alternating `lo` and `hi` words.
- Method parameters of type `long` are expanded as two parameters of type `int`.

For the result of methods, there is a trick. That one is the most "debatable", in that I think there are contending alternatives that may be faster. When a method returns a `Long`, it stores the `hi` word in a global variable `$resHi`, then returns the `lo` word. At the call site, we read back the `$resHi` global. We used a similar trick for non-inlined methods of `RuntimeLong`, with a `var resultHi` field of `object RuntimeLong`. Now the emitter takes care of it.

All the methods of `RuntimeLong` explicitly take "expanded" versions of their parameters: `abs` takes two parameters of type `Int`; `add` takes 4. Shifts take 3 parameters of type `Int`: the `lo`, the `hi`, and the shift amount. The result, however, is a `Long`.

In order to allow them to construct `Long` results from their words, we introduce a magic method `RuntimeLong.pack(lo, hi)`, whose body is filled in by the `Desugarer` with a special `Transient(PackLong(lo, hi))`. It cannot be the compiler, because we cannot serialize transients. And it cannot wait for the emitter, because the optimizer definitely wants to see the `PackLong`s to unpack them. An alternative would be to introduce a new `BinaryOp`, but I think that's worse because it bakes an implementation detail of the emitter into the IR. The fact that we can even do this PR is a testament to the current abstraction level of our IR.

These changes significantly speed up even the SHA512 benchmark, even though it performs most computations on stack already anyway (the improvements must come from the arrays, in this case). I haven't measured benchmarks that extensively use `Long` fields yet, but I expect them to get dramatic speedups.

For the optimizer, this makes things a lot simpler. Instead of having special cases for `RuntimeLong` everywhere, we basically have a single `withSplitLong` method to deal with them. That method can split one `PreTransform` of type `long` into two `PreTransform`s of type `int`, so that they can be given to the inlined methods of `RuntimeLong`. We introduce a new `LongPairReplacement` for `LocalDef`s that aggregates a pair of `(lo, hi)` `LocalDef`s (typically the result of a split `PackLong`). As a nice bonus, the IR checker now passes with `RuntimeLong` after the optimizer.

One big caveat for now: Closure breaks the new encoding, sometimes. I think it gets confused by the `$resHi` variable and the evaluation order of function params. For example, if we pass the result of a `Long` method to a `Long` parameter, we emit a call to `x.foo` whose first argument is `y.bar(1)` and whose second argument is a read of `$resHi`. During `y.bar(1)`, the method modifies `$resHi`. It is then read right after to be passed as second argument to `x.foo`.
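
To make the scheme above concrete, here is a rough hand-written sketch of what the flattened encoding could look like in JavaScript. It is not the emitter's actual output: every name except `$resHi` (`addLong`, `longs`, and the bodies of `x.foo` and `y.bar`) is made up for illustration, and the call at the end is only a guess at the shape the caveat describes.

```javascript
// Illustrative sketch only: every name except $resHi is hypothetical,
// and this is not the emitter's actual output.

// Global side channel holding the high word of the last Long result.
let $resHi = 0;

// 64-bit addition on two (lo, hi) pairs of 32-bit ints.
// Returns the low word and stores the high word in $resHi.
function addLong(alo, ahi, blo, bhi) {
  const lo = (alo + blo) | 0;
  // The low words overflowed iff the unsigned sum wrapped below an operand.
  const carry = (lo >>> 0) < (alo >>> 0) ? 1 : 0;
  $resHi = (ahi + bhi + carry) | 0;
  return lo;
}

// Call site: take the low word from the return value, then read $resHi.
const sumLo = addLong(-1, 0, 1, 0); // 0x00000000FFFFFFFF + 1
const sumHi = $resHi;               // sumLo is 0, sumHi is 1

// An Array[Long] of length 3 stored as an Int32Array of length 6,
// alternating lo and hi words.
const longs = new Int32Array(2 * 3);
longs[2 * 1] = sumLo;     // lo word of element 1
longs[2 * 1 + 1] = sumHi; // hi word of element 1

// Assumed shape of the caveat: a Long result passed to a Long parameter.
const y = { bar(n) { $resHi = 7; return n; } }; // returns the Long (n, 7)
const x = { foo(lo, hi) { return [lo, hi]; } };
$resHi = 0;
const args = x.foo(y.bar(1), $resHi);
```

JavaScript evaluates call arguments left to right, so `y.bar(1)` runs before `$resHi` is read and `args` is `[1, 7]`. A tool that moved the read of `$resHi` ahead of the call would pass `0` as the high word instead, which is the kind of breakage the caveat is worried about.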