Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

WiP Redesign the encoding of Longs in JavaScript.#5205

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Draft
sjrd wants to merge2 commits intoscala-js:main
base:main
Choose a base branch
Loading
fromsjrd:rt-long-expanded

Conversation

sjrd
Copy link
Member

@sjrdsjrd commentedJun 25, 2025
edited
Loading

Overall, we make the flat representation of longs as a pair of(lo, hi)the representation, at the Emitter level. There are no more instances ofRuntimeLong. The emitter flattens out all theLongs as follows:

  • A local variable of typelong becomes two local variables of typeint.
  • A field of typelong becomes two fields of typeint.
  • AnArray[Long] is stored as anInt32Array with twice as many elements, alternatinglo andhi words.
  • Method parameters of typelong are expanded astwo parameters of typeint.

For theresult of method parameters, there is a trick. That one is the most "debatable", in that I think there are contending alternatives that may be faster. When a method returns aLong, it stores thehi word in a global variable$resHi, then returns thelo word. At call site, we read back the$resHi global. We used a similar trick for non-inlined methods ofRuntimeLong, with avar resultHi field ofobject RuntimeLong. Now this is moved to the emitter to take care of.

All the methods ofRuntimeLong explicitly take "expanded" versions of their parameters:abs takes two parameters of typeInt;add takes 4. Shifts take 3 parameters of typeInt: thelo, thehi, and the shift amount. Theresult, however, is aLong.

In order to allow them toconstructLong results from their words, we introduce a magic methodRuntimeLong.pack(lo, hi), whose body is filled in by theDesugarer with a specialTransient(PackLong(lo, hi)). It cannot be the compiler, because we cannot serialize transients. And it cannot wait for the emitter, because the optimizer definitely wants to see thePackLongs to unpack them. An alternative would be to introduce a newBinaryOp, but I think that's worse because it bakes an implementation detail of the emitter into the IR. The fact that wecan even do this PR is a testament to the current abstraction level of our IR.

These changes significantly speed up even the SHA512 benchmark, even though it performs most computations on stack already anyway (the improvements must come from the arrays, in this case). I haven't measured benchmarks that extensively useLong fields yet, but I expect them to get dramatic speedups.

For the optimizer, this makes things a lot simpler. Instead of having special-cases forRuntimeLong everywhere, we basically have a uniquewithSplitLong method to deal with them. That method can split onePreTransform of typelong into twoPreTransforms of typeints, so that they can be given to the inlined methods ofRuntimeLong. We introduce a newLongPairReplacement forLocalDefs that aggregate a pair of(lo, hi)LocalDefs (typically the result of a splitPackLong). As a nice bonus, the IR checker now passes withRuntimeLong after the optimizer.


One big caveat for now: Closure breaks the new encoding, sometimes. I think it gets confused by the$resHi variable and evaluation order of function params. For example if we pass the result of aLong method to aLong parameter, we emit

x.foo(y.bar(1),$resHi);

Duringy.bar(1), the method modifies$resHi. It is then read right after to be passed as second argument tox.foo.

@sjrdsjrdforce-pushed thert-long-expanded branch 6 times, most recently from7161430 to2fe28e7CompareJune 28, 2025 11:09
@sjrd
Copy link
MemberAuthor

@gzm0 This is not yet ready for a code review. However, it is a good time to get your opinion on the overall new compilation scheme and architectural changes. WDYT?

@sjrdsjrdforce-pushed thert-long-expanded branch 2 times, most recently fromf35c5dd to3e51123CompareJune 28, 2025 15:12
@sjrd
Copy link
MemberAuthor

sjrd commentedJul 5, 2025

Some results from thebounce benchmark, where I have replaced the PRNG by aju.SplittableRandom. OurSplittableRandom was not specially crafted to optimize itsLong usages (unlikeju.Random), so it is probably characteristic of "everyday"Long manipulations.

ConfigBeforeAfter
default fastLink38.537.0
optimized semantics37.333.6
opt sems + minify37.333.4

"Before" is#5204, and "After" is this PR.

It's not dramatic, but it's a strong 10% performance improvement.

@gzm0
Copy link
Contributor

gzm0 commentedJul 9, 2025

My thoughts on this based on the PR description and a very brief look at the code:

  • We should definitely do this. Much cleaner from so many POVs (and my attempts to make the IR checker work with aPreTransCast went nowhere so far ;-) ).
  • I agree with the sentiment that the "pack" operation should remain linker internal. As an alternative to the desugarer, have you considered patching the IR ininjectedIRFiles of the Emitter?
  • I do not feel great about the$resHi return value solution. I suspect it's our least bad alternative, but some more evidence would be nice. Things to consider from the top of my head:
    • Fixed size 2 typed array
    • Destructuring assignment: IIUC a JS JIT could optimize this, but it feels unlikely: I'd expect the destructuring assignment to be desugared into a lower representation early in the JIT pipeline.

@sjrd
Copy link
MemberAuthor

sjrd commentedJul 9, 2025

  • I agree with the sentiment that the "pack" operation should remain linker internal. As an alternative to the desugarer, have you considered patching the IR ininjectedIRFiles of the Emitter?

I have considered it, and it's still in the cards. The only downside, AFAICT, is that theClassDefChecker must then let it pass, even in the "initial" configuration that is available to user space. They'd have to really work for it, though, so it's probably fine.

  • I do not feel great about the$resHi return value solution. I suspect it's our least bad alternative, but some more evidence would be nice. Things to consider from the top of my head:

    • Fixed size 2 typed array
    • Destructuring assignment: IIUC a JS JIT could optimize this, but it feels unlikely: I'd expect the destructuring assignment to be desugared into a lower representation early in the JIT pipeline.

These are two of the options I'm considering. We'll have to measure which is faster.$resHi was the easiest to implement, precisely because of the evaluation order trick that I suspect GCC is not happy about. All the other alternatives will require more work in the function emitter to forcibly unnest alllong expressions.

As a nice bonus, the IR checker now passes with `RuntimeLong`.
Sadly, that doesn't fix the GCC issue.It may have performance, especially in multi-module outputs, as wedon't need to mutate a foreign var.
Sign up for freeto join this conversation on GitHub. Already have an account?Sign in to comment
Reviewers
No reviews
Assignees
No one assigned
Labels
None yet
Projects
None yet
Milestone
No milestone
Development

Successfully merging this pull request may close these issues.

2 participants
@sjrd@gzm0

[8]ページ先頭

©2009-2025 Movatter.jp