- Notifications
You must be signed in to change notification settings - Fork396
Introduce NewLambda to synthesize instances of SAM types.#5003
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.
Already on GitHub?Sign in to your account
Uh oh!
There was an error while loading.Please reload this page.
Conversation
@gzm0 This is based on my experiments with trying to optimize the Wasm output. For everything else, I managed to make the Wasm output on average faster than Scala.js without touching the IR, or only in a way that also benefits JS (see#4998). For Scala functions, however, they remained desperately slow no matter what I tried. With these changes to the IR, overall Wasm reaches 0.85% the geomean run time wrt. JS -- so, 15% faster on average. When you get the chance, I'd like your broad opinion on this. Is this something we could commit to at this stage? Or should we only venture this far if and when Wasm gets more mature/battle-tested? There's obviously a catch-22 here: given the performance results, Scala.js-on-Wasm might not be production-viable without this change; but without production experience we might not implement it. That doesn't mean we couldn't wait at least a full release cycle and reports of Wasm beingcorrect, even if it is slow. |
fd2b383
toa98c757
CompareI have been thinking about this, and I think what is happening is actually the other way around: the Scala.js compiler is over optimizing for the JS backend. Closures have two separate uses ATM IIUC:
The compiler is making the decision to perform the second optimization. However, it seems to only be an optimization that works well for the JS backend. This makes sense, since it has native support for lambdas, whereas WASM doesn't. How would the world look, if we instead moved the Scala lambda optimization from the compiler to the JS backend? IIUC to decide whether we can emit a Scala X class as a closure:
It seems to me that this is quite similar to the inlineable init optimization. And it also seems it has the potential to allow for more optimization (n.b. post optimizer closure optimization or closure optimization for classes where only a single method is used). WDYT? |
Ah and I forgot: IIUC if we do this, then the WASM backend shouldn't need any adjustment at all, because "typed closures" are just classes at this point (and IIUC the runtime layout would be quite similar, except for the vtable/typedata pointer overhead). |
Additional 2 things I forgot:
|
Identifying all these conditions would be really tricky. Just However, we could go the other way: not creating a class at all, and instead have a One benefit is that we can then apply the best optimization also for SAM interfaces other than Scala functions (e.g., In terms of code size, even for Wasm the single class carrying a Typed Closure is very beneficial. So expanding full classes is not necessarily the best strategy. As far as I can tell, the best strategy would be a closure-carrying single class, with optimized dispatch that eagerly type-tests for that single class. So when calling a That is something we could do with an LMF-style opcode, which we would have a hard time reverse engineering from already expanded classes. |
95a3e59
tod7d256c
Compare1785a96
tof3e5d03
Compare2a7e46b
to58fe5f1
Compare@gzm0 I added a commit that introduces As the commit message says, the implementation on the linker side is butchered at the moment. The idea is to support discussion of the IR changes at the moment. WDYT? If we add generalization to SAM types, this would also improve JS: all the lambdas for |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
I think theNewLambda
tree looks very promising, both in terms of simplifying the compiler and in giving more semantic information to the linker.
What I'm wondering is whether it is worth keeping theTypedClosure
abstraction in the core IR: if we could only haveNewLambda
the complexity increase in the IR would be minimal. (if it helps the optimizer, we could still consider introducing typed closures as transients).
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
I don't think we can remove |
2e2513f
tob23b5e5
Compare
Ah, sorry, I wasn't clear. I understand that we need a It might mean that we have to fall back to a full class when we need bridges for SAMs, but maybe that's OK? |
Ah, I see. So something like caseclassNewLambda(descriptor:Descriptor,captureParams:List[ParamDef],params:List[ParamDef],body:Tree,captureValues:List[Tree]) (note that That can work from an IR-only point of view, indeed. The main issue is that it's not clear how the Currently it generates a field of a My hunch is that answering these questions will yield a design that is even more complicated than having |
Ah, I haven't looked that far. Are we sure we want to desugar the lambdas? It feels to me:
|
I'm pretty sure we want to desugar the lambdas. If we don't, then every analysis and optimization that relates to the class hierarchy analysis gets more complicated. Figuring out the analyzer's job, in particular in the presence of reflective proxy and/or default methods, gets really tricky. In the optimizer, a method call now needs to handle actual target methods as well as hidden methods of lambdas. Currently there are very few changes in the optimizer, and they're all somewhat duplicating for |
I see, yeah that makes sense. So maybe what could be an in-between option is to disallow (free-standing) TypedClosures trees and types in serialized IR? This is somewhat similar to records (whether to have a TypedClosure inside a NewLambda or flatten it is TBD). This way, we can use TypedClosures in the linker pipeline but do not unnecessarily increase the IR interface. |
That seems doable. It will require a bit more work in the javalib IR cleaner (it currently lowers Scala functions to the underlying typed closures, but it could be custom functional interfaces instead), but nothing too dramatic. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Partial review. Sharing because I think it contains stuff worth iterating on already.
I have fully reviewed up until the "TODO: Continue here" comment in Analyzer.scala.
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
linker/shared/src/main/scala/org/scalajs/linker/analyzer/Analyzer.scala OutdatedShow resolvedHide resolved
Uh oh!
There was an error while loading.Please reload this page.
linker/shared/src/main/scala/org/scalajs/linker/analyzer/Analyzer.scala OutdatedShow resolvedHide resolved
Uh oh!
There was an error while loading.Please reload this page.
linker/shared/src/main/scala/org/scalajs/linker/analyzer/Analyzer.scala OutdatedShow resolvedHide resolved
Uh oh!
There was an error while loading.Please reload this page.
linker/shared/src/main/scala/org/scalajs/linker/analyzer/Infos.scala OutdatedShow resolvedHide resolved
Uh oh!
There was an error while loading.Please reload this page.
linker/shared/src/main/scala/org/scalajs/linker/analyzer/Infos.scala OutdatedShow resolvedHide resolved
Uh oh!
There was an error while loading.Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Addressed comments inir/
andcompiler/
so far.
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
gzm0 commentedMar 2, 2025 • edited
Loading Uh oh!
There was an error while loading.Please reload this page.
edited
Uh oh!
There was an error while loading.Please reload this page.
I just realized that I do not understand at all, why we still need custom box processing in GenJSCode with this change. My understanding of the situation before this PR is that we use untyped closures to represent lambdas. As a result, we need to adjust boxes because the closure boundary requires boxed types (anys). However, with this change, we lift this limitation, so we should (?) be able to take the types scalac generates directly. As such, I would expect the necessary boxes to be already present. Just TBC: I do understand that boxing/unboxing can/will be required between the specific interface a method implements and its implementation, and potentially even between the I'll try to keep digging some more about this, but any help is appreciated. |
Yes, the JVM backend has to do a lot with that as well. Thebackend itself doesn't do much. The work is split between another phase ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Some replies + review of WASM (only nits there).
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
linker/shared/src/main/scala/org/scalajs/linker/analyzer/Infos.scala OutdatedShow resolvedHide resolved
Uh oh!
There was an error while loading.Please reload this page.
linker/shared/src/main/scala/org/scalajs/linker/backend/wasmemitter/FunctionEmitter.scala OutdatedShow resolvedHide resolved
Uh oh!
There was an error while loading.Please reload this page.
linker/shared/src/main/scala/org/scalajs/linker/backend/wasmemitter/FunctionEmitter.scalaShow resolvedHide resolved
Uh oh!
There was an error while loading.Please reload this page.
linker/shared/src/main/scala/org/scalajs/linker/backend/wasmemitter/TypeTransformer.scala OutdatedShow resolvedHide resolved
Uh oh!
There was an error while loading.Please reload this page.
linker/shared/src/main/scala/org/scalajs/linker/backend/wasmemitter/WasmContext.scalaShow resolvedHide resolved
Uh oh!
There was an error while loading.Please reload this page.
linker/shared/src/main/scala/org/scalajs/linker/backend/wasmemitter/WasmContext.scalaShow resolvedHide resolved
Uh oh!
There was an error while loading.Please reload this page.
linker/shared/src/main/scala/org/scalajs/linker/backend/wasmemitter/WasmContext.scala OutdatedShow resolvedHide resolved
Uh oh!
There was an error while loading.Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
I believed I have addressed all the comments so far.
The main question that is probably still under debate is the design ofSyntheticClassKind
.
linker/shared/src/main/scala/org/scalajs/linker/analyzer/Analyzer.scala OutdatedShow resolvedHide resolved
Uh oh!
There was an error while loading.Please reload this page.
linker/shared/src/main/scala/org/scalajs/linker/analyzer/Infos.scala OutdatedShow resolvedHide resolved
Uh oh!
There was an error while loading.Please reload this page.
linker/shared/src/main/scala/org/scalajs/linker/analyzer/Infos.scala OutdatedShow resolvedHide resolved
Uh oh!
There was an error while loading.Please reload this page.
linker/shared/src/main/scala/org/scalajs/linker/backend/wasmemitter/FunctionEmitter.scalaShow resolvedHide resolved
Uh oh!
There was an error while loading.Please reload this page.
linker/shared/src/main/scala/org/scalajs/linker/backend/wasmemitter/FunctionEmitter.scalaShow resolvedHide resolved
Uh oh!
There was an error while loading.Please reload this page.
linker/shared/src/main/scala/org/scalajs/linker/backend/wasmemitter/FunctionEmitter.scalaShow resolvedHide resolved
Uh oh!
There was an error while loading.Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Some rebuttals and minor comments I found.
It's a bit tricky to review ATM because of the rebase on the name refactoring (using commits for both review history and intended target history at the same time is not working too well with Github's comment tracking).
But I think we can nevertheless make progress on the overall design here.
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
linker/shared/src/main/scala/org/scalajs/linker/analyzer/Infos.scala OutdatedShow resolvedHide resolved
Uh oh!
There was an error while loading.Please reload this page.
linker/shared/src/main/scala/org/scalajs/linker/analyzer/Infos.scala OutdatedShow resolvedHide resolved
Uh oh!
There was an error while loading.Please reload this page.
linker/shared/src/main/scala/org/scalajs/linker/backend/wasmemitter/FunctionEmitter.scalaShow resolvedHide resolved
Uh oh!
There was an error while loading.Please reload this page.
linker/shared/src/main/scala/org/scalajs/linker/backend/wasmemitter/FunctionEmitter.scalaShow resolvedHide resolved
Uh oh!
There was an error while loading.Please reload this page.
linker/shared/src/main/scala/org/scalajs/linker/analyzer/Analyzer.scala OutdatedShow resolvedHide resolved
Uh oh!
There was an error while loading.Please reload this page.
Sorry. I expected#5136 to be mostly a no-brainer that would be quickly reviewed and merged 😅 |
Uh oh!
There was an error while loading.Please reload this page.
linker/shared/src/main/scala/org/scalajs/linker/backend/wasmemitter/FunctionEmitter.scalaShow resolvedHide resolved
Uh oh!
There was an error while loading.Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Review of LambdaSynthesizer: Either there is a big problem with transient type refs, or there is a big chunk I do not understand :) either way, probably worth iterating (and I'm taking a break anyways :P).
for (intf <- descriptor.interfaces) | ||
digestBuilder.updateUTF8String(intf.encoded) | ||
// FIXME This is not efficient |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Just a reminder. OK to not address in this PR IMO.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
linker/shared/src/main/scala/org/scalajs/linker/frontend/LambdaSynthesizer.scala OutdatedShow resolvedHide resolved
Uh oh!
There was an error while loading.Please reload this page.
val suffixBuilder = new java.lang.StringBuilder(".$$Lambda$") | ||
for (b <- digest) { | ||
val i = b & 0xff | ||
suffixBuilder.append(Character.forDigit(i >> 4, 16)).append(Character.forDigit(i & 0x0f, 16)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
FYI: I have considered whether we should factor out / share this withInternalModuleIDGenerator
and I agree with the (implicit or explicit) decision not to.
Uh oh!
There was an error while loading.Please reload this page.
linker/shared/src/main/scala/org/scalajs/linker/frontend/LambdaSynthesizer.scalaShow resolvedHide resolved
Uh oh!
There was an error while loading.Please reload this page.
def makeClassInfo(descriptor: NewLambda.Descriptor, className: ClassName): ClassInfo = { | ||
val methodInfos = Array.fill(MemberNamespace.Count)(Map.empty[MethodName, MethodInfo]) | ||
val fFieldName = FieldName(className, SimpleFieldName("f")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Nit: factor out the namef
to deduplicate withmakeClassDef
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Just some typos / optional things. LGTM to merge after squashing.
* but we might as well cache them together. | ||
*/ | ||
private val syntheticLambdaNamesCache = | ||
mutable.Map.empty[NewLambda.Descriptor, (ClassName, MethodName)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Note: I agree with the design decision that the memory leak here is acceptable.
* | ||
* Although `NewLambda` nodes themselves are desugared in the `Desugarer`, | ||
* the corresponding synthetic *classes* already have an existence after the | ||
* `BasedLinker`. They must, since they must participate in the CHA |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
* `BasedLinker`.They must, since they must participate in theCHA | |
* `BaseLinker`.They must, since they must participate in theCRA |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
CHA = Class Hierarchy Analysis.
What is CRA?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Seems we have no tests at all for the new nodes.
Maybe that's OK in the first iteration: We've been working on this PR for quite a while. IDK. I'll leave it up to you.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Similar comment here than on ClassDefChecker test.
Seems we have no new tests at all, but maybe that's OK for now, so we can get this PR out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Uh oh!
There was an error while loading.Please reload this page.
The `NewLambda` node creates an instance of an anonymous class froma `Descriptor` and a closure `fun`. The `Descriptor` specifies theshape of the anonymous class: a super class, a list of interfaces toimplement, and the name of a single non-constructor method to provide.The body of the method calls the `fun` closure.At link time, the Analyzer and BaseLinker synthesize a unique suchanonymous class per `Descriptor`. In practice, all the lambdas fora given target type share a common `Descriptor`. This is notably thecase for all the Scala functions of arity N.`NewLambda` replaces the need for special `AnonFunctionN` classes inthe library. Instead, classes of the right shape are synthesized atlink-time.The scheme can also be used for most LambdaMetaFactory-stylelambdas, although our `NewLambda` does not support bridgegeneration. In the common case where no bridges are necessary, wenow also generate a `NewLambda`. This generalizes the code sizeoptimization of having only one class per `Descriptor` to non-Scalafunctions.In order to truly support LMF-style lambdas, the closure `fun` musttake parameters that match the (erased) type in their superinterface.Previously, for Scala `FunctionN`, we knew by construction that theparameters and result types were always `any`, and so JS `Closure`swere good enough. Now, we need closures that can accept differenttypes. This is where typed closures come into play (see below).---When bridges are required, we still generate a custom class fromthe compiler backend. In that case, we statically inline the closurebody in the produced SAM implementation.We have to do this not to expose typed closures across method calls.Moreover, we need the better static types for the parameters to beable to inline the closures without too much hassle. So this change*has* to be done in lockstep with the rest of this commit.---A typed closure is a `Closure` that does not have any semantics forJS interop. This is stronger than `Char`, which is "merely" opaqueto JS. A `Char` can still be passed to JS and has a meaningful`toString()`. A typed closure *cannot* be passed to JS in any way.That is enforced by making their type *not* a subtype of `any`(like record types).Since a typed closure has no JS interop semantics, it is free tostrongly, statically type its parameters and result type.Additionally, we can freely choose its representation in the bestpossible way for the given target. On JS, that remains an arrowfunction. On Wasm, however, we represent it as a pair of`(capture data pointer, function pointer)`. This allows to compilethem in an efficient way that does not require going through a JSbridge closure. The latter has been shown to have a devastatingimpact on performance when a Scala function is used in a tightloop.The type of a typed closure is a `ClosureType`. It records itsparameter types and its result type. Closure types are non-variant:they are only subtypes of themselves. As mentioned, they are notsubtypes of `any`. They are however subtypes of `void` andsupertypes of `nothing`. Unfortunately, they must also be nullableto have a default value, so they have nullable and non-nullablealternatives.To call a typed closure, we introduce a dedicated application node`ApplyTypedClosure`. IR checking ensures that actual argumentsmatch the expected parameter types. The result type is directlyused as the type of the application.There are no changes to the source language. In particular, thereis no way to express typed closures or their types at the userlevel. They are only used for `NewLambda` nodes.In fact, typed closures and `ApplyTypedClosure`s are notfirst-class at the IR level. Before desugaring, typed closures areonly allowed as direct children of `NewLambda` nodes. Desugaringtransforms `NewLambda` nodes into `New`s of the synthesizedanonymous classes. At that point, the two typed closure nodesbecome first-class expression trees.---For Scala functions, these changes have no real impact on the JSoutput (only marginal naming differences). On Wasm, however, theymake Scala functions much, much faster. Before, a Scala function ina tight loop would cause a Wasm implementation to be, in the worstmeasured case, 20x slower than on JS. After these changes, similarbenchmarks become significantly faster on Wasm than on JS.
Thanks a lot@gzm0 for reviewing this deep PR over the months! |
fec4ae7
intoscala-js:mainUh oh!
There was an error while loading.Please reload this page.
Now JSArrayConstr forArray[T](...)Seq[T](...)List[T](...)is the problem
Now JSArrayConstr forArray[T](...)Seq[T](...)List[T](...)is the problem
Uh oh!
There was an error while loading.Please reload this page.
At this moment, this is a more a Request For Comments than a real PR. It is built on#4988 since the motivation is entirely about making the Wasm output fast instead of disastrously slow when Scala functions are involved.Was:Introduce typed closures.
Now the focus in more on
NewLambda
than typedClosure
nodes.The
NewLambda
node creates an instance of an anonymous class from aDescriptor
and a closurefun
. TheDescriptor
specifies the shape of the anonymous class: a super class, a list of interfaces to implement, and the name of a single non-constructor method to provide. The body of the method calls thefun
closure.At link time, the Analyzer and BaseLinker synthesize a unique such anonymous class per
Descriptor
. In practice, all the lambdas for a given target type share a commonDescriptor
. This is notably the case for all the Scala functions of arity N.NewLambda
replaces the need for specialAnonFunctionN
classes in the library. Instead, classes of the right shape are synthesized at link-time.The scheme can also be used for most
LambdaMetaFactory
-style lambdas, although ourNewLambda
does not support bridge generation. In the common case where no bridges are necessary, we now also generate aNewLambda
. This generalizes the code size optimization of having only one class perDescriptor
to non-Scala functions.In order to truly support LMF-style lambdas, the closure
fun
must take parameters that match the (erased) type in their superinterface. Previously, for ScalaFunctionN
, we knew by construction that the parameters and result types were alwaysany
, and so JSClosure
s were good enough. Now, we need closures that can accept different types. This is where typed closures come into play (see below).When bridges are required, we still generate a custom class from the compiler backend. In that case, we statically inline the closure body in the produced SAM implementation.
We have to do this not to expose typed closures across method calls. Moreover, we need the better static types for the parameters to be able to inline the closures without too much hassle. So this changehas to be done in lockstep with the rest of this commit.
A typed closure is a
Closure
that does not have any semantics for JS interop. This is stronger thanChar
, which is "merely" opaque to JS. AChar
can still be passed to JS and has a meaningfultoString()
. A typed closurecannot be passed to JS in any way. That is enforced by making their typenot a subtype ofany
(like record types).Since a typed closure has no JS interop semantics, it is free to strongly, statically type its parameters and result type.
Additionally, we can freely choose its representation in the best possible way for the given target. On JS, that remains an arrow function. On Wasm, however, we represent it as a pair of
(capture data pointer, function pointer)
. This allows to compile them in an efficient way that does not require going through a JS bridge closure. The latter has been shown to have a devastating impact on performance when a Scala function is used in a tight loop.The type of a typed closure is a
ClosureType
. It records its parameter types and its result type. Closure types are non-variant: they are only subtypes of themselves. As mentioned, they are not subtypes ofany
. They are however subtypes ofvoid
and supertypes ofnothing
. Unfortunately, they must also be nullable to have a default value, so they have nullable and non-nullable alternatives.To call a typed closure, we introduce a dedicated application node
ApplyTypedClosure
. IR checking ensures that actual arguments match the expected parameter types. The result type is directly used as the type of the application.There are no changes to the source language. In particular, there is no way to express typed closures or their types at the user level. They are only used for
NewLambda
nodes.In fact, typed closures and
ApplyTypedClosure
s are not first-class at the IR level. Before desugaring, typed closures are only allowed as direct children ofNewLambda
nodes. Desugaring transformsNewLambda
nodes intoNew
s of the synthesized anonymous classes. At that point, the two typed closure nodes become first-class expression trees.For Scala functions, these changes have no real impact on the JS output (only marginal naming differences). On Wasm, however, they make Scala functions much, much faster. Before, a Scala function in a tight loop would cause a Wasm implementation to be, in the worst measured case, 20x slower than on JS. After these changes, similar benchmarks become significantly faster on Wasm than on JS.