Uh oh!
There was an error while loading.Please reload this page.
- Notifications
You must be signed in to change notification settings - Fork540
fix: Add globalPreload to ts-node/esm for node 20#2009
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.
Already on GitHub?Sign in to your account
base:main
Are you sure you want to change the base?
Uh oh!
There was an error while loading.Please reload this page.
Conversation
Not sure what kind of test should be added for this, since it just returns a string. Just comparing against a fixture feels kind of redundant? |
Hm, this definitely needs a bit more work, because itbreaks on node versions thatdon't have the off-thread loader hooks. (Ie, everything before v20.) I'm not sure the best idiomatic approach to that in this project, but I'm not sure how to detect the situation where it's required, other than sniffing the version. |
ljharb commentedMay 6, 2023
If it helps, in node 19 vs node 20, this file: exportfunctionglobalPreload(){console.log('globalPreload');return'';}exportfunctiongetGlobalPreloadCode(){console.log('getGlobalPreloadCode');return'';} logs, in node 19:
and in node 20:
… so, you could probably intercept console.log at module level, catch the experimental warning (and suppress it), and use it to indicate whether you were in 20+ or not? |
@ljharb ooh sneaky, that might work. I'll look into that. |
ssalbdivad commentedMay 6, 2023
Would this allow Eagerly awaiting the fix on this so we can support development on Node 20 🙏 |
Historically I use version sniffing for this stuff, so I'd go with that. It's happened several times before that ts-node needs to implement multiple behaviors depending on the version of node or ts. Something to keep in mind: In typechecking mode, is the typechecking work being repeated on both threads? Keeping in mind that typechecking one file involves parsing the others, so CJS files typecheck with type info from ESM files and vice-versa. Repeated typechecking work is not a dealbreaker, but it's something we should at least document in this thread. |
codecovbot commentedMay 6, 2023 • edited
Loading Uh oh!
There was an error while loading.Please reload this page.
edited
Uh oh!
There was an error while loading.Please reload this page.
Codecov Report
... and3 files with indirect coverage changes 📢 Have feedback on the report?Share it here. |
isaacs commentedMay 7, 2023 • edited
Loading Uh oh!
There was an error while loading.Please reload this page.
edited
Uh oh!
There was an error while loading.Please reload this page.
Added a commit to @cspotcode As far as I can tell, with this change, I haven't looked into type checking, but my guess is, if |
Got some time to dig into this. The issue with |
Aha, of course. The
The only way this can work on node 20 is for |
Ok, got Verified that typechecking doesnot get run twice in the presence of an off-thread loader, which in hindsight makes sense, since either the source is being loaded and transpiled once in the loader thread, or once in the main (only) thread, but never both. The whole point is that the @cspotcode PTAL when you get a chance :) |
Thanks, I haven't had a chance to take a look yet, but WRT the double-typechecking: |
I've been trying to throw some complicated scenarios at it, but I'm not sure how to go about triggering the situation you're thinking of here. The userland program isn't ever loaded or typechecked by TS in the loader thread. So if there's double-typechecking happening, it seems to me that it'd be either (a) the loading of ts files from within ts-node itself (or its deps), or (b) already a problem without a loader-thread (ie, if double-checking is happening in a single-threaded loader environment). Maybe there's something I'm missing? But it seems like this can't possibly make the problem significantly worse. If you have a test or example I can poke at, I'm happy to try to dig in further. |
Er, rather, it's notexecuted on the loader thread. Obviously the code isloaded in the loader thread. But the typechecking happens when it compiles it to JS, and that only happens in one place. What actually ends up in the main thread is JavaScript going straight to the node VM. |
cspotcode commentedMay 25, 2023 • edited
Loading Uh oh!
There was an error while loading.Please reload this page.
edited
Uh oh!
There was an error while loading.Please reload this page.
If double-typecheck is happening, the only visible side-effect would be higher CPU usage. So it's not a problem, per-se, it just means potentially a performance regression on node 20. If code on the main thread does The scenario I imagine is:
Where The loader thread's Then the main thread does |
cspotcode commentedMay 25, 2023 • edited
Loading Uh oh!
There was an error while loading.Please reload this page.
edited
Uh oh!
There was an error while loading.Please reload this page.
The scenario above could also happen if |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Looks good, thanks. I wanted to give a single, complete review after testing this locally, but I've been busy. These are the notes I have so far.
- We should move as much of the logic into
child-loader.ts
andesm.ts
, out of the.mjs
files which are meant to be shims as thin as possible. globalPreload
should be part of the hooks returned by ourcreateEsmHooks
API
esm.mjs Outdated
// Affordance for node 20, where load() happens in an isolated thread | ||
const offThreadLoader = versionGteLt(process.versions.node, '20.0.0'); | ||
export const globalPreload = () => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Should this hook be included in the return value ofhttps://typestrong.org/ts-node/api/index.html#createEsmHooks?
That API is meant for anyone wanting to wrap our loader in their own logic. They'd do--loader my-loader.mjs
with theirmy-loader.mjs
exporting all the functions they get from callingcreateEsmHooks()
, perhaps wrapping them to implement new behavior.
This API predates support for multiple loaders, but also it should be possible to compose loaders in code so that end-users don't have to pass--loader
twice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Everything is just as composable as it was, with the exception thatglobalPreload
is a bit weird to stack. (You have to append all the strings together wrapped in IIFE's so they don't clobber each others' vars.)
But multiple loaders have been supported longer thanglobalPreload
was needed, so that might not matter.
child-loader.mjs Outdated
const require = createRequire(fileURLToPath(import.meta.url)); | ||
// TODO why use require() here? I think we can just `import` | ||
/** @type {import('./dist/child-loader')} */ | ||
const childLoader = require('./dist/child/child-loader'); | ||
export const { resolve, load, getFormat, transformSource } = childLoader; | ||
// On node v20, we cannot lateBind the hooks from outside the loader thread | ||
// so it has to be done here. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Technically, can we lateBind by sending config through the MessagePort?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Yeah, that's effectively what's happening here. It's calling bootstrap to late-bind the hooks with the state passed in on the loader URL.
I suppose it could get that information via the message port, but if we have the config we need on the URL, I'm not sure what that gets us?
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
src/child/spawn-child.ts Outdated
const child = spawn( | ||
process.execPath, | ||
[ | ||
'--require', | ||
require.resolve('./child-require.js'), | ||
'--loader', | ||
// Node on Windows doesn't like `c:\` absolute paths here; must be `file:///c:/` | ||
pathToFileURL(require.resolve('../../child-loader.mjs')).toString(), | ||
loaderURL.toString(), | ||
require.resolve('./child-entrypoint.js'), | ||
`${argPrefix}${compress(state)}`, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Note to self: can we include state only once to reduce the risk we hitCommandLine
length limit on Windows?
Consider how this will look with forthcomingregister()
API. Ifregister()
will allow sending the state payload in-memory, akin to currentlateBind
logic, then consider using a pattern that looks similar today.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Yeah good call, I guess child-entrypoint should just pull it from the loader URL if it needs it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Ah, so the tricky bit here is that the child-entrypoint is modifying and re-stashing the value in the state's argv for child processes, so storing itonly in the loader url execArgv involves a bit more refactoring. Can definitely be done, but I'd recommend putting it off for a second commit (if not second PR) just to ensure edge cases are handled properly.
Thinking through the double-checking scenario, I think you're right, but I don't think there's much to be done about it. The good news is, once source-returning commonjs loaders land in node, and ts-node starts using that instead, then the problem goes away (along with several others!), since Cleaned up where the |
The ideal until node supports CJS-via-loaders is that we use the message port to delegate all compilation into the loader thread. But that requires more work and it's not necessary to get this merged. |
I was actually going to suggest something similar, having been in this code a little bit now. Using the Then at least on node versions from 14.17 up, you could use the same approach for all loaders, whether they're running in a separate thread or not. It does add a tiny bit of unnecessary serialization overhead if loaders are already on the main thread and can just call the function directly, but not much. If the intermediate child process spawned by |
Does
I'm not too worried about that. We should use the same approach as with the
When will the port be absent? My understanding is that it's always present w/off-thread loaders. Are there any non-EOLed node versions that don't have it?
This seems not a big deal. Main and worker threads are making a blocking call to |
isaacs commentedMay 31, 2023 • edited
Loading Uh oh!
There was an error while loading.Please reload this page.
edited
Uh oh!
There was an error while loading.Please reload this page.
It is always present with off-thread loaders. But it seems I was completely mistaken about this, and while diagnostics_channelis synchronous, it doesn't (as yet) cross over to the loader thread, so you do still need to proxy through the globalPreload context.port, at least until we get |
Technically speaking, it's not present on 16.0 through 16.11, which doesn't EOL until September 2023, but in those cases we don't bother to use a globalPreload, so it's not an issue. |
GeoffreyBooth commentedMay 31, 2023
It is synchronous:nodejs/node#46826 |
jlenon7 commentedMay 31, 2023
He is right, |
For the sake of anyone else reading along: A combination of The goal is to make blocking calls from the main thread to the loader thread, using So it is possible today for Sounds like the question is whether bootstrapping must be async. Do we have to |
GeoffreyBooth commentedJun 1, 2023
At least according to the docs in the PR, |
Updated to remove line number source map tests for repl and eval. (If you'd rather take on the refactoring to try to get it to work, that's fine, but imo that should be a separate PR at least.) Added
Would be good to add some tests for |
This removes support for keeping import assertions, which were broken inswc at some point, and unconditionally transpiled into importattributes. (Ie, `import/with` instead of `import/assert`.)No version of node supports import attributes with this syntax yet, soanyone using swc to import json in ESM is out of luck no matter what.And swc 1.3.83 broke the option that ts-node was using. The position ofthe swc project is that experimental features are not supported, and maychange in patch versions without warning, making them unsafe to rely on(as evidenced here, and the reason why this behavior changedunexpectedly in the first place).Better to just not use experimental swc features, and let it removeimport assertions rather than transpile them into something that nodecan't run.Fix:TypeStrong#2056
As of node v20, loader hooks are executed in a separate isolated threadenvironment. As a result, they are unable to register the`require.extensions` hooks in a way that would (in other node versions)make both CJS and ESM work as expected.By adding a `globalPreload` method, which *does* execute in the mainscript environment (but with very limited capabilities), these hooks canbe attached properly, and `--loader=ts-node/esm` will once again makeboth cjs and esm typescript programs work properly.
When running `ts-node --esm`, a child process is spawned with the`child-loader.mjs` loader, `dist/child/child-entrypoint.js` main,and `argv[2]` set to the base64 encoded compressed configurationpayload.`child-loader.mjs` imports and re-exports the functions definedin `src/child/child-loader.ts`. These are initially set to emptyloader hooks which call the next hook in line until they aredefined by calling `lateBindHooks()`.`child-entrypoint.ts` reads the config payload from argv, andbootstraps the registration process, which then calls`lateBindHooks()`.Presumably, the reason for this hand-off is because `--loader`hooks do not have access to `process.argv`. Unfortunately, innode 20, they don't have access to anything else, either, socalling `lateBindHooks` is effectively a no-op; the`child-loader.ts` where the hooks end up getting bound is not thesame one that is being used as the actual loader.To solve this, the following changes are added:1. An `isLoaderThread` flag is added to the BootstrapState. If this flag is set, then no further processing is performed beyond binding the loader hooks.2. `callInChild` adds the config payload to _both_ the argv and the loader URL as a query param.3. In the `child-loader.mjs` loader, only on node v20 and higher, the config payload is read from `import.meta.url`, and `bootstrap` is called, setting the `isLoaderThread` flag.I'm not super enthusiastic about this implementation. Itdefinitely feels like there's a refactoring opportunity to cleanit up, as it adds some copypasta between child-entrypoint.ts andchild-loader.mjs. A further improvement would be to remove thelate-binding handoff complexity entirely, and _always_ pass theconfig payload on the loader URL rather than on process.argv.
When an error is thrown in the loader thread, it must be passed throughthe comms channel to be printed in the main thread.Node has some heuristics to try to reconstitute errors properly, butthey don't function very well if the error has a custom inspect method,or properties that are not compatible with JSON.stringify, so theTSErrors raised by the source transforms don't get printed in any sortof useful way.This catches those errors, and creates a new error that can go throughthe comms channel intact.Another possible approach would be to update the shape of the errorsraised by source transforms, but that would be a much more extensivechange with further reaching consequences.
Set the default `options.pretty` value based on stderr rather thanstdout, as this is where errors are printed.The loader thread does not get a process.stderr.isTTY set, because its"stderr" is actually a pipe. If `options.pretty` is not set explicitly,the GlobalPreload's `context.port` is used to send a message from themain thread indicating the state of stderr.isTTY.Adds `Service.setPrettyErrors` method to enable setting this value whenneeded.
The @cspotcode/source-map-support module does not function properly onNode 20, resulting in incorrect stack traces. Fortunately, the built-insource map support in Node is now quite reliable. This does have thefollowing (somewhat subtle) changes to error output:- When a call site is in a method defined within a constructor function, it picks up the function name *as well as* the type name and method name. So, in tests where a method is called and throws within the a function constructor, we see `Foo.Foo.bar` instead of `Foo.bar`.- Call sites displays show filenames instead of file URLs.- The call site display puts the `^` character under the `throw` rather than the construction point of the error object. This is closer to how normal un-transpiled JavaScript behaves, and thus somewhat preferrable, but isn't possible when all we have to go on is the Error stack property, so it is a change.I haven't been able to figure out why exactly, but the call sites appearto be somewhat different in the repl/eval contexts as a result of thischange. It almost seems like the @cspotcode/source-map-support wasapplying source maps to the vm-evaluated scripts, but I don't see howthat could be, and in fact, there's a comment in the code stating thatthat *isn't* the case. But the line number showing up in an Error.stackproperty is `1` prior to this change (matching the location in the TSsource) and is `2` afterwards (matching the location in the compiledJS).An argument could be made that specific line numbers are a bitmeaningless in a REPL anyway, and the best approach is to just makethose tests accept either result. One possible approach to providebuilt-in source map support for the repl would be to refactor the`appendCompileAndEvalInput` to diff and append the *input* TS, andcompile within the `runInContext` method. If the transpiled code wasprepended with `process.setSourceMapsEnabled(true);`, then Error stacksand call sites would be properly source mapped by Node internally.
This also adds a type for the loader hooks API v3, as globalPreload isscheduled for removal in node v21, at which point '--loader ts-node/esm'will no longer work, and '--import ts-node/import' will be the wayforward.
…g it in here to quiet the CI on other PRs
…g it in here to quiet the CI on other PRs
piotr-cz commentedAug 28, 2024
This fix (or at least |
As of node v20, loader hooks are executed in a separate isolated thread environment. As a result, they are unable to register the
require.extensions
hooks in a way that would (in other node versions) make both CJS and ESM work as expected.By adding a
globalPreload
method, whichdoes execute in the main script environment (but with very limited capabilities), these hooks can be attached properly, and--loader=ts-node/esm
will once again make both cjs and esm typescript programs work properly.