New auto-import infrastructure #2390
base: main
Conversation
Pull request overview
This PR introduces a completely redesigned auto-import collection infrastructure with improved parallelization and caching. The main changes include:
- Complete rewrite of auto-import data collection and caching infrastructure
- New granular bucket-based system for storing exports (separate for projects and node_modules)
- Pre-computed module specifiers for node_modules packages
- Highly parallelized collection using mock programs instead of full type checking
- Movement of auto-import code from `internal/ls` to `internal/ls/autoimport`
Reviewed changes
Copilot reviewed 186 out of 186 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| internal/tspath/extension.go | Exports previously private extension lists and improves the `ChangeAnyExtension` function |
| internal/testutil/projecttestutil/projecttestutil.go | Adds `SetupWithRealFS` and `ToPath` utilities for testing |
| internal/testutil/autoimporttestutil/fixtures.go | New comprehensive test fixtures for auto-import lifecycle testing |
| internal/project/snapshotfs.go | Adds `sourceFS` type for tracking file access during compilation |
| internal/project/snapshot.go | Integrates auto-import registry with snapshot lifecycle |
| internal/project/session.go | Adds auto-import preparation and warming logic |
| internal/project/autoimport.go | New host implementation for auto-import registry |
| internal/project/dirty/mapbuilder.go | New utility for copy-on-write map updates |
| internal/project/dirty/map.go | Adds `Replace` and `TryDelete` methods |
| internal/project/compilerhost.go | Refactors to use `sourceFS` for file tracking |
| internal/parser/references.go | Fixes ambient module name collection |
| internal/packagejson/*.go | Adds utility methods for iterating dependencies |
| internal/modulespecifiers/*.go | Extracts and exports utility functions, adds entrypoint processing |
| internal/module/*.go | Moves shared utilities and adds entrypoint resolution |
| internal/lsp/server.go | Adds auto-import request handler and logging |
| internal/lsp/lsproto/lsp_generated.go | Updates auto-import protocol structures |
| internal/ls/*.go | Major refactoring to use new auto-import infrastructure |
| internal/ls/autoimport/*.go | New auto-import package with registry, view, and fix generation |
jakebailey commented Dec 16, 2025
IIRC this isn't desirable because then you won't get auto imports in things like build scripts or tests. (I have not read the PR quite yet.)
andrewbranch commented Dec 16, 2025
I think it’s desirable, because you edit build scripts a lot less than you edit your app code! This is how it always worked in Strada. Almost every single one of our users has
jakebailey commented Dec 16, 2025
Ah, sure, I guess I'm not sure how I've been getting fairly good auto imports in say, Herebyfile |
```go
entrypointsMu.Unlock()
aliasResolver := createAliasResolver(packageName, packageEntrypoints.Entrypoints)
checker, _ := checker.NewChecker(aliasResolver)
```
How often does this happen? Creating checkers is quite expensive, technically, so I assume not often
The only thing expensive about creating checkers is global symbol merges, but part of the point of feeding it with the alias resolver is seeding the checker with only the package’s entrypoint files, so very very few globals if any (typically just ambient modules declared by the entrypoints). This is very fast. On my machine, creating fewer checkers than processors was measurably slower.
```go
}

func createCheckerPool(program checker.Program) (getChecker func() (*checker.Checker, func()), closePool func(), getCreatedCount func() int32) {
	maxSize := int32(runtime.GOMAXPROCS(0))
```
Can we limit this? Or do these actually do so little as to not matter? On my machines that's something like 16 or 20 checkers, which would normally scare me
Again, limiting this was measurably slower on my machine on projects I tried (including VS Code, which is a single large program)
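For reference, a minimal sketch of one way a pool with the signature shown above could be shaped: a channel of idle checkers created lazily up to GOMAXPROCS. This is an assumption about the internals, not the PR's actual implementation, and it leans on the repo's internal `checker` package plus `runtime` and `sync/atomic`.

```go
func createCheckerPool(program checker.Program) (getChecker func() (*checker.Checker, func()), closePool func(), getCreatedCount func() int32) {
	maxSize := int32(runtime.GOMAXPROCS(0))
	var created atomic.Int32
	idle := make(chan *checker.Checker, maxSize)

	getChecker = func() (*checker.Checker, func()) {
		var c *checker.Checker
		select {
		case c = <-idle: // reuse an idle checker if one is available
		default:
			if n := created.Add(1); n <= maxSize {
				c, _ = checker.NewChecker(program) // lazily create until the cap is reached
			} else {
				created.Add(-1)
				c = <-idle // cap reached: wait for a checker to be returned
			}
		}
		return c, func() { idle <- c } // the second return value releases the checker back to the pool
	}
	closePool = func() { /* drop references; idle checkers are garbage collected */ }
	getCreatedCount = func() int32 { return created.Load() }
	return getChecker, closePool, getCreatedCount
}
```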
```go
defer stop()
if err := s.Run(ctx); err != nil {
	fmt.Fprintln(os.Stderr, err)
```
I haven’t looked at #2336 yet, but this could be what’s missing to show a stack trace.
```go
}

func (c *Checker) errorNoModuleMemberSymbol(moduleSymbol *ast.Symbol, targetSymbol *ast.Symbol, node *ast.Node, name *ast.Node) {
	if c.compilerOptions.NoCheck.IsTrue() {
```
Auto-imports uses checkers for minimal alias resolution, and these code paths go way down a rabbit hole hitting the file system and using the node builder to generate errors. Added some targeted short circuits.
Most of the added failures differ in a way that I think is unobjectionable or even preferable. I mostly focused on not regressing, so I haven’t looked at all of the tests that were previously failing and still are. I think some follow-up to categorize/address all the remaining ones would be good.
```go
for condition, export := range exports.AsObject().Entries() {
	if excludeConditions != nil && excludeConditions.Has(condition) {
		continue
	}
	conditionAlwaysMatches := condition == "default" || condition == "types" || IsApplicableVersionedTypesKey(condition)
	newIncludeConditions := includeConditions
	if !conditionAlwaysMatches {
		newIncludeConditions = includeConditions.Clone()
		excludeConditions = excludeConditions.Clone()
		if newIncludeConditions == nil {
			newIncludeConditions = &collections.Set[string]{}
		}
		newIncludeConditions.Add(condition)
		for _, prevCondition := range prevConditions {
			if excludeConditions == nil {
				excludeConditions = &collections.Set[string]{}
			}
			excludeConditions.Add(prevCondition)
		}
	}
	prevConditions = append(prevConditions, condition)
	loadEntrypointsFromTargetExports(subpath, newIncludeConditions, excludeConditions, export)
}
```
I think this is neat: in order to build a node_modules cache that works for any file in any project, regardless of `customConditions` or the file’s resolution mode, we walk through every conditional export of a package, and alongside the file the entrypoint corresponds to, we record the conditions you would need to have to get there (as we descend into a conditions object) and the conditions you would need not to have (as we iterate past a previous condition in the same object). Then, when actually computing completions, we can iterate through these candidate entrypoints and compare the include/exclude sets against the reality for the project and file being queried (`ls/autoimport/specifiers.go`). Example:
{"exports": {".": {"node": {"custom":"./custom.d.ts","import":"./esm.d.mts","default":"./cjs.d.cts" } } }}- To offer auto-imports from
custom.d.ts, the importing file must havenodeandcustom - To offer auto-imports from
esm.d.mts, the importing file must havenodeandimportand mustnot havecustom - To offer auto-imports from
cjs.d.cts, the importing file must havenodeand mustnot havecustomorimport.
This is one of the strategy changes from Strada that lets us build a cache that can be used by more than one file.
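As a rough illustration, the matching step at completion time then amounts to comparing the recorded sets against the requesting file's effective conditions. All names below (`entrypointCandidate`, `conditionsMatch`) are hypothetical, not the actual types in `ls/autoimport/specifiers.go`:

```go
// Hypothetical sketch of the include/exclude comparison described above.
type entrypointCandidate struct {
	fileName          string
	includeConditions []string // conditions the importing file must have to reach this target
	excludeConditions []string // conditions the importing file must NOT have (an earlier sibling would win)
}

// conditionsMatch reports whether a candidate entrypoint is reachable for a file
// whose effective resolution conditions are given by activeConditions.
func conditionsMatch(c entrypointCandidate, activeConditions map[string]bool) bool {
	for _, cond := range c.includeConditions {
		if !activeConditions[cond] {
			return false
		}
	}
	for _, cond := range c.excludeConditions {
		if activeConditions[cond] {
			return false
		}
	}
	return true
}
```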
andrewbranch commented Dec 17, 2025 • edited
jakebailey commented Dec 18, 2025
andrewbranch commented Dec 18, 2025 • edited
I’ll check out the example in the screenshot. The good one is probably from discovering the symlink in node_modules, while the bad ones probably originated from the realpath file in the program. (I’m not sure why the one with `src` is duplicated though, that’s kinda weird.) Interestingly, I think the solution is to disable the newly lit up symlink cache in module specifier generation for program-sourced files 🙃
I forgot to mention this behavior difference I intentionally made because it’s pretty obscure and in the weeds. In completions, we typically try to deduplicate re-exports of the same symbol and just pick the best one, mostly based on module specifier. When two module specifiers tie in ranking, like
andrewbranch commented Dec 19, 2025
Hm, ok, yeah, symlinked monorepo packages are going to take more thought than I previously thought. I guess the good news is that the current failure mode is that you might get too many suggestions. The bad news is that the ones that come from node_modules might get stale as you edit your local packages, since we won't figure out from a watch event on the realpath that something stored in the node_modules bucket changed. Getting my thoughts out in writing since I may not have time to finish this before I'm gone for the holidays: the current way we include node_modules packages that would normally be filtered out by your package.json dependencies but are directly imported by your program files is this: while iterating over program files during project bucket building, if a file is in node_modules, we only include it if the node_modules bucket corresponding to that node_modules directory is not going to include it (we've already computed the set of dependency names for any node_modules buckets that are being built concurrently). If the node_modules bucket doesn't have that package in its dependencies, we just process the file as part of the project bucket. This has pros and cons:
I think it might be a better idea to invert the current logic: instead of precomputing the node_modules bucket dependencies so we can add extra node_modules files to project buckets, we should precompute the node_modules packages referenced by active programs and add those to the dependency list for those node_modules buckets. I think that will make everything else more consistent, and node_modules files will truly always be in node_modules buckets. The dependency computation is slightly more complicated, but there isn't this weird exception to the rule of where exports are indexed. Either way, when we go through the program files to find resolved node_modules packages, we need to check whether non-node_modules file names are actually the realpath of a node_modules package so we can know to let the node_modules buckets handle them. Additionally, when we handle a file change event for one of those files, we need to mark the node_modules bucket dirty, not the project bucket.
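A very rough sketch of that dirtying rule; every name here is hypothetical and this only restates the idea from the paragraph above, not code from the PR:

```go
// onFileChanged routes a watch event to the right bucket: if the changed path
// is the realpath of a symlinked node_modules package, the exports live in a
// node_modules bucket, so that bucket (not the owning project's) goes dirty.
func onFileChanged(
	changedPath string,
	realpathToNodeModulesDir map[string]string, // realpath -> node_modules dir that links to it
	markNodeModulesBucketDirty func(nodeModulesDir string),
	markProjectBucketDirty func(path string),
) {
	if nmDir, ok := realpathToNodeModulesDir[changedPath]; ok {
		markNodeModulesBucketDirty(nmDir)
		return
	}
	markProjectBucketDirty(changedPath)
}
```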


Uh oh!
There was an error while loading.Please reload this page.
This PR introduces a completely redesigned auto-import collection infrastructure. As we ported Strada's auto-import code to Corsa, two things became clear:
We wanted to address these issues before (or while) adding package.json-based dependency discovery, the main feature missing from the initial port of Strada's auto-imports code.
While the data collection backend has been completely redesigned, the fix generation code is mostly unchanged, although the files have been moved from `internal/ls` into `internal/ls/autoimport`.

I’m closing all existing bugs/crashes filed against this repo’s auto-imports implementation with this PR. Please comment if you have a bug that still repros the exact same way after this PR.
Strada's collection and cache infrastructure
The earliest version of auto-imports in TypeScript had no caching or separate collection phase. During every global completions request, we used the type checker to gather every export from every source file in the program. This was actually fairly fast, in part because we deferred computing the actual import module specifier until a completion item was highlighted. Eventually, we wanted to address two common complaints: (1) users wanted to see the import module specifier for every item in the completion list without having to highlight it first, and (2) we would often suggest imports from transitive dependencies, bloating the list with entries that should really never be imported. Computing module specifiers for every item in the list slowed things down enough that we implemented a caching layer to compensate. Since the cache included module specifiers and filtering based on what dependencies were accessible to the file requesting completions, much of the cache was specific to the requesting file, and needed to be rebuilt whenever the user started editing a different file.
Later, we added a very commonly requested feature: auto-imports from package.json dependencies. Since the data source for auto-imports was source files already in the program, a common complaint was that immediately after `npm install`ing a dependency, auto-imports from that package wouldn't show up. Users would have to manually write an import from the new package once, then auto-imports would start appearing, even in other files. To address this without completely overhauling all the existing infrastructure, which was designed to use a type checker to extract exports from source files in a program, we added a separate program and type checker “in the background” (scare quotes because we couldn’t actually do anything in a separate process; we just added it onto project loading) containing entrypoints from package.json dependencies that were absent from the main program. Then, the language service could basically just run its existing auto-import collection code twice: once for the main program, and once for the package.json dependencies program. Since this had the potential to add significant amounts of work both to project loading and completions requests, we added a conservative bail-out if more than ten entrypoints were going to be loaded into the background program. This bail-out strategy was a big and imprecise hammer, since those ten entrypoints could trigger an arbitrarily large or small amount of work through transitive imports that would eventually be filtered out.

In short, auto-imports was due for a rewrite even before typescript-go.
New infrastructure
The primary innovations of the new system are:
Granular buckets
The new infrastructure stores exports in separate buckets that are independent of each other and independent of the requesting file. The `autoimport.Registry` contains one bucket for each project owning the editor's open files, plus one bucket for each `node_modules` directory visible to the open files. Project buckets contain exports from the project's local, non-`node_modules` files. This means that when switching between open files, any buckets that are already populated can be reused. When a project updates, all `node_modules` buckets are unaffected; when an `npm install` happens, only the `node_modules` directories that were touched need to be updated.
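To make that shape concrete, here is a hypothetical sketch of the layout; the type and field names are illustrative, not the actual `autoimport.Registry` API:

```go
// exportInfo stands in for whatever per-export data a bucket stores
// (name, kind, source file, pre-computed module specifier, ...).
type exportInfo struct {
	Name            string
	FileName        string
	ModuleSpecifier string
}

type bucket struct {
	exports []exportInfo
}

// registry is an illustrative stand-in for autoimport.Registry.
type registry struct {
	// One bucket per project owning an open file, holding exports from the
	// project's local, non-node_modules files. A project update only
	// invalidates its own bucket.
	projectBuckets map[string]*bucket // keyed by the project's config file path
	// One bucket per node_modules directory visible to the open files. An
	// npm install only invalidates the directories it touched.
	nodeModulesBuckets map[string]*bucket // keyed by node_modules directory path
}
```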
Highly parallelized collection
Previously, collection of exports for package.json dependencies worked by creating a program containing all the entrypoints we care about, then using the type checker to extract exports from every source file in that program. The disadvantage of creating a program for this purpose is that we would resolve and parse transitive dependencies that have absolutely no impact on the exports we actually care about. In the new infrastructure, we instead create a mock program (`autoimport.aliasResolver`) for each dependency package, seeded with just the entrypoint files for that package. We then walk the exports of those entrypoints, and only use a checker when one of those exports is a re-export from another module. The result is that we only process exactly what we need, we limit our reliance on type checkers, and when we do need a type checker, the data backing it is much smaller, so we can create many checkers very cheaply.
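Sketched very roughly, the collection loop looks something like the following. Every identifier here (`packageEntrypoints`, `newAliasResolver`, `getChecker`, `record`, and the methods on the resolver) is an assumption used only to illustrate the strategy: one small mock program per package, syntactic export walking, and a checker only for re-exports.

```go
// collectNodeModulesExports illustrates the strategy only; all names are
// hypothetical and not the PR's actual API.
func collectNodeModulesExports(packageEntrypoints map[string][]string) {
	var wg sync.WaitGroup
	for pkgName, entrypoints := range packageEntrypoints {
		wg.Add(1)
		go func(pkgName string, entrypoints []string) {
			defer wg.Done()
			// Mock program seeded with only this package's entrypoint files,
			// so it never parses transitive dependencies.
			resolver := newAliasResolver(pkgName, entrypoints)
			for _, file := range resolver.SourceFiles() {
				for _, export := range file.Exports() {
					if export.IsReExport() {
						// Only re-exports need a checker, and it is backed by
						// a tiny program, so creating one is cheap.
						c, release := getChecker(resolver)
						record(pkgName, c.ResolveAlias(export))
						release()
					} else {
						record(pkgName, export)
					}
				}
			}
		}(pkgName, entrypoints)
	}
	wg.Wait()
}
```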
Pre-computed module specifiers for node_modules
Since the old infrastructure originally started from a program's source files, module specifier generation always went through the full process of “importing file name + destination file name => module specifier.” That process could be complicated for node_modules files, since the destination file could be the realpath of a symlinked package, or could have been a transitive import that’s not actually reachable from the importing file, or could be blocked by the package’s package.json `"exports"`. Without knowing how that file got into the program, there were a dozen edge cases that needed to be considered to reverse engineer the correct module specifier, or even to determine whether a valid one exists at all. With the new directory-based buckets and extraction only from dependency entrypoints, we actually know the module specifier for every entrypoint as we find it, so we just store it alongside the export information.
Limitations / observable differences compared to Strada
- Because exports are read from `sourceFile.Symbol.Exports` instead of `checker.GetExportsAndPropertiesOfModule`, properties of `export =` assignments are not collected. Many of these properties were filtered out in the old logic because they were highly suspicious anyway (e.g. class instance methods, properties of arrays). There is a special syntactic carve-out for JavaScript `module.exports = { foo, bar, ... }` patterns, which are still collected. This limitation did not cause any new test failures.
- `export declare const foo: SomeType` would previously appear as a function if `SomeType` declared call signatures, but now appears as a variable.
Performance
It's extremely hard to compare this to `main` or to Strada, because each has its own limitations that result in very different behavior, not just different timing characteristics. That said, here are some numbers on two example projects.
Signal-Desktop
VS Code
Follow-up work
- `completionItem/resolve` is surprisingly slow (not just for auto-imports), and may be missing info like JSDoc that was previously(?) there for auto-imports