- Notifications
You must be signed in to change notification settings - Fork12.9k
Write path normalization without array allocations#60812
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.
Already on GitHub?Sign in to your account
Uh oh!
There was an error while loading.Please reload this page.
Conversation
@typescript-bot perf test this |
typescript-bot commentedDec 18, 2024 • edited
Loading Uh oh!
There was an error while loading.Please reload this page.
edited
Uh oh!
There was an error while loading.Please reload this page.
@@ -624,27 +624,103 @@ export function getNormalizedPathComponents(path: string, currentDirectory: stri | |||
} | |||
/** @internal */ | |||
export function getNormalizedAbsolutePath(fileName: string, currentDirectory: string | undefined): string { | |||
return getPathFromPathComponents(getNormalizedPathComponents(fileName, currentDirectory)); | |||
export function getNormalizedAbsolutePath(path: string, currentDirectory: string | undefined): string { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
I wonder ifcurrentDirectory
is almost always normalized.
@andrewbranch Here they are:tscComparison Report - baseline..pr
System info unknown Hosts
Scenarios
tsserverComparison Report - baseline..pr
System info unknown Hosts
Scenarios
startupComparison Report - baseline..pr
System info unknown Hosts
Scenarios
Developer Information: |
src/compiler/path.ts Outdated
const root = path.substring(0, rootLength); | ||
const normalizedRoot = root && normalizeSlashes(root); | ||
// `normalized` is only initialized once `path` is determined to be non-normalized | ||
let normalized = normalizedRoot === root ? undefined : normalizedRoot; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
You should just uselastIndexOf
to see ifaltDirectorySeparator
occurs. That way you can avoid creating a substring forroot
at all.
I’m puzzled by the perf results |
Changing the scenarios to have long paths that are already normalized shows the old |
@typescript-bot perf test |
typescript-bot commentedDec 19, 2024 • edited
Loading Uh oh!
There was an error while loading.Please reload this page.
edited
Uh oh!
There was an error while loading.Please reload this page.
src/compiler/path.ts Outdated
const sepIndex = path.indexOf(directorySeparator, index + 1); | ||
const altSepIndex = path.indexOf(altDirectorySeparator, index + 1); |
DanielRosenwasserDec 19, 2024 • edited
Loading Uh oh!
There was an error while loading.Please reload this page.
edited
Uh oh!
There was an error while loading.Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
This is quadratic because you have to find at least one of the other kinds of slashes over and over. You should just tight-loop on!isAnyDirectorySeparator
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
or usefindIndex(path, isAnyDirectorySeparator, index + 1)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
My only concern withfindIndex
is that it might not optimize well it passed in both arrays and strings. Maybe I'm superstitious.
src/compiler/path.ts Outdated
// At beginning of segment | ||
segmentStart = index; | ||
let ch = path.charCodeAt(index); | ||
while (isAnyDirectorySeparator(ch) && index + 1 < path.length) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Took me a bit to see why this works, but it's becauseindex
starts immediately after the root, or after the first separator following the prior segment. Figuring out if one of those is a backslash is handled below.
normalizedUpTo = index + 2; | ||
} | ||
} | ||
else if (normalized === undefined) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
I think that in this branch it istechnically possible for you to avoid extra substrings by adjustingnormalizedUpTo
instead of initializingnormalized
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
I'm not sure that it matters since you'd still need to allocate a substring when returning.
@andrewbranch Here they are:tscComparison Report - baseline..pr
System info unknown Hosts
Scenarios
tsserverComparison Report - baseline..pr
System info unknown Hosts
Scenarios
startupComparison Report - baseline..pr
System info unknown Hosts
Scenarios
Developer Information: |
// the root is `file://Users/` | ||
assert.strictEqual(ts.getNormalizedAbsolutePath("file://Users/matb/projects/san/../../../../../../typings/@epic/Core.d.ts", ""), "file://Users/typings/@epic/Core.d.ts"); | ||
// this is real | ||
assert.strictEqual(ts.getNormalizedAbsolutePath("file:///Users/matb/projects/san/../../../../../../typings/@epic/Core.d.ts", ""), "file:///typings/@epic/Core.d.ts"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
idk if we want to preserve specific examples in these, but that might be fine.
Maybe we should have tests onuntitled:
?
src/compiler/path.ts Outdated
return path; | ||
} | ||
// Some paths only require cleanup of `/./` or leading `./` | ||
const simplified = path.replace(/\/\.\//g, "/").replace(/^\.\//, ""); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
If we're already here, I'd also recommend
constsimplified=path.replace(/\/\.\//g,"/").replace(/^\.\//,""); | |
letsimplified=path.replace(/\/\.\//g,"/"); | |
if(simplified.startsWith("./"){ | |
simplified=simplified.slice(2); | |
} |
andrewbranchJan 9, 2025 • edited
Loading Uh oh!
There was an error while loading.Please reload this page.
edited
Uh oh!
There was an error while loading.Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
This is indeed a small but measurable improvement. Nice!
Uh oh!
There was an error while loading.Please reload this page.
normalized = path.substring(0, normalizedUpTo); | ||
} | ||
} | ||
else if (segmentLength === 2 && path.charCodeAt(index) === CharacterCodes.dot && path.charCodeAt(index + 1) === CharacterCodes.dot) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Minor nit: If the previoussegmentLength
was1
, but the segment was not.
, this performs an unnecessary recheck ofsegmentLength === 2
.
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
src/compiler/path.ts Outdated
const sepIndex = path.indexOf(directorySeparator, index + 1); | ||
const altSepIndex = path.indexOf(altDirectorySeparator, index + 1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
or usefindIndex(path, isAnyDirectorySeparator, index + 1)
andrewbranch commentedDec 20, 2024 • edited
Loading Uh oh!
There was an error while loading.Please reload this page.
edited
Uh oh!
There was an error while loading.Please reload this page.
From instrumenting the compilation of VS Code, I learned two things:
Optimizing I'm going to repeat the instrumentation for a TS Server program update and see if the ratios are similar. |
During a TS Server update,literally every call is a no-op.
|
@typescript-bot pack this |
typescript-bot commentedJan 8, 2025 • edited
Loading Uh oh!
There was an error while loading.Please reload this page.
edited
Uh oh!
There was an error while loading.Please reload this page.
Hey@DanielRosenwasser, something went wrong when looking for the build artifact. (You can check the log here). |
@typescript-bot pack this |
typescript-bot commentedJan 8, 2025 • edited
Loading Uh oh!
There was an error while loading.Please reload this page.
edited
Uh oh!
There was an error while loading.Please reload this page.
typescript-bot commentedJan 8, 2025 • edited
Loading Uh oh!
There was an error while loading.Please reload this page.
edited
Uh oh!
There was an error while loading.Please reload this page.
Hey@DanielRosenwasser, I've packed this intoan installable tgz. You can install it for testing by referencing it in your
and then running There is also a playgroundfor this build and annpm module you can use via |
andrewbranch commentedJan 9, 2025 • edited
Loading Uh oh!
There was an error while loading.Please reload this page.
edited
Uh oh!
There was an error while loading.Please reload this page.
|
After latest commit:
and just comparing the last commit to the one before it:
|
@typescript-bot perf test |
typescript-bot commentedJan 9, 2025 • edited
Loading Uh oh!
There was an error while loading.Please reload this page.
edited
Uh oh!
There was an error while loading.Please reload this page.
@andrewbranch Here they are:tscComparison Report - baseline..pr
System info unknown Hosts
Scenarios
tsserverComparison Report - baseline..pr
System info unknown Hosts
Scenarios
startupComparison Report - baseline..pr
System info unknown Hosts
Scenarios
Developer Information: |
path = normalizeSlashes(path); | ||
} | ||
const simpleNormalized = simpleNormalizePath(path); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
simpleNormalizePath
also callsnormalizeSlashes
on the first line of the function, which is redundant. I'd suggest you remove the line fromsimpleNormalizePath
and just ensure its other callers normalize slashes before calling.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Ah yeah, that was my intention, good catch!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
The benchmarks I shared already reflect that; I just made a copy/paste error updating this PR.
e973805
intomicrosoft:mainUh oh!
There was an error while loading.Please reload this page.
Thanks!! |
Sweet. Thanks! We noticed this in Deno in TS 5.7 and I was just about to investigate opening a PR to fix it. |
Uh oh!
There was an error while loading.Please reload this page.
Related:#60633
Alternative to#60755
Despite the good benchmarks, I didn’t see any measurable impact in real-world
updateGraphWorker
duration on my M2 Mac Mini. However, since that was a GC hot spot for@dbaeumer and#60755 improved things some, this may do even better for him.bench.js