Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

Use Eio for parallelizing some analysis commands#7840

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Open
zth wants to merge18 commits intomaster
base:master
Choose a base branch
Loading
fromanalysis-eio

Conversation

@zth
Copy link
Member

@zthzth commentedSep 4, 2025
edited
Loading

This introduces OCaml multicore (Eio) to theanalysis bin, and leverages it for 2 of the editor tooling commands:

  • rename
  • Find all references

Benching this on my local machine, in a repo of ~350k lines of ReScript and ~1800 files, both commands end up about 2.5x faster.

mediremi, fhammerschmidt, and cknitt reacted with rocket emoji
@zthzth requested a review fromCopilotSeptember 4, 2025 10:29

This comment was marked as outdated.

Copy link
Member

@nojafnojaf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Exciting stuff! Do you have any good resources on Eio?
I looked at it a while ago and didn't really find any beginner friendly stuff on it. Themain repo was an overwhelming read for me.

I hope we can also integrateeio-trace andopentelemetry at some point.

print_endline
(if allLocs=[]thenProtocol.null
else"[\n"^ (allLocs|>String.concat",\n")^"\n]")
Eio_main.run (funenv ->
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Would a future goal be to have thisEio_main at the main entry point level of analysis tool?

Copy link
MemberAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Yup! I moved it there now as well, but just for the relevant sub commands.


letforLocalStamp~full:{file; extra; package}stamp (tip: Tip.t)=
(* Single helper for parallel work distribution over a list.*)
letparallel_map~domain_mgr~items~f=
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

A little wild that this isn't somewhere part of the Eio library.

Copy link
MemberAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Maybe it is. This one sprung from trying to tune the exact use case in this PR a bit. If something better exists we can switch to that as we uncover more types of "loads" as we expand using this.

@pkg-pr-new
Copy link

pkg-pr-newbot commentedSep 5, 2025
edited
Loading

Open in StackBlitz

rescript

npm i https://pkg.pr.new/rescript-lang/rescript@7840

@rescript/darwin-arm64

npm i https://pkg.pr.new/rescript-lang/rescript/@rescript/darwin-arm64@7840

@rescript/darwin-x64

npm i https://pkg.pr.new/rescript-lang/rescript/@rescript/darwin-x64@7840

@rescript/linux-arm64

npm i https://pkg.pr.new/rescript-lang/rescript/@rescript/linux-arm64@7840

@rescript/linux-x64

npm i https://pkg.pr.new/rescript-lang/rescript/@rescript/linux-x64@7840

@rescript/runtime

npm i https://pkg.pr.new/rescript-lang/rescript/@rescript/runtime@7840

@rescript/win32-x64

npm i https://pkg.pr.new/rescript-lang/rescript/@rescript/win32-x64@7840

commit:4d74fcc

Comment on lines -61 to -66
# Verify that the compiler still builds with the oldest OCaml version we support.
-os:ubuntu-24.04
ocaml_compiler:ocaml-variants.4.14.2+options,ocaml-option-static
node-target:linux-x64
rust-target:x86_64-unknown-linux-musl

Copy link
MemberAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

This was discussed in Discord that it's OK for us to move to5.3.0+.

nojaf and cknitt reacted with hooray emoji
dune-project Outdated
(synopsis"ReScript compiler")
(depends
(ocaml
(>=5.3))))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Why do we need that here now?

Copy link
MemberAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Turns out we don't. Removed.

run:opam exec -- dune build --display quiet --profile static
run:|
arch=$(dpkg-architecture -qDEB_HOST_MULTIARCH)
C_INCLUDE_PATH="/usr/include:/usr/include/$arch" CPATH="/usr/include:/usr/include/$arch" opam exec -- dune build --display quiet --profile static
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Rather than duplicating these env var definitions, maybe we could have a single step that sets them in GITHUB_ENV?

Copy link
MemberAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Fixed.

@zthzth changed the title[WIP] Use Eio for parallelizing some analysis commandsUse Eio for parallelizing some analysis commandsSep 5, 2025
@zthzth marked this pull request as ready for reviewSeptember 5, 2025 10:35
Copy link
Contributor

CopilotAI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Pull Request Overview

This PR introduces OCaml multicore (Eio) parallelization to the ReScript analysis bin, targeting therename andfind all references commands for approximately 2.5x performance improvements on large codebases.

Key changes:

  • Updates OCaml version requirement from 4.14 to 5.3 across all packages
  • Adds Eio dependencies for multicore support
  • Implements parallel processing for reference finding operations
  • Adds thread-safe access to shared state using mutexes

Reviewed Changes

Copilot reviewed 13 out of 15 changed files in this pull request and generated 4 comments.

Show a summary per file
FileDescription
dune-projectUpdates OCaml version requirement and adds Eio dependencies
*.opam filesGenerated package files with updated OCaml version and Eio deps
analysis/src/duneAdds Eio library dependency
analysis/src/SharedTypes.mlAdds StateSync module with mutex for thread-safe access
analysis/src/References.mlImplements parallel_map function and updates reference finding to use multicore
analysis/src/ProcessCmt.mlAdds thread-safe caching with double-checked locking
analysis/src/Packages.mlUpdates package retrieval functions to use thread-safe access
analysis/src/Commands.mlUpdates command functions to accept Eio environment and use parallel operations
.github/workflows/ci.ymlUpdates CI to use OCaml 5.3 and removes 4.14 compatibility check
CHANGELOG.mdDocuments the new multicore feature

Tip: Customize your code reviews with copilot-instructions.md.Create the file orlearn how to get started.

Comment on lines 445 to 448
if doms<=1|| len< small_thresholdthen f items
else
let chunk_count= min doms lenin
let chunk_size= max1 ((len+ chunk_count-1)/ chunk_count)in
Copy link

CopilotAISep 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

The magic number10 forsmall_threshold should be extracted to a named constant or configuration parameter to improve maintainability and make the threshold adjustable.

Copilot uses AI. Check for mistakes.
if:runner.os == 'Linux'
uses:awalsh128/cache-apt-pkgs-action@v1.4.3
with:
# See https://github.com/ocaml/setup-ocaml/blob/b2105f9/packages/setup-ocaml/src/unix.ts#L9
Copy link

CopilotAISep 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

The addition oflinux-libc-dev dpkg-dev packages should include a comment explaining why these specific packages are needed for the OCaml 5.3/Eio build requirements.

Copilot uses AI. Check for mistakes.
Copy link
MemberAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Can you fix this Copilot?

Copy link
Member

@nojafnojaf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Looks good to me, I don't know enough abouteio, so I will let the others do the approval here.

| [_;"format"; path] ->
Printf.printf"\"%s\"" (Json.escape (Commands.format~path))
| [_;"test"; path] ->Commands.test~path
| [_;"test"; path] ->Eio_main.run (funenv ->Commands.test~env~path)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

This is used because reference tests require theenv now right?
The tests themselves are still sequential, right?

Copy link
MemberAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

The specific tests of commands that use Eio will still use it in the tests as well, but yeah, the tests themselves are sequential.

nojaf reacted with thumbs up emoji
@cknitt
Copy link
Member

Could you address Copilot's comments? Other than that, looks good to me! 👍

Copy link
Collaborator

@cristianoccristianoc left a comment
edited
Loading

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I thought it was all straightforward until I saw the locks.
Now the question:
why did you add locks and how do you know that we don't have deadlocks (locking too much) or data races (some locks are still missing)

Worth spending some time of this ahead of time before getting into bugs that are very hard to reproduce.

~pos:(int_of_string line, int_of_string col)
~debug
Eio_main.run (funenv ->
Commands.references~env~path
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

why does references need to take an env now and not before?

Copy link
MemberAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Because it's using Eio, and that requires the Eio env.

@zth
Copy link
MemberAuthor

zth commentedSep 6, 2025

I thought it was all straightforward until I saw the locks. Now the question: why did you add locks and how do you know that we don't have deadlocks (locking too much) or data races (some locks are still missing)

Worth spending some time of this ahead of time before getting into bugs that are very hard to reproduce.

There were races in accessing/writing to the cmt cache, that caused random failures of the command. I'll do another round of reasoning about the locks etc.

@cristianoc
Copy link
Collaborator

I thought it was all straightforward until I saw the locks. Now the question: why did you add locks and how do you know that we don't have deadlocks (locking too much) or data races (some locks are still missing)
Worth spending some time of this ahead of time before getting into bugs that are very hard to reproduce.

There were races in accessing/writing to the cmt cache, that caused random failures of the command. I'll do another round of reasoning about the locks etc.

How about making sure that transitively no mutable global state is accessed? Is that too hard to establish?

@zth
Copy link
MemberAuthor

zth commentedSep 6, 2025

How about making sure that transitively no mutable global state is accessed? Is that too hard to establish?

No that's probably a great idea. I'll give it a shot! It's great if we figure these things out now, because this type of Eio thing will hopefully be the foundation of quite a few things going forward.

@cristianoc
Copy link
Collaborator

How about making sure that transitively no mutable global state is accessed? Is that too hard to establish?

No that's probably a great idea. I'll give it a shot! It's great if we figure these things out now, because this type of Eio thing will hopefully be the foundation of quite a few things going forward.

So no need for locks and things would be much safer.
Though: one needs to be careful not to introduce a slowdown.

@zth
Copy link
MemberAuthor

zth commentedSep 6, 2025

@cristianoc4352708 removes mutexes in favor of domain local caches. No meaningful regression in perf from doing this. But can't help to wonder if there's a better way.

We'll of course continue improving on this as we integrate more features with the same functionality. This first version needs to be safe and clear/understandable though of course.

@@ -0,0 +1,7 @@
(* Helpers for domain-local caches*)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Any links to what a domain-local cache even is?

Copy link
MemberAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Domains in Eio are essentially threads but managed by Eio. So, a domain local cache is a cache that's valid just for a thread (domain).

Previously, everything was sequential and within a single thread. We can now use multiple threads (domains) for parallelizing work, but that means interacting with the cache isn't safe if it's shared between all threads (races etc). So, by doing domain local caches, we dodge that issue.

For the exact tasks in this PR (rename and find references), the cache doesn't matter that much I suspect. But since it was already in place, and it's used in a bunch of other places (where we don't use Eio yet, and maybe won't ever do), it made sense to make them domain specific.

This should be "backwards compatible" too from what I understand, in that the cache will work the same way as it did before for the non-Eio stuff.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Thanks a bunch for explaining this!

zth reacted with thumbs up emoji
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Domains in Eio are essentially threads but managed by Eio. So, a domain local cache is a cache that's valid just for a thread (domain).

Previously, everything was sequential and within a single thread. We can now use multiple threads (domains) for parallelizing work, but that means interacting with the cache isn't safe if it's shared between all threads (races etc). So, by doing domain local caches, we dodge that issue.

For the exact tasks in this PR (rename and find references), the cache doesn't matter that much I suspect. But since it was already in place, and it's used in a bunch of other places (where we don't use Eio yet, and maybe won't ever do), it made sense to make them domain specific.

This should be "backwards compatible" too from what I understand, in that the cache will work the same way as it did before for the non-Eio stuff.

Can you try something: create intentionally some shared mutable state and use domain local cache and check that they don't interfere.
Maybe in a file of only a few lines.

Sign up for freeto join this conversation on GitHub. Already have an account?Sign in to comment

Reviewers

@nojafnojafnojaf left review comments

@cristianoccristianoccristianoc left review comments

Copilot code reviewCopilotCopilot left review comments

@cknittcknittAwaiting requested review from cknitt

At least 1 approving review is required to merge this pull request.

Assignees

No one assigned

Labels

None yet

Projects

None yet

Milestone

No milestone

Development

Successfully merging this pull request may close these issues.

5 participants

@zth@cknitt@cristianoc@nojaf

[8]ページ先頭

©2009-2025 Movatter.jp