haskell/haskell-language-serverPublic

NotificationsYou must be signed in to change notification settings
Fork399
Star2.8k

Optimise module to filename#4600

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Jump to bottom

Draft

guibou wants to merge11 commits intomaster

base:master

Choose a base branch

fromoptimise_module_to_filename

Draft

Optimise module to filename#4600

guibou wants to merge11 commits intomasterfromoptimise_module_to_filename

Conversation

Copy link

Collaborator

guibou commentedMay 26, 2025

(only latests commit are relevant, the other are from other MR which should be merged before this work is merged. Or we can rebase)

This is related to#4598.

This changes the file to module associating logic done during dependency
graph building.

Before, each time a moduleFoo.Bar is found, HLS is testing inside all
the import path for the existence of a relevant fiel.. It means that for
i import paths andm modules to locate,m * n filesystem
operations are done. Note also that this involves a lot of complex
string concatenation primitive to build theFilePath.

A module is tested for eachimport for each of the file of the
project. We also test forboot files, doubling the number of test.

In#4598 we have a project with1100 modules, in more than 250 import
paths and we count more than17000import statments, resulting on
over 6 millions test for file existences. This project was blocking for
more than 3 minutes during HLS startup.

This commit changes the way this is computed:

At startup, aMap ModuleName FilePath (the real type is a bit more
involved for performance, multiples unit and boot files handling) is
built by scanning all the import paths for files representing the
different modules.
Directory scanning is efficient and if import path only contains
haskell module, this will never do more job that listing the files of
the project.
The lookup is now simplify aMap lookup.

The performance improvement is as follows:

The number of IO operation is dramatically reduced, from multiples
millions to a few recursive directories listing.
A lot of the boilerplate of converting path had be removed.
TODO: add an RTS stats before / after with number of allocations
On my project, the graph building time is reduced from a few minutes
to 3s.

Limitations:

How to rebuild theMap if the content of one directory change?
If one directory is filled with millions of files which are not of
interested, performance can be damaged. TODO: add a diagnostic during
this phase so the user can learn about this issue.

Code status:

Thelookup is not fully restored, especially it does not include the
handling of home unit as well as reexport.
The initialisation phase is cached inside aTVar stored as a top
level identifier usingunsafePerformIO. This is to be improved.

soulomoonand others added8 commits

May 20, 2025 02:56

compute deps fingerprint only

824c508

Avoid unnecessary recompilation due to -haddock

d9b168a

Due to unprincipled adding and removing the `-haddock` flag duringcompilation and recompilation checking, we were performing more workthan necessary.We avoid this by compiling everything with `-haddock` by default. Thisis safe nowadays, we have essentially been doing this for many releases,and know this is fine.For the occasion where we actually want to parse without the `-haddock`flag, we keep explicitly disabling it.We enable `-haddock` flag during session loading, since we alreadyperform a number of DynFlags tweaks.This behaviour is dependent on the `OptHaddockParse` opton, which can,currently, only be modified at compile-time.

Fix windows test failure

2606976

Refactor comments for clarity and update rule usage in needsCompilati…

4cc1c37

…onRule

Merge branch 'master' into wip/T4443

fd903c4

Refactor imports in Handlers.hs for clarity and organization

8afa778

WIP: THIS SHOLUD NOT BE MERGED

fcd0ef7

This work is based on these two commits which should either be mergedbefore this is merged, or it should be rebased.

chore: introduce ghc 9.12.2 as dev shell (bump nixpgks for 9.12.2)

321e12e

guibou force-pushed theoptimise_module_to_filename branch fromab06e51 toa11ff5bCompare

May 26, 2025 13:56

guibou mentioned this pull request

May 27, 2025

rawDependencyInformation bad performance and memory usage#4598

Open

guibou added3 commits

May 27, 2025 08:23

refactor: WIP. Module name to filepath optimisation

51ad657

This is related to#4598.This changes the file to module associating logic done during dependencygraph building.Before, each time a module `Foo.Bar` is found, HLS is testing inside allthe import path for the existence of a relevant fiel.. It means that for`i` import paths and `m` modules to locate, `m * n` filesystemoperations are done. Note also that this involves a lot of complexstring concatenation primitive to build the `FilePath`.A module is tested for each `import` for each of the file of theproject. We also test for `boot` files, doubling the number of test.In#4598 we have a project with `1100` modules, in more than 250 importpaths and we count more than `17000` `import` statments, resulting onover 6 millions test for file existences. This project was blocking formore than 3 minutes during HLS startup.This commit changes the way this is computed:- At startup, a `Map ModuleName FilePath` (the real type is a bit more  involved for performance, multiples unit and boot files handling) is  built by scanning all the import paths for files representing the  different modules.- Directory scanning is efficient and if import path only contains  haskell module, this will never do more job that listing the files of  the project.- The lookup is now simplify a `Map` lookup.The performance improvement is as follows:- The number of IO operation is dramatically reduced, from multiples  millions to a few recursive directories listing.- A lot of the boilerplate of converting path had be removed.- TODO: add an RTS stats before / after with number of allocations- On my project, the graph building time is reduced from a few minutes  to 3s.Limitations:- How to rebuild the `Map` if the content of one directory change?- If one directory is filled with millions of files which are not of  interested, performance can be damaged. TODO: add a diagnostic during  this phase so the user can learn about this issue.Code status:- The `lookup` is not fully restored, especially it does not include the  handling of home unit as well as reexport.- The initialisation phase is cached inside a `TVar` stored as a top  level identifier using `unsafePerformIO`. This is to be improved.A note about performanceMost users won't see the benefits of these change, but I think theyapply to everbody:- We are still doing 1 lookup per `import` per module. But the lookup  result is not multiples IO, so it should be faster by a large amount.- Most project only have 1 (or a few) import paths so won't benefit as  dramatically as me from this.TODO for allocations

chore: force master branch of hie-bios

c3c06ec

It contains a fix which dramatically increase performance when doingcradle initialisation.haskell/hie-bios#464

wip: remove the normalized target

2d8386c

It does not seem to be used and apparently had never been used (after aquick look in the repository diff).Ask m.pickering about this.

guibou force-pushed theoptimise_module_to_filename branch froma11ff5b to2d8386cCompare

May 27, 2025 04:47

Copy link

CollaboratorAuthor

guibou commentedMay 28, 2025

I'm running this code in production on our large codebase for the 3rd day.

a) My users are super happy, a few quotes:

(it was more than 5 minutes before ;)

Ok, that's all for the shameless plug ;)

I think I observed a few reload / performance issues which are not related and would definitely do some followup issues. I'm investigating (this is difficult to describe right now. Looks like HLS is not able to give diagnostic back if it is typechecking a file which depends on the file I just modified and if this typechecking takes time).

I also observed a surprising behavior, I need to investigate if that's linked to my MR or to something else:

I had started HLS in a context where one of the dependencies (from the package db) was version X (which actually exposes a reexported symbol. e.g.Database.Selda which reexportsText).
I then had changed my working copy to a version of my code which uses a new version ofDatabase.Selda, without the reexport ofText, and my code was updated toimport Data.Text (Text) everywhere it was previously using the reexport.
From my editor, Irestartedhls
Warnings started to appear in HLS about redundant import forimport Data.Text (Text) (which was true with the previous version of Selda). Note that the warning appeared on files which are NOT open in the editor, but HLS got notified about the change and reevaluated them.
Opening the file in the editor, the warning disappears.

It looks like a multi-home issue, see.#3237, which would make sense considering that I had done nothing to fix multiple home. But this is still rather surprising because HLS was supposed to restart and hence should have completely ignored the old selda version. Maybe it picked some information from the cache.

Copy link

Collaborator

soulomoon commentedMay 28, 2025•
edited
Loading

Maybe it picked some information from the cache

We handle FOI(File of interest) and none-FOI differently.
SeegetModIfaceRule. This might be causing what you see@guibou

Copy link

CollaboratorAuthor

guibou commentedMay 29, 2025

We handle FOI(File of interest) and none-FOI differently.
See getModIfaceRule. This might be causing what you see@guibou

@soulomoon Thank you. This is indeed interesting. I will confirm that the bug I'm observing is not linked to my change and open an issue. This is not a dramatic issue (e.g. it is not everyday that you restart the editor with breaking changes in the packages from the package database), but it is not convenient and can be confusing.

Copy link

CollaboratorAuthor

guibou commentedMay 31, 2025

I've found a bug, when looking for module in the import path, an exception is raised if the import path is empty.

Labels

None yet

3 participants

Movatterモバイル変換

Navigation Menu

Search code, repositories, users, issues, pull requests...

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

Optimise module to filename#4600

Are you sure you want to change the base?

Optimise module to filename#4600

Uh oh!

Conversation

guibou commentedMay 26, 2025

Uh oh!

guibou commentedMay 28, 2025

Uh oh!

soulomoon commentedMay 28, 2025•
edited
Loading

Uh oh!

Uh oh!

guibou commentedMay 29, 2025

Uh oh!

guibou commentedMay 31, 2025

Uh oh!

Uh oh!

Movatterモバイル変換

Uh oh!

Optimise module to filename#4600

Are you sure you want to change the base?

Optimise module to filename#4600

Uh oh!

Conversation

guibou commentedMay 26, 2025

Uh oh!

guibou commentedMay 28, 2025

Uh oh!

soulomoon commentedMay 28, 2025• editedLoading Uh oh!There was an error while loading.Please reload this page.

Uh oh!

Uh oh!

guibou commentedMay 29, 2025

Uh oh!

guibou commentedMay 31, 2025

Uh oh!

Uh oh!

soulomoon commentedMay 28, 2025•
edited
Loading