Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

Optimise module to filename#4600

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Draft
guibou wants to merge14 commits intomaster
base:master
Choose a base branch
Loading
fromoptimise_module_to_filename
Draft

Conversation

@guibou
Copy link
Collaborator

(only latests commit are relevant, the other are from other MR which should be merged before this work is merged. Or we can rebase)

This is related to#4598.

This changes the file to module associating logic done during dependency
graph building.

Before, each time a moduleFoo.Bar is found, HLS is testing inside all
the import path for the existence of a relevant fiel.. It means that for
i import paths andm modules to locate,m * n filesystem
operations are done. Note also that this involves a lot of complex
string concatenation primitive to build theFilePath.

A module is tested for eachimport for each of the file of the
project. We also test forboot files, doubling the number of test.

In#4598 we have a project with1100 modules, in more than 250 import
paths and we count more than17000import statments, resulting on
over 6 millions test for file existences. This project was blocking for
more than 3 minutes during HLS startup.

This commit changes the way this is computed:

  • At startup, aMap ModuleName FilePath (the real type is a bit more
    involved for performance, multiples unit and boot files handling) is
    built by scanning all the import paths for files representing the
    different modules.
  • Directory scanning is efficient and if import path only contains
    haskell module, this will never do more job that listing the files of
    the project.
  • The lookup is now simplify aMap lookup.

The performance improvement is as follows:

  • The number of IO operation is dramatically reduced, from multiples
    millions to a few recursive directories listing.
  • A lot of the boilerplate of converting path had be removed.
  • TODO: add an RTS stats before / after with number of allocations
  • On my project, the graph building time is reduced from a few minutes
    to 3s.

Limitations:

  • How to rebuild theMap if the content of one directory change?
  • If one directory is filled with millions of files which are not of
    interested, performance can be damaged. TODO: add a diagnostic during
    this phase so the user can learn about this issue.

Code status:

  • Thelookup is not fully restored, especially it does not include the
    handling of home unit as well as reexport.
  • The initialisation phase is cached inside aTVar stored as a top
    level identifier usingunsafePerformIO. This is to be improved.

soulomoon reacted with rocket emoji
@guibou
Copy link
CollaboratorAuthor

I'm running this code in production on our large codebase for the 3rd day.

a) My users are super happy, a few quotes:

image

(it was more than 5 minutes before ;)

image

image

Ok, that's all for the shameless plug ;)

I think I observed a few reload / performance issues which are not related and would definitely do some followup issues. I'm investigating (this is difficult to describe right now. Looks like HLS is not able to give diagnostic back if it is typechecking a file which depends on the file I just modified and if this typechecking takes time).

I also observed a surprising behavior, I need to investigate if that's linked to my MR or to something else:

  • I had started HLS in a context where one of the dependencies (from the package db) was version X (which actually exposes a reexported symbol. e.g.Database.Selda which reexportsText).
  • I then had changed my working copy to a version of my code which uses a new version ofDatabase.Selda, without the reexport ofText, and my code was updated toimport Data.Text (Text) everywhere it was previously using the reexport.
  • From my editor, Irestartedhls
  • Warnings started to appear in HLS about redundant import forimport Data.Text (Text) (which was true with the previous version of Selda). Note that the warning appeared on files which are NOT open in the editor, but HLS got notified about the change and reevaluated them.
  • Opening the file in the editor, the warning disappears.

It looks like a multi-home issue, see.#3237, which would make sense considering that I had done nothing to fix multiple home. But this is still rather surprising because HLS was supposed to restart and hence should have completely ignored the old selda version. Maybe it picked some information from the cache.

@soulomoon
Copy link
Collaborator

soulomoon commentedMay 28, 2025
edited
Loading

Maybe it picked some information from the cache

We handle FOI(File of interest) and none-FOI differently.
SeegetModIfaceRule. This might be causing what you see@guibou

@guibou
Copy link
CollaboratorAuthor

We handle FOI(File of interest) and none-FOI differently.
See getModIfaceRule. This might be causing what you see@guibou

@soulomoon Thank you. This is indeed interesting. I will confirm that the bug I'm observing is not linked to my change and open an issue. This is not a dramatic issue (e.g. it is not everyday that you restart the editor with breaking changes in the packages from the package database), but it is not convenient and can be confusing.

@guibou
Copy link
CollaboratorAuthor

I've found a bug, when looking for module in the import path, an exception is raised if the import path is empty.

@guibouguibouforce-pushed theoptimise_module_to_filename branch from2d8386c tofdf3085CompareSeptember 21, 2025 11:59
@guibou
Copy link
CollaboratorAuthor

I've found a bug, when looking for module in the import path, an exception is raised if the import path is empty.

Fixed in latest commit.

@guibouguibouforce-pushed theoptimise_module_to_filename branch fromfdf3085 toc856c6aCompareOctober 25, 2025 19:50
@guibou
Copy link
CollaboratorAuthor

guibou commentedOct 25, 2025
edited
Loading

Improved the support for currently edit files but not yet saved to the disk. The implementation will be improved for performance but that's a good first draft.

This fixs a lot of unit test, so it is good.

I do have a few (around 20) tests faliling in ghcide-tests, but otherwise, most of the tests are ok.

Still todo:

  • remove theunsafePerformIO
  • Handle multi home units (some tests are failing, that's a good news)
  • Handle reexport (same, some tests are failing, that's a good news)
  • Handle reloading of the cache if files are added / removed

@guibouguibou added the performanceIssues about memory consumption, responsiveness, etc. labelNov 1, 2025
@guibou
Copy link
CollaboratorAuthor

Multiple-home-unit seems to be fixed now (maybe I've not pushed the MR), but the tests are fine. I have 3 failing tests related to-boot files and then tests will be ok.

I'll work on removing theunsafePerformIO cache (which breaks hlint test, this is great ;), maybe this cache is also responsible for the remainirg failing tests.

@guibouguibouforce-pushed theoptimise_module_to_filename branch fromc856c6a tob20c634CompareNovember 1, 2025 12:38
@guibou
Copy link
CollaboratorAuthor

Latest push should fix hlint and actually remove theunsafePerformIO: the cache is now stored in theShakeExtras.

I have 3 (2 are flaky) remaining issues with-boot tests:

    cyclic module dependency with hs-boot:                             FAIL (0.63s)      ghcide-test/exe/DiagnosticTests.hs:246:      Got diagnostics for Uri {getUri = "file:///tmp/nix-shell.1Fwv9H/hls-test-root/extra-dir-80836663907422/ModuleA.hs"} but only expected diagnostics for [NormalizedUri 5530269853045372639 "file:///tmp/nix-shell.1Fwv9H/hls-test-root/extra-dir-80836663907422/ModuleB.hs"] got [Diagnostic {_range = Range {_start = Position {_line = 1, _character = 22}, _end = Position {_line = 1, _character = 29}}, _severity = Just DiagnosticSeverity_Error, _code = Nothing, _codeDescription = Nothing, _source = Just "not found", _message = "Could not find module \8216ModuleB\8217.\nIt is not a module in the current program, or in any known package.", _tags = Nothing, _relatedInformation = Just [DiagnosticRelatedInformation {_location = Location {_uri = Uri {getUri = "file:///tmp/nix-shell.1Fwv9H/hls-test-root/extra-dir-80836663907422/ModuleA.hs"}, _range = Range {_start = Position {_line = 1, _character = 22}, _end = Position {_line = 1, _character = 29}}}, _message = "GetLocatedImports"}], _data_ = Nothing}]      Use -p '/cyclic module dependency with hs-boot/' to rerun this test only.    bidirectional module dependency with hs-boot:                      FAIL (0.63s)      ghcide-test/exe/DiagnosticTests.hs:269:      Got diagnostics for Uri {getUri = "file:///tmp/nix-shell.1Fwv9H/hls-test-root/extra-dir-80836663907424/ModuleA.hs"} but only expected diagnostics for [NormalizedUri 5396647782368786105 "file:///tmp/nix-shell.1Fwv9H/hls-test-root/extra-dir-80836663907424/ModuleB.hs"] got [Diagnostic {_range = Range {_start = Position {_line = 1, _character = 22}, _end = Position {_line = 1, _character = 29}}, _severity = Just DiagnosticSeverity_Error, _code = Nothing, _codeDescription = Nothing, _source = Just "not found", _message = "Could not find module \8216ModuleB\8217.\nIt is not a module in the current program, or in any known package.", _tags = Nothing, _relatedInformation = Just [DiagnosticRelatedInformation {_location = Location {_uri = Uri {getUri = "file:///tmp/nix-shell.1Fwv9H/hls-test-root/extra-dir-80836663907424/ModuleA.hs"}, _range = Range {_start = Position {_line = 1, _character = 22}, _end = Position {_line = 1, _character = 29}}}, _message = "GetLocatedImports"}], _data_ = Nothing}]      Use -p '/bidirectional module dependency with hs-boot/' to rerun this test only.    correct reference used with hs-boot:                               FAIL (0.84s)      ghcide-test/exe/DiagnosticTests.hs:295:      Got diagnostics for Uri {getUri = "file:///tmp/nix-shell.1Fwv9H/hls-test-root/extra-dir-80836663907426/ModuleB.hs"} but only expected diagnostics for [NormalizedUri 4327465349865231065 "file:///tmp/nix-shell.1Fwv9H/hls-test-root/extra-dir-80836663907426/ModuleC.hs"] got [Diagnostic {_range = Range {_start = Position {_line = 1, _character = 22}, _end = Position {_line = 1, _character = 29}}, _severity = Just DiagnosticSeverity_Error, _code = Nothing, _codeDescription = Nothing, _source = Just "not found", _message = "Could not find module \8216ModuleA\8217.\nIt is not a module in the current program, or in any known package.", _tags = Nothing, _relatedInformation = Just [DiagnosticRelatedInformation {_location = Location {_uri = Uri {getUri = "file:///tmp/nix-shell.1Fwv9H/hls-test-root/extra-dir-80836663907426/ModuleB.hs"}, _range = Range {_start = Position {_line = 1, _character = 22}, _end = Position {_line = 1, _character = 29}}}, _message = "GetLocatedImports"}], _data_ = Nothing}]      Use -p '/correct reference used with hs-boot/' to rerun this test only.      ```

@guibouguibouforce-pushed theoptimise_module_to_filename branch 2 times, most recently from98fc906 to2270264CompareNovember 2, 2025 10:04
@guibou
Copy link
CollaboratorAuthor

I will need a bit of help here.

I'm blocked with the following errors locally:

cabal run test:ghcide-tests -- -p "hs-boot"ghcide  diagnostics    cyclic module dependency with hs-boot:        OK (0.69s)    bidirectional module dependency with hs-boot: OK (0.64s)    correct reference used with hs-boot:          FAIL (0.99s)      ghcide-test/exe/DiagnosticTests.hs:295:      Got diagnostics for Uri {getUri = "file:///tmp/hls-test-root/extra-dir-35586222804939/ModuleB.hs"} but only expected diagnostics for [NormalizedUri 2644126075305381495 "file:///tmp/hls-test-root/extra-dir-35586222804939/ModuleC.hs"] got [Diagnostic {_range = Range {_start = Position {_line = 1, _character = 22}, _end = Position {_line = 1, _character = 29}}, _severity = Just DiagnosticSeverity_Error, _code = Nothing, _codeDescription = Nothing, _source = Just "not found", _message = "Could not find module \8216ModuleA\8217.\nIt is not a module in the current program, or in any known package.", _tags = Nothing, _relatedInformation = Just [DiagnosticRelatedInformation {_location = Location {_uri = Uri {getUri = "file:///tmp/hls-test-root/extra-dir-35586222804939/ModuleB.hs"}, _range = Range {_start = Position {_line = 1, _character = 22}, _end = Position {_line = 1, _character = 29}}}, _message = "GetLocatedImports"}], _data_ = Nothing}]      Use -p '/hs-boot/&&/correct reference used with hs-boot/' to rerun this test only.1 out of 3 tests failed (2.32s)

Note that out of the 3 tests related to-boot files, one seems to always fail, but 2 of them seems to be flaky.

In the context of this test,ModuleA cannot be found in the targetMap.

My understanding here is limited and especially, it is difficult for me to understand why my changes may have any impact on these tests. In such context, it is tempting to think that there is another problem such as a race condition / timing.

During my investigation, I found the code insessionOpts:

          pPrint ("sessionOpts", cfp, v)caseHM.lookup (toNormalizedFilePath' cfp) vofJust (opts, old_di)->do              deps_ok<- checkDependencyInfo old_di              pPrint ("deps_ok", deps_ok, old_di)ifnot deps_okthendo-- If the dependencies are out of date then clear both caches and start-- again.                  modifyVar_ fileToFlags (const (returnMap.empty))                  modifyVar_ filesMap (const (returnHM.empty))-- Keep the same name cache                  modifyVar_ hscEnvs (return.Map.adjust (const[]) hieYaml )                  pPrint ("consultCradle not deps_ok", cfp)                  consultCradle hieYaml cfpelsereturn (opts,Map.keys old_di)Nothing->do              pPrint ("consultCradle Nothing", cfp)              consultCradle hieYaml cfp

(thepPrint are my addition).

I observed that thepPrint("sessionOpts") is called multiple time for all the files in the "workspace" (for the test I care about,module{A, B}.hs[-boot].

However,consultCradle is only called once for anymoduleX.hs[-boot], e.g. if called forModuleA.hs, it won't be called formoduleA.hs-boot, or the opposite. Depending on for which "version" of the module it had been called, it will lead to a test success or failure.

It seems thatconstultCradle will populate thev map with both the.hs-boot and.hs when evaluating one module. For example, this is one trace (when I had a bit lesspPrint), additionnal comments with#

# session opts is first called for ModuleB.hs-boot, followed by a consultCradle( "sessionOpts", "/tmp/nix-shell.hE81CE/hls-test-root/extra-dir-80639723522145/ModuleB.hs-boot")( "consultCradle Nothing", "/tmp/nix-shell.hE81CE/hls-test-root/extra-dir-80639723522145/ModuleB.hs-boot")# Same for ModuleA.hs-boot( "sessionOpts", "/tmp/nix-shell.hE81CE/hls-test-root/extra-dir-80639723522145/ModuleA.hs-boot")( "consultCradle Nothing", "/tmp/nix-shell.hE81CE/hls-test-root/extra-dir-80639723522145/ModuleA.hs-boot")# Then for "ModuleA.hs"( "sessionOpts", "/tmp/nix-shell.hE81CE/hls-test-root/extra-dir-80639723522145/ModuleA.hs")# Which does not call "consultCradle" because deps seems to be ok( "deps_ok", True, fromList [])# Same for ModuleB.hs( "sessionOpts", "/tmp/nix-shell.hE81CE/hls-test-root/extra-dir-80639723522145/ModuleB.hs")( "deps_ok", True, fromList [])# Then the hs-boot files are called again due to the cyclic dependency...( "sessionOpts", "/tmp/nix-shell.hE81CE/hls-test-root/extra-dir-80639723522145/ModuleB.hs-boot")( "deps_ok", True, fromList [])( "sessionOpts", "/tmp/nix-shell.hE81CE/hls-test-root/extra-dir-80639723522145/ModuleB.hs")( "deps_ok", True, fromList [])( "sessionOpts", "/tmp/nix-shell.hE81CE/hls-test-root/extra-dir-80639723522145/ModuleA.hs-boot")( "deps_ok", True, fromList [])( "sessionOpts", "/tmp/nix-shell.hE81CE/hls-test-root/extra-dir-80639723522145/ModuleA.hs")( "deps_ok", True, fromList [])( "sessionOpts", "/tmp/nix-shell.hE81CE/hls-test-root/extra-dir-80639723522145/ModuleB.hs-boot")( "deps_ok", True, fromList [])( "sessionOpts", "/tmp/nix-shell.hE81CE/hls-test-root/extra-dir-80639723522145/ModuleA.hs")( "deps_ok", True, fromList [])( "sessionOpts", "/tmp/nix-shell.hE81CE/hls-test-root/extra-dir-80639723522145/ModuleA.hs-boot")( "deps_ok", True, fromList [])( "sessionOpts", "/tmp/nix-shell.hE81CE/hls-test-root/extra-dir-80639723522145/ModuleB.hs")( "deps_ok", True, fromList [])

The result of this is that when callingextendKnownTargets, only twice (for eachModuleX), it calls them with differenttargetTarget. See two trace here:

    [ TargetDetails        { targetTarget = TargetFile NormalizedFilePath "/tmp/nix-shell.hE81CE/hls-test-root/extra-dir-73114015013655/ModuleA.hs-boot"        , targetEnv =            ( []            , Just HscEnvEq 10            )        , targetDepends = fromList []        , targetLocations =            [ NormalizedFilePath "/tmp/nix-shell.hE81CE/hls-test-root/extra-dir-73114015013655/ModuleA.hs-boot"            , NormalizedFilePath "/tmp/nix-shell.hE81CE/hls-test-root/extra-dir-73114015013655/ModuleA.hs"            ]        }    , TargetDetails        { targetTarget = TargetFile NormalizedFilePath "/tmp/nix-shell.hE81CE/hls-test-root/extra-dir-73114015013655/ModuleB.hs-boot"        , targetEnv =            ( []            , Just HscEnvEq 10            )        , targetDepends = fromList []        , targetLocations =            [ NormalizedFilePath "/tmp/nix-shell.hE81CE/hls-test-root/extra-dir-73114015013655/ModuleB.hs-boot"            , NormalizedFilePath "/tmp/nix-shell.hE81CE/hls-test-root/extra-dir-73114015013655/ModuleB.hs"            ]        }    ]

And

  [ TargetDetails        { targetTarget = TargetFile NormalizedFilePath "/tmp/nix-shell.hE81CE/hls-test-root/extra-dir-911475559555/ModuleB.hs"        , targetEnv =            ( []            , Just HscEnvEq 10            )        , targetDepends = fromList []        , targetLocations =            [ NormalizedFilePath "/tmp/nix-shell.hE81CE/hls-test-root/extra-dir-911475559555/ModuleB.hs"            , NormalizedFilePath "/tmp/nix-shell.hE81CE/hls-test-root/extra-dir-911475559555/ModuleB.hs-boot"            ]        }    , TargetDetails        { targetTarget = TargetFile NormalizedFilePath "/tmp/nix-shell.hE81CE/hls-test-root/extra-dir-911475559555/ModuleA.hs"        , targetEnv =            ( []            , Just HscEnvEq 10            )        , targetDepends = fromList []        , targetLocations =            [ NormalizedFilePath "/tmp/nix-shell.hE81CE/hls-test-root/extra-dir-911475559555/ModuleA.hs"            , NormalizedFilePath "/tmp/nix-shell.hE81CE/hls-test-root/extra-dir-911475559555/ModuleA.hs-boot"            ]        }    ]
    [ TargetDetails        { targetTarget = TargetFile NormalizedFilePath "/tmp/nix-shell.hE81CE/hls-test-root/extra-dir-26070923945755/ModuleB.hs-boot"        , targetEnv =            ( []            , Just HscEnvEq 10            )        , targetDepends = fromList []        , targetLocations =            [ NormalizedFilePath "/tmp/nix-shell.hE81CE/hls-test-root/extra-dir-26070923945755/ModuleB.hs-boot"            , NormalizedFilePath "/tmp/nix-shell.hE81CE/hls-test-root/extra-dir-26070923945755/ModuleB.hs"            ]        }    , TargetDetails        { targetTarget = TargetFile NormalizedFilePath "/tmp/nix-shell.hE81CE/hls-test-root/extra-dir-26070923945755/ModuleA.hs"        , targetEnv =            ( []            , Just HscEnvEq 10            )        , targetDepends = fromList []        , targetLocations =            [ NormalizedFilePath "/tmp/nix-shell.hE81CE/hls-test-root/extra-dir-26070923945755/ModuleA.hs"            , NormalizedFilePath "/tmp/nix-shell.hE81CE/hls-test-root/extra-dir-26070923945755/ModuleA.hs-boot"            ]        }    ]    ```        So how each time there is only two entries for each Module, but the `targetTarget` contains either the `.hs` or the `.hs-boot` file, but `targetLocations` always contain boths.        Later, in `extendKnownTargets`:        ```    let extendKnownTargets newTargets = do          let checkAndLogFile path = do                 exists' <- IO.doesFileExist (fromNormalizedFilePath path)                 pure $ exists'          knownTargets <- concatForM  newTargets $ \TargetDetails{..} ->            case targetTarget of              TargetFile f -> do                -- If a target file has multiple possible locations, then we                -- assume they are all separate file targets.                -- This happens with '.hs-boot' files if they are in the root directory of the project.                -- GHC reports options such as '-i. A' as 'TargetFile A.hs' instead of 'TargetModule A'.                -- In 'fromTargetId', we dutifully look for '.hs-boot' files and add them to the                -- targetLocations of the TargetDetails. Then we add everything to the 'knownTargetsVar'.                -- However, when we look for a 'Foo.hs-boot' file in 'FindImports.hs', we look for either                --                --  * TargetFile Foo.hs-boot                --  * TargetModule Foo                --                -- If we don't generate a TargetFile for each potential location, we will only have                -- 'TargetFile Foo.hs' in the 'knownTargetsVar', thus not find 'TargetFile Foo.hs-boot'                -- and also not find 'TargetModule Foo'.                fs <- filterM checkAndLogFile targetLocations                pure $ map (\fp -> (TargetFile fp, Set.singleton fp)) (nubOrd (f:fs))              TargetModule _ -> do                found <- filterM checkAndLogFile targetLocations                return [(targetTarget, Set.fromList found)]          hasUpdate <- atomically $ do            known <- readTVar knownTargetsVar            let known' = flip mapHashed known $ \k -> unionKnownTargets k (mkKnownTargets knownTargets)                hasUpdate = if known /= known' then Just (unhashed known') else Nothing            writeTVar knownTargetsVar known'            pure hasUpdate          for_ hasUpdate $ \x ->            logWith recorder Debug $ LogKnownFilesUpdated (targetMap x)          return $ toNoFileKey GetKnownTargets

For each targetfile, we will create aTargetFile associated with actually, 0 files because all files are filtered out by the file exist test.

I'll continue this investigation later.

soulomoon and fendor reacted with thumbs up emoji

@guibouguibouforce-pushed theoptimise_module_to_filename branch from2270264 tocce8f01CompareNovember 2, 2025 15:05
@guibouguibouforce-pushed theoptimise_module_to_filename branch fromcce8f01 tof817d89CompareNovember 3, 2025 05:12
@guibouguibouforce-pushed theoptimise_module_to_filename branch fromf817d89 to878bfdbCompareNovember 26, 2025 13:43
@guibou
Copy link
CollaboratorAuthor

@fendor

I've rebased and rerun the CI.

Latest run was showing only a segfault with GHC 9.10 on window and one failure in the call hierarchy plugin on window with ghc 9.6, which may hence be a flakyness.

The latest commit is a tentative fix for the issue described in#4600 (comment) and that I tried to describe and extract in#4754 (And really, I should answer the poor@soulomoon who tried to help me).

In order to be in a mergable state, I need to:

  • UnderstandRace condition in -boot file setup #4754 and see if my fix is "correct"
  • Handle reloading of the module cache when a file is added/removed. I still don't know how to handle that, but maybe it could be simply done "in case of failure": if we try to find the file associated with a module, but this file is not in the cache, instead of crashing, we could try to reload the cache one time.
  • Test for reexport. I don't think I have done anything for reexport, but all tests are OK. I'm wondering if reexport are tested
  • Finally, the most important: in order to merge the "KnownTarget" (which contains the not yet written on disk files) and the module cache, I currently do anO(nm) process which completely destroy the expected performances. However, the fix is simple: I can recompute the module cache each time theKnownTarget is updated

-- and also not find 'TargetModule Foo'.
fs<- filterM (IO.doesFileExist. fromNormalizedFilePath) targetLocations
pure$map (\fp-> (TargetFile fp,Set.singleton fp)) (nubOrd (f:fs))
pure$do
Copy link
CollaboratorAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

@fendor, this is the offending change.

WhenextendKnownTargets is called with files which are not yet available on disk, theTargetFile f contains a file (either.hs or.hs-boot) andtargetLocations contains a list of files and it seems that it always contains the.hs and.hs-boot.

However, these file do not exist on the disk, so thefilterM always filter them out.

This "fix" allows the file to be added.

The bug is certainly NOT here, see#4754 for more details.

My current understanding is that there is a caching process which, when a.hs or.hs-boot file is registered, it callsextendKnownTargets with both.hs and.hs-boot files listed intargetLocations, but only the.hs or.hs-boot inTargetFile f.

However, it seems that there is some sort of caching which once a.hs-boot file or.hs file is registered, it blocks registration for the other file.

It means that depending on the registration order, we may end up with only one or the other file in theknownTarget.

The result is that during module name to filename association (the target of this MR), we won't be able to find either the.hs or the.hs-boot file, because we look into theknownTarget map for them.

Note that I don't understand why it was not failing before, because the previous implementation was actually looking in theknownTarget too, and on the filesystem. But for files not on the filesystem (as it is the case for tests and newly created files), it seems that the previous implementation was behaving the same.

My guess is that, because it seems that it is a race, my change had an impact on the wayextendKnownTargets is called concurrently and as a result the problem appears more often. But I had not been able to confirm that the issue was there before my MR.

Copy link
Collaborator

@soulomoonsoulomoonNov 26, 2025
edited
Loading

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

If we want to extendKnownTargets, we should do so in therestart thread, see Note [Serializing runs in separate thread] . Otherwise we risk of destablizing the build system.
For the .hs or .hs-boot mess, one of the possible solution, we can update the sessionLoader to invalidate related cache and consult the cradle again if the missing .hs or .hs-boot is added.

@soulomoon
Copy link
Collaborator

Handle reloading of the module cache when a file is added/removed. I still don't know how to handle that, but maybe it could be simply done "in case of failure": if we try to find the file associated with a module, but this file is not in the cache, instead of crashing, we could try to reload the cache one time.

take a look atSMethod_TextDocumentDidOpen handler and setFileModified. You might be able to hook what you want there.

guibou reacted with thumbs up emoji

This is related to#4598.This changes the file to module associating logic done during dependencygraph building.Before, each time a module `Foo.Bar` is found, HLS is testing inside allthe import path for the existence of a relevant fiel.. It means that for`i` import paths and `m` modules to locate, `m * n` filesystemoperations are done. Note also that this involves a lot of complexstring concatenation primitive to build the `FilePath`.A module is tested for each `import` for each of the file of theproject. We also test for `boot` files, doubling the number of test.In#4598 we have a project with `1100` modules, in more than 250 importpaths and we count more than `17000` `import` statments, resulting onover 6 millions test for file existences. This project was blocking formore than 3 minutes during HLS startup.This commit changes the way this is computed:- At startup, a `Map ModuleName FilePath` (the real type is a bit more  involved for performance, multiples unit and boot files handling) is  built by scanning all the import paths for files representing the  different modules.- Directory scanning is efficient and if import path only contains  haskell module, this will never do more job that listing the files of  the project.- The lookup is now simplify a `Map` lookup.The performance improvement is as follows:- The number of IO operation is dramatically reduced, from multiples  millions to a few recursive directories listing.- A lot of the boilerplate of converting path had be removed.- TODO: add an RTS stats before / after with number of allocations- On my project, the graph building time is reduced from a few minutes  to 3s.Limitations:- How to rebuild the `Map` if the content of one directory change?- If one directory is filled with millions of files which are not of  interested, performance can be damaged. TODO: add a diagnostic during  this phase so the user can learn about this issue.Code status:- The `lookup` is not fully restored, especially it does not include the  handling of home unit as well as reexport.- The initialisation phase is cached inside a `TVar` stored as a top  level identifier using `unsafePerformIO`. This is to be improved.A note about performanceMost users won't see the benefits of these change, but I think theyapply to everbody:- We are still doing 1 lookup per `import` per module. But the lookup  result is not multiples IO, so it should be faster by a large amount.- Most project only have 1 (or a few) import paths so won't benefit as  dramatically as me from this.TODO for allocations
It does not seem to be used and apparently had never been used (after aquick look in the repository diff).Ask m.pickering about this.
The cache is now stored inside the `ShakeExtra`. I'm unsure that's theright location for it.This should fix hlint. Also, not having a cache stored as top levelentry may makes it not survive session reinit, which seams sane (forexample, during tests).Still 3 `-boot` related tests are failing. 2 are however flaky.
add all alternate file to knownFilesThis need to be investigated.The different `-boot` tests are flaky / failing in this MR because ofthis.WIP: remove debug statments
@guibouguibouforce-pushed theoptimise_module_to_filename branch from878bfdb to1c7b787CompareNovember 29, 2025 14:15
Sign up for freeto join this conversation on GitHub. Already have an account?Sign in to comment

Reviewers

@soulomoonsoulomoonsoulomoon left review comments

@berbermanberbermanAwaiting requested review from berbermanberberman will be requested when the pull request is marked ready for reviewberberman is a code owner

@michaelpjmichaelpjAwaiting requested review from michaelpjmichaelpj will be requested when the pull request is marked ready for reviewmichaelpj is a code owner

@wz1000wz1000Awaiting requested review from wz1000wz1000 will be requested when the pull request is marked ready for reviewwz1000 is a code owner

@fendorfendorAwaiting requested review from fendorfendor will be requested when the pull request is marked ready for reviewfendor is a code owner

At least 1 approving review is required to merge this pull request.

Assignees

No one assigned

Labels

performanceIssues about memory consumption, responsiveness, etc.

Projects

None yet

Milestone

No milestone

Development

Successfully merging this pull request may close these issues.

3 participants

@guibou@soulomoon

[8]ページ先頭

©2009-2025 Movatter.jp