Parallel Powershell Sessions Lead to Corruption of the StartUpProfileData Cache #26528

Open
Labels
Issue-Bug (Issue has been identified as a bug in the product), Waiting - DotNetCore (waiting on a fix/change in .NET)
@JulianHangstoerferEnscape

Description

Steps to reproduce

We are using PowerShell Core as the execution shell for self-managed GitLab runners on macOS that don't use any sort of virtualization. Some of these machines can execute up to 5 jobs in parallel for very light workloads. For each job a non-interactive PowerShell session is launched, and it exits again once the job is complete. These parallel jobs are corrupting the ~/.cache/powershell/StartupProfileData-NonInteractive cache file, which then causes a myriad of issues like assembly load errors, segfaults and stack overflows until the cache is fixed. Deleting the file fixes the problem temporarily, but it eventually comes back.

I have managed to boil the issue down to a small PowerShell script that simulates the parallel nature of our GitLab runners. We do not run 64 jobs in parallel, but doing so hopefully speeds up the reproduction. This is not a perfect reproduction of the problem: running so many sessions in parallel all but guarantees that a corrupted cache is instantly overwritten by a different PowerShell session, which fixes the problem again. On our real runners it's likely that just 2 parallel PowerShell sessions clash with each other and leave the corrupted cache on the filesystem.

Running this on macOS can take between 1 and 300 tries for me to reproduce the problem.

$parallelJobs = 64
$iteration = 0
while ($true) {
    $iteration++
    Write-Host "Iteration $iteration"
    $results = 1..$parallelJobs | ForEach-Object -Parallel {
        # This parallel start of pwsh is the actual problem.
        $output = pwsh -NonInteractive -Command {
            # Simulate some work
            foreach ($item in 1..10) {
                [Math]::Sqrt($item) | Out-Null
            }
        } 2>&1
        return [PSCustomObject]@{
            Output   = $output
            ExitCode = $LASTEXITCODE
        }
    } -ThrottleLimit $parallelJobs

    # Print errors
    $errorResults = $results | Where-Object { $_.ExitCode -ne 0 }
    foreach ($result in $errorResults) {
        $exitCode = $result.ExitCode
        $output = $result.Output -join "`n"
        Write-Host
        if ($exitCode -eq 139) {
            Write-Host "Segmentation fault (139) in job. Output:`n$output"
        }
        elseif ($exitCode -eq 134) {
            Write-Host "Sigabort (134, cache corruption) in job. Output:`n$output"
        }
        else {
            Write-Host "Unknown error ($exitCode) in job. Output:`n$output"
        }
    }
}
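
For anyone who prefers to drive the same parallel launch from sh or bash rather than from an outer PowerShell session, a rough equivalent might look like the sketch below. `run_parallel` is a hypothetical helper name, and this variant has not been confirmed to trigger the corruption as readily as the script above.

```shell
#!/bin/sh
# Sketch only: start COUNT copies of a command at once and report every
# copy that ends with a non-zero exit status.
# run_parallel is a hypothetical helper, not part of PowerShell or GitLab.
# Usage: run_parallel COUNT COMMAND [ARGS...]
run_parallel() {
    count="$1"
    shift
    pids=""
    i=0
    while [ "$i" -lt "$count" ]; do
        "$@" >/dev/null 2>&1 &      # launch one copy in the background
        pids="$pids $!"
        i=$((i + 1))
    done

    failures=0
    for pid in $pids; do
        status=0
        wait "$pid" || status=$?    # collect the exit status of this copy
        if [ "$status" -ne 0 ]; then
            echo "pid $pid exited with status $status"
            failures=$((failures + 1))
        fi
    done
    [ "$failures" -eq 0 ]           # succeed only if every copy succeeded
}

# Example call (assumes pwsh is on PATH):
#   run_parallel 64 pwsh -NonInteractive -Command \
#       '1..10 | ForEach-Object { [Math]::Sqrt($_) } | Out-Null'
```

The loop relies on word splitting of `$pids`, so it should be run with sh or bash; zsh does not split unquoted variables by default.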

Workarounds

  1. Deleting the corrupt cache file ~/.cache/powershell/StartupProfileData-NonInteractive fixes the problem temporarily, until it becomes corrupt again.
  2. Setting $Env:PSModuleAnalysisCachePath or $Env:XDG_CACHE_HOME to something unique per parallel runner likely solves the problem by avoiding parallel access to the cache files.
  3. Making the cache file read-only prevents corruption but might not be appropriate for all use cases:
    chmod -w ~/.cache/powershell/StartupProfileData-NonInteractive
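
Workaround 2 could be sketched as a small wrapper that gives each concurrent job its own cache directory before pwsh starts. This assumes pwsh derives its cache location from XDG_CACHE_HOME at startup, which is why the variable is set in the launching shell rather than inside the session. `pwsh_cache_dir`, `pwsh_isolated` and `PWSH_CACHE_ROOT` are hypothetical names.

```shell
#!/bin/sh
# Sketch only: one private cache directory per concurrent job slot.
# Assumption: pwsh reads XDG_CACHE_HOME when it starts (workaround 2).
# PWSH_CACHE_ROOT is a hypothetical override for where the directories live.

# Create (if needed) and print the cache directory for one slot.
# $1: an identifier that is unique among jobs running at the same time.
pwsh_cache_dir() {
    slot="$1"
    root="${PWSH_CACHE_ROOT:-${TMPDIR:-/tmp}}"
    dir="${root%/}/pwsh-cache-$slot"
    mkdir -p "$dir" && printf '%s\n' "$dir"
}

# Run a non-interactive pwsh that only sees its own slot's cache directory.
# $1: slot identifier, remaining arguments are passed to pwsh unchanged.
pwsh_isolated() {
    slot_id="$1"
    shift
    cache="$(pwsh_cache_dir "$slot_id")" || return 1
    XDG_CACHE_HOME="$cache" pwsh -NonInteractive "$@"
}

# Example call (assumes pwsh is on PATH):
#   pwsh_isolated 3 -Command 'Write-Host "foo"'
```

A session started against an empty cache directory has to build its caches from scratch, so the first start per slot is probably slower than usual.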

Expected behavior

Parallel non-interactive powershell sessions can be executed without problems.

Actual behavior

Parallel non-interactive PowerShell sessions interfere with each other by corrupting the profile cache, which prevents new sessions from being started. It's not even possible to run pwsh -c Write-Host "foo" on the agent while the cache is corrupt.
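
Until the underlying bug is fixed, the "delete the cache" workaround could be automated along these lines. The sketch retries once when pwsh ends with one of the exit codes reported in this issue (134 or 139). `pwsh_with_recovery`, `pwsh_profile_cache` and `PWSH_BIN` are hypothetical names, and the cache path assumes the default location from this issue, or XDG_CACHE_HOME if it is set.

```shell
#!/bin/sh
# Sketch only: run pwsh, and if it dies with 134 or 139, delete the
# startup profile cache and try one more time.
# PWSH_BIN is a hypothetical override for the pwsh executable.
PWSH_BIN="${PWSH_BIN:-pwsh}"

# Print the path of the non-interactive startup profile cache.
pwsh_profile_cache() {
    printf '%s\n' "${XDG_CACHE_HOME:-$HOME/.cache}/powershell/StartupProfileData-NonInteractive"
}

# Run pwsh non-interactively with the given arguments, with one retry.
pwsh_with_recovery() {
    status=0
    "$PWSH_BIN" -NonInteractive "$@" || status=$?
    case "$status" in
        134|139)
            # Exit codes seen with a corrupt cache: remove it and retry once.
            rm -f "$(pwsh_profile_cache)"
            status=0
            "$PWSH_BIN" -NonInteractive "$@" || status=$?
            ;;
    esac
    return "$status"
}

# Example call (assumes pwsh is on PATH):
#   pwsh_with_recovery -Command 'Write-Host "foo"'
```

The retry re-runs the whole command, so it is only safe for commands that can be repeated. It also cannot tell a corrupt cache apart from another cause of the same exit code.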

Error details

Get-Error itself does not print anything, but there are multiple other errors.

  1. pwsh exits with 134 (SIGABRT). This is accompanied by an error like the one below, naming a random assembly whose name is usually corrupted. Usually the front or back of the string is truncated, or it shows random Unicode characters like PublicKeyToken=b03f5f7f11ᄚ.
    There is an attached crash report of this happening.
    # Output from the reproduction script
    Iteration 156
    Sigabort (134, cache corruption) in job. Output:
    An error has occurred that was not properly handled. Additional information is shown below. The PowerShell process will exit.
    Unhandled exception. System.IO.FileLoadException: The given assembly name was invalid.
    File name: 'System.Runtime.Numerics, Version=9.0.0.0, Culture=neutral, PublicKeyToken=b03f5f7f11ᅠ'
       at System.Reflection.AssemblyNameParser.Parse(ReadOnlySpan`1 name)
       at System.Reflection.AssemblyName.ParseAsAssemblySpec(Char* pAssemblyName, Void* pAssemblySpec)
    Iteration 157
  2. pwsh exits with 134 (SIGABRT). This is accompanied by a stack overflow. This seems to be related to the bug "Stack overflow error and SIGABRT due to corrupted cache" #26431.
    Unfortunately this does not seem to create crash reports on macOS.
    # Output from the reproduction script
    Iteration 190
    Sigabort (134, cache corruption) in job. Output:
    Stack overflow.
    Sigabort (134, cache corruption) in job. Output:
    Stack overflow.
    Iteration 191
  3. pwsh exits with 139 (segmentation fault) and no output. I'm not sure if this is also due to the corrupt cache; in GitLab our problems always manifested themselves as the assembly load errors and, rarely, the stack overflow.
    There is an attached crash report of this happening.
    # Output from the reproduction script
    Iteration 77
    Segmentation fault (139) in job. Output:
    Iteration 78
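
As background for the exit codes above: a shell reports a process killed by signal N as exit status 128 + N, so 134 corresponds to signal 6 (SIGABRT) and 139 to signal 11 (SIGSEGV). A minimal sketch of that decoding, with the hypothetical helper name `describe_exit_status`:

```shell
#!/bin/sh
# Sketch only: translate an exit status into a short description.
# Statuses above 128 are treated as "killed by signal (status - 128)".
# describe_exit_status is a hypothetical helper name.
describe_exit_status() {
    code="$1"
    if [ "$code" -eq 0 ]; then
        echo "success"
    elif [ "$code" -gt 128 ]; then
        signal=$((code - 128))
        case "$signal" in
            6)  echo "killed by SIGABRT (signal 6)" ;;
            11) echo "killed by SIGSEGV (signal 11)" ;;
            *)  echo "killed by signal $signal" ;;
        esac
    else
        echo "exited with status $code"
    fi
}

describe_exit_status 134
describe_exit_status 139
```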

Environment data

$PSVersionTable

Name                           Value
----                           -----
PSVersion                      7.5.4
PSEdition                      Core
GitCommitId                    7.5.4
OS                             Darwin 25.0.0 Darwin Kernel Version 25.0.0: Wed Sep 17 21:39:53 PDT 2025; root:xnu-1237…
Platform                       Unix
PSCompatibleVersions           {1.0, 2.0, 3.0, 4.0…}
PSRemotingProtocolVersion      2.3
SerializationVersion           1.1.0.1
WSManStackVersion              3.0

OS: macOS Tahoe 26 26.0.1 x86_64
Host: iMac (Retina 5K, 27-inch, 2020)
Kernel: Darwin 25.0.0
Uptime: 12 hours, 29 mins
Packages: 118 (brew), 5 (brew-cask)
Shell: zsh 5.9
Display (iMac): 5120x2880 @ 60 Hz (as 2560x1440) in 27" [Built-in]
DE: Aqua
WM: Quartz Compositor 341.0.1
WM Theme: Multicolor (Light)
Font: .AppleSystemUIFont [System], Helvetica [User]
Cursor: Fill - Black, Outline - White (32px)
Terminal: /dev/ttys002
CPU: Intel(R) Core(TM) i7-10700K (16) @ 3.80 GHz
GPU 1: AMD Radeon Pro 5500 XT [Integrated]
GPU 2: Intel HD Graphics CFL [Integrated]
Memory: 5.15 GiB / 8.00 GiB (64%)
Swap: 58.25 MiB / 1.00 GiB (6%)
Disk (/): 350.46 GiB / 465.63 GiB (75%) - apfs [Read-only]
Locale: C.UTF-8

It's also reproducible on an M1 and an M4 Mac, but the cache seems more difficult to corrupt there.

$PSVersionTable

Name                           Value
----                           -----
PSVersion                      7.5.4
PSEdition                      Core
GitCommitId                    7.5.4
OS                             Darwin 25.0.0 Darwin Kernel Version 25.0.0: Wed Sep 17 21:41:39 PDT 2025; root:xnu-12377.1.9~141/RELEASE_ARM64_T8103
Platform                       Unix
PSCompatibleVersions           {1.0, 2.0, 3.0, 4.0…}
PSRemotingProtocolVersion      2.3
SerializationVersion           1.1.0.1
WSManStackVersion              3.0

OS: macOS Tahoe 26 26.0.1 arm64
Host: MacBook Air (M1, 2020)
Kernel: Darwin 25.0.0
Uptime: 12 hours, 42 mins
Packages: 172 (brew), 10 (brew-cask)
Shell: zsh 5.9
Display (Color LCD): 2880x1800 @ 60 Hz (as 1440x900) in 13" [Built-in]
DE: Aqua
WM: Quartz Compositor 341.0.1
WM Theme: Multicolor (Light)
Font: .AppleSystemUIFont [System], Helvetica [User]
Cursor: Fill - Black, Outline - White (32px)
Terminal: /dev/ttys000
CPU: Apple M1 (8) @ 3.20 GHz
GPU: Apple M1 (8) [Integrated]
Memory: 5.60 GiB / 8.00 GiB (70%)
Swap: 710.44 MiB / 2.00 GiB (35%)
Disk (/): 276.81 GiB / 460.43 GiB (60%) - apfs [Read-only]
Battery (bq20z451): 100% [AC connected]
Power Adapter: 30W USB-C Power Adapter
Locale: C.UTF-8

Visuals

Screenshot of the corruption happening with the repro script:
Image

Screenshot of the corruption happening in GitLab when trying to set up the shell session:
Image

Screenshot of the contents of a corrupted StartupProfileData-NonInteractive with a truncated assembly string:
Image

macOS.ips crash report files.
macOS crash report.zip

Screenshot of a stacktrace from the macOS crash reports
Image

