experiment: provisionerdaemon - investigate intermittent job wait failure #146


bryphe-coder (Contributor) commented Feb 2, 2022

Investigating failures like this one, where it looks like the job is completed but the completion condition is never satisfied: https://github.com/coder/coder/runs/5043435263?check_suite_focus=true#step:7:32

Runs

Issue 1: Data race in p.acquiredJobDone context (provisioners.d)

Failure trace: https://github.com/coder/coder/runs/5044320845?check_suite_focus=true#step:7:84

There is a race on the p.acquiredJobDone chan. In particular, Close can be waiting for the channel to finish with <-p.acquiredJobDone while, in parallel, an acquireJob has started and created a new channel for p.acquiredJobDone.
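The hazard described above can be pictured with a small, deterministic sketch. This is not the provisionerd code: the daemon type, its mu field, and the helper names are hypothetical stand-ins, and the steps are run sequentially so the outcome is repeatable. It only shows that a channel stored in a struct field can be swapped out between the moment a waiter looks at it and the moment the job finishes.

```go
package main

import (
	"fmt"
	"sync"
)

// daemon is a hypothetical stand-in for the provisioner daemon; only the
// field involved in the race is modelled.
type daemon struct {
	mu              sync.Mutex
	acquiredJobDone chan struct{}
}

// acquireJob installs a fresh channel for each new job, replacing the old one.
func (d *daemon) acquireJob() chan struct{} {
	d.mu.Lock()
	defer d.mu.Unlock()
	d.acquiredJobDone = make(chan struct{})
	return d.acquiredJobDone
}

// currentDone returns whatever channel the field holds right now.
func (d *daemon) currentDone() chan struct{} {
	d.mu.Lock()
	defer d.mu.Unlock()
	return d.acquiredJobDone
}

// fieldWasSwapped replays the interleaving step by step: a waiter snapshots
// the channel, a second job is acquired, and the field no longer refers to
// the channel the waiter saw.
func fieldWasSwapped() bool {
	d := &daemon{}
	first := d.acquireJob()
	seenByWaiter := d.currentDone() // what a Close-like waiter would observe
	second := d.acquireJob()        // a new job starts in the meantime
	close(first)                    // the first job finishes
	return seenByWaiter == first && d.currentDone() == second && first != second
}

func main() {
	fmt.Println(fieldWasSwapped()) // prints "true"
}
```

In the real code the read and the replacement happen on different goroutines without a shared lock, which is what the race detector reports.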

The fix I tried was to also grab the acquiredJobMutex in the Close function. At first this caused a deadlock, because there was another place where the mutexes could be grabbed in the reverse order (acquiredJobMutex, then closeMutex). That other place, though, was only storing a bool in an atomic, so it didn't actually need the mutex guard.
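A minimal sketch of the lock-ordering idea in the paragraph above, under stated assumptions: the daemon type here is hypothetical, only the mutex and channel names come from the description, and the closed flag uses atomic.Bool from the standard library (Go 1.19 or newer) rather than whatever atomic type the real code uses. Close is the only place that holds both locks, always in the order closeMutex then acquiredJobMutex, and the acquire path reads the flag atomically so it never needs closeMutex.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// daemon is a hypothetical, reduced model; it is not the provisionerd type.
type daemon struct {
	closeMutex       sync.Mutex
	acquiredJobMutex sync.Mutex
	closed           atomic.Bool
	acquiredJobDone  chan struct{}
}

// acquireJob takes only acquiredJobMutex. The closed flag is read atomically,
// so this path never takes closeMutex and cannot invert the lock order.
func (d *daemon) acquireJob() (chan struct{}, bool) {
	d.acquiredJobMutex.Lock()
	defer d.acquiredJobMutex.Unlock()
	if d.closed.Load() {
		return nil, false
	}
	d.acquiredJobDone = make(chan struct{})
	return d.acquiredJobDone, true
}

// Close takes closeMutex first, then acquiredJobMutex, and snapshots the
// channel under the lock before waiting on it.
func (d *daemon) Close() {
	d.closeMutex.Lock()
	defer d.closeMutex.Unlock()
	d.closed.Store(true)

	d.acquiredJobMutex.Lock()
	done := d.acquiredJobDone
	d.acquiredJobMutex.Unlock()

	if done != nil {
		<-done
	}
}

// demo reports whether a job could be acquired before and after Close.
func demo() (bool, bool) {
	d := &daemon{}
	done, before := d.acquireJob()
	go func() { close(done) }() // the job finishes on another goroutine
	d.Close()
	_, after := d.acquireJob()
	return before, after
}

func main() {
	fmt.Println(demo()) // prints "true false"
}
```

Because the flag is set before Close takes acquiredJobMutex, any acquireJob that runs after the snapshot sees closed and does not install a new channel.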

Attempted fix here: 42ce721

Still hit a related race: https://github.com/coder/coder/runs/5044320845?check_suite_focus=true#step:7:84

So tried a second fix here: 84dd68a

The second fix didn't work, so trying to switch from chan -> wait group: a8725cd

  • Run 1: ✅
  • Run 2: peer failure
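For reference, the chan -> wait group idea can be sketched roughly as follows. This is an illustration of the general technique, not the contents of a8725cd: the daemon type, the jobs field, and the run callback are all hypothetical. A sync.WaitGroup is never replaced per job, so Close has a single stable object to wait on; the mutex ensures no Add happens once Close has marked the daemon closed.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// daemon is a hypothetical model that tracks in-flight jobs with a WaitGroup
// instead of a per-job done channel.
type daemon struct {
	mu     sync.Mutex
	closed bool
	jobs   sync.WaitGroup
}

// acquireJob registers the job under the mutex, then runs it on its own
// goroutine. It reports false if the daemon is already closed.
func (d *daemon) acquireJob(run func()) bool {
	d.mu.Lock()
	if d.closed {
		d.mu.Unlock()
		return false
	}
	d.jobs.Add(1) // Add happens before Close can reach Wait
	d.mu.Unlock()

	go func() {
		defer d.jobs.Done()
		run()
	}()
	return true
}

// Close stops new jobs from being accepted and waits for in-flight ones.
func (d *daemon) Close() {
	d.mu.Lock()
	d.closed = true
	d.mu.Unlock()
	d.jobs.Wait()
}

// runDemo starts three jobs, closes the daemon, and then tries one more.
func runDemo() (int64, bool) {
	d := &daemon{}
	var completed atomic.Int64
	for i := 0; i < 3; i++ {
		d.acquireJob(func() { completed.Add(1) })
	}
	d.Close()
	acceptedAfterClose := d.acquireJob(func() {})
	return completed.Load(), acceptedAfterClose
}

func main() {
	fmt.Println(runDemo()) // prints "3 false"
}
```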

bryphe-coder changed the base branch from main to provisionerdaemon on February 2, 2022 22:34
bryphe-coder marked this pull request as draft on February 2, 2022 22:34
codecov (bot) commented Feb 2, 2022

Codecov Report

Merging #146 (a8725cd) into provisionerdaemon (03ed951) will decrease coverage by 0.02%.
The diff coverage is 75.00%.


@@                  Coverage Diff                  @@
##           provisionerdaemon     #146      +/-   ##
=====================================================
- Coverage              67.35%   67.33%   -0.03%
=====================================================
  Files                    101      101
  Lines                   5098     5100       +2
  Branches                  68       68
=====================================================
  Hits                    3434     3434
+ Misses                  1357     1354       -3
- Partials                 307      312       +5

Flag                          Coverage Δ
unittest-go-macos-latest      63.48% <70.00%> (-0.71%) ⬇️
unittest-go-ubuntu-latest     66.55% <75.00%> (+0.30%) ⬆️
unittest-go-windows-latest    63.67% <70.00%> (+0.09%) ⬆️
unittest-js                   64.92% <ø> (ø)

Impacted Files                        Coverage Δ
provisionerd/provisionerd.go          72.38% <70.00%> (-0.25%) ⬇️
provisioner/terraform/provision.go    76.02% <100.00%> (+0.33%) ⬆️
peer/conn.go                          75.19% <0.00%> (-3.62%) ⬇️
peer/channel.go                       84.14% <0.00%> (-3.05%) ⬇️
coderd/provisionerdaemons.go          47.33% <0.00%> (+3.68%) ⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 03ed951...a8725cd.

bryphe-coder self-assigned this on Feb 2, 2022
bryphe-coder (Contributor, Author) commented

Distilled this out into #148 and #149

kylecarbs deleted the bryphe/provisionerdaemon/history-failure branch on March 23, 2022 16:25