Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

fix: fix flake in TestExecutorAutostartSkipsWhenNoProvisionersAvailable#19478

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Merged
cstyan merged 6 commits intomainfromcallum/exec-autostart-stale-flake
Aug 28, 2025

Conversation

cstyan
Copy link
Contributor

The flake here had two causes:

  1. related to usage of time.Now() in MustWaitForProvisionersAvailable and
  2. the fact that UpdateProvisionerLastSeenAt can not use a time that is further in the past than the current LastSeenAt time

Previously the test here was callingcoderdtest.MustWaitForProvisionersAvailable which was usingtime.Now rather than the next tick time like the realhasProvisionersAvailable function does. Additionally, when usingUpdateProvisionerLastSeenAt the underlying db query enforces that the time we're trying to setLastSeenAt to cannot be older than the current value.

I was able to reliably reproduce the flake by executing both theUpdateProvisionerLastSeenAt call andtickCh <- next in their own goroutines, the former with a small sleep to reliably ensure we'd trigger the autobuild before we set theLastSeenAt time. That's when I also noticed thatcoderdtest.MustWaitForProvisionersAvailable was usingtime.Now instead of the tick time. When I updated that function to take in a tick time + added a 2nd call toUpdateProvisionerLastSeenAt to set an original non-stale time, we could then never get the test to pass because the later call to set the stale time would not actually modifyLastSeenAt. On top of that, calling the provisioner daemons closer in the middle of the function doesn't really do anything of value in this test.

The fix for the flake is to keep the go routines, ensuring there would be a flake if there was not a relevant fix, but to include the fix which is to ensure that we explicitly wait for the provisioner to be stale before passing the time totickCh.

related to usage of time.Now() in MustWaitForProvisionerAvailable andthe fact that UpdateProvisionerLastSeenAt can not use a time that isfurther in the past than the current LastSeenAt timeSigned-off-by: Callum Styan <callumstyan@gmail.com>
Comment on lines 1737 to 1738
require.NoError(t,err,"Error getting provisioner for workspace")
require.Equal(t,p.LastSeenAt.Time.UnixNano(),staleTime.UnixNano())
Copy link
Member

@ethanndicksonethanndicksonAug 22, 2025
edited
Loading

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Be mindful of the fact you can't callrequire methods in a goroutine other than the main test goroutine. They all callFailNow, and from the documentation:

FailNow must be called from the goroutine running the test or benchmark function, not from other goroutines created during the test. Calling FailNow does not stop those other goroutines.

https://pkg.go.dev/testing#F.FailNow

You can callassert functions, however!

Signed-off-by: Callum Styan <callumstyan@gmail.com>
provisionerSigned-off-by: Callum Styan <callumstyan@gmail.com>
transaction does not go through immediatelySigned-off-by: Callum Styan <callumstyan@gmail.com>
Comment on lines 1660 to 1680
require.Eventually(t,func()bool {
// Use the same logic as hasValidProvisioner but expect false
provisionerDaemons,err:=db.GetProvisionerDaemonsByOrganization(ctx, database.GetProvisionerDaemonsByOrganizationParams{
OrganizationID:workspace.OrganizationID,
WantTags:tags,
})
iferr!=nil {
returnfalse
}

// Check if NO provisioners are active (all are stale or gone)
for_,pd:=rangeprovisionerDaemons {
ifpd.LastSeenAt.Valid {
age:=checkTime.Sub(pd.LastSeenAt.Time)
ifage<=provisionerdserver.StaleInterval {
returnfalse// Found an active provisioner, keep waiting
}
}
}
returntrue// No active provisioners found
},testutil.WaitMedium,testutil.IntervalFast)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

suggest using methods intestutil/chan.go and passing in a parent context in the method signature

Copy link
ContributorAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I'm not sure what you're suggesting we actually change here since this set ofMustWaitForX functions don't make use of a channel.

Did you mean to leave this comment on one of thestats := <-statsCh lines that are in some of the actual test functions?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Sorry, I got mixed up here. The method I wanted to highlight (testutil.Eventually) is actually defined intestutil/eventually.go.

The problem with the below

ctx := testutil.Context(t, testutil.WaitMedium)<setup code>require.Eventually(t, func bool() {<assertions that should eventually be true>}, testutil.WaitMedium, testutil.IntervalFast)

is thatctx can reach its deadline before the condition inrequire.Eventually is met, and is therefore prone to flakiness.

My suggestion is instead to use

ctx := testutil.Context(t, testutil.WaitMedium)<setup code>testutil.Eventually(t, func bool() {<assertions that should eventually be true>}, testutil.IntervalFast)

This will adhere to the deadline imposed byctx and be less flake-prone. Apologies for the confusion.

Copy link
ContributorAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

No worries at all 👍 I refactored all of the test helpers I'd written in this file (MustWaitForX) to make use oftestutil.Eventually.

require.EventuallySigned-off-by: Callum Styan <callumstyan@gmail.com>
Signed-off-by: Callum Styan <callumstyan@gmail.com>
Copy link
Member

@johnstcnjohnstcn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

LGTM modulo formatting and nits

coderdtest.UpdateProvisionerLastSeenAt(t,db,p.ID,staleTime)
p,err=coderdtest.GetProvisionerForTags(db,time.Now(),workspace.OrganizationID,provisionerDaemonTags)
assert.NoError(t,err,"Error getting provisioner for workspace")
assert.Eventually(t,func()bool {returnp.LastSeenAt.Time.UnixNano()==staleTime.UnixNano() },testutil.WaitMedium,testutil.IntervalFast)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

nit: Suggest instead checking less than or equal to staleTime instead instead?

Copy link
ContributorAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

This should be fine,lastSeenAt should never be updated within the execution of this testother than via this direct call tocoderdtest.UpdateProvisionerLastSeenAt, since that ensures the provisioner is stale and the autobuild will then not execute. We want to ensure it's at leaststaleTime or very close to it, as when we create a provisioner via the various test helpers it gets a defaultLastSeenAt time of like January 2024.

We could make this more explicitly enforced by also double checking that in addition to thestats.Transitions length being 0, assert that theLastSeenAt time for this provisioner is still equal tostaleTime.

@cstyancstyan merged commit321c2b8 intomainAug 28, 2025
26 checks passed
@cstyancstyan deleted the callum/exec-autostart-stale-flake branchAugust 28, 2025 19:07
@github-actionsgithub-actionsbot locked and limited conversation to collaboratorsAug 28, 2025
Sign up for freeto subscribe to this conversation on GitHub. Already have an account?Sign in.

Reviewers

@ethanndicksonethanndicksonethanndickson left review comments

@johnstcnjohnstcnjohnstcn approved these changes

Assignees

@cstyancstyan

Labels

None yet

Projects

None yet

Milestone

No milestone

Development

Successfully merging this pull request may close these issues.

3 participants

@cstyan@johnstcn@ethanndickson

[8]ページ先頭

©2009-2025 Movatter.jp