- Notifications
You must be signed in to change notification settings - Fork1k
fix: don't create autostart workspace builds with no available provisioners#19067
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.
Already on GitHub?Sign in to your account
Uh oh!
There was an error while loading.Please reload this page.
Conversation
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
provisioner is not availableSigned-off-by: Callum Styan <callumstyan@gmail.com>
transition so that we don't accumulate autostart jobs that willeventually have to be cleaned up by the reaper routineSigned-off-by: Callum Styan <callumstyan@gmail.com>
DB/provisioner to be available, so add a way to skip the checkSigned-off-by: Callum Styan <callumstyan@gmail.com>
Signed-off-by: Callum Styan <callumstyan@gmail.com>
…ndlingThis commit addresses the requirements to:1. Comment out provisionerCloser.Close() in TestExecutorAutostartSkipsWhenNoProvisionersAvailable2. Remove the need for SkipProvisionerCheck by refactoring testsChanges:- Remove SkipProvisionerCheck field and logic from Executor- Remove SkipProvisionerCheck option from coderdtest.Options- Add proper provisioner daemon tags to match template requirements- Add helper functions for waiting for provisioner availability- Update tests to use real provisioner availability checks- Fix timing issues in provisioner staleness tests- Comment out provisioner close call to test proper behaviorThe test now correctly validates that provisioners remain availablewhen not explicitly closed, and uses the real hasAvailableProvisionerslogic instead of bypassing it.Signed-off-by: Callum Styan <callumstyan@gmail.com>
9971794
to78a8f5e
Compareautobuild path have a valid provisioner available and pass anappropriate time.Time to the channel for triggering the build based onwhether the provisioner is supposed to be valid or staleSigned-off-by: Callum Styan <callumstyan@gmail.com>
Okay@deansheather thanks for your patience, I think this is finally ready for another review. Took quite a bit of digging with the help of Blink to find all the bits and pieces that had to be updated to get the tests working without the The TL; DR of the changes is that we in tests which deal with autobuilds we need:
|
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
TestExecutorAutostartSkipsWhenNoProvisionersAvailableSigned-off-by: Callum Styan <callumstyan@gmail.com>
Signed-off-by: Callum Styan <callumstyan@gmail.com>
Signed-off-by: Callum Styan <callumstyan@gmail.com>
Signed-off-by: Callum Styan <callumstyan@gmail.com>
any before thatSigned-off-by: Callum Styan <callumstyan@gmail.com>
Signed-off-by: Callum Styan <callumstyan@gmail.com>
p,err:=coderdtest.GetProvisionerForTags(db,time.Now(),workspace.OrganizationID,map[string]string{}) | ||
require.NoError(t,err) | ||
// When: the autobuild executor ticks after the scheduled time | ||
gofunc() { | ||
tickCh<-sched.Next(workspace.LatestBuild.CreatedAt) | ||
tickTime:=sched.Next(workspace.LatestBuild.CreatedAt) | ||
coderdtest.UpdateProvisionerLastSeenAt(t,db,p.ID,tickTime) | ||
tickCh<-tickTime |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
W.R.T the last comment from the previous review, this is what I'm referring to. Is this change (and other similar changes) necessary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Yes, AFAICT it is. Otherwise the test is flakey, it would only pass if you run the test withinStaleInterval
seconds of the top of any hour.
When one of these tests starts, the provisioner has aLastSeenAt
time oftime.Now()
(sometimes + 1 second or so), butsched.Next(workspace.LatestBuild.CreatedAt)
will return a value that is the top of the hour.
For example: right now if I comment out theUpdateProvisionerLastSeenAt
line here, inhasValidProvisioner
thepd.LastSeenAt
time is2025-08-12 16:34:02.908042 +0000 UTC
butt
from thetickCh
is2025-08-12 17:00:00 +0000 UTC
. So in a real running deployment of Coder that provisioner would be considered stale.
This is the exact thing we were trying to solve for with the addition ofhasValidProvisioner
. So any test case which utilized the autobuild functionality but needed to continue passing needed to update theLastSeenAt
value. In the test I have added, which should see the provisioner as stale and then assert that there were no autobuild transitions initiated, we ensure theLastSeenAt
time is far enough in the past that the provisioner is stale.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Ahhh I get it, because the autostart time is way in the future.
Perhaps a better option for these autostarty tests to set a value on the lifecycle executor likeTestIgnoreNoDaemons: true
, and then just set it to true for these tests. Avoids having to get the provisioners and update them right before ticking, which feels like it could be racey
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Isn't that essentially just going back to what I had originally?#19067 (comment)
I guess we're doing the configuration in a cleaner way, but we'd still be setting that value to true for essentially all of these tests, which is what we wanted to change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Yeah I think I may have looped... sorry. I think what you have now is good then 👍
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
new testSigned-off-by: Callum Styan <callumstyan@gmail.com>
Signed-off-by: Callum Styan <callumstyan@gmail.com>
Signed-off-by: Callum Styan <callumstyan@gmail.com>
nil hereSigned-off-by: Callum Styan <callumstyan@gmail.com>
Signed-off-by: Callum Styan <callumstyan@gmail.com>
6c902a7
intomainUh oh!
There was an error while loading.Please reload this page.
This shouldfix#17941
With just the test addition the test will fail since the workspace build is triggered, we attempt to move the workspace into the
start
state:require.Equal(t, codersdk.WorkspaceTransitionStop, workspace.LatestBuild.Transition)
With the 2nd commit we skip the calls to
wsbuilder.New
andbuilder.Build
and so the transition is never scheduled.However, many tests don't fully setup provisioner or wait for the provisioner + DB to be fully available, so the third commit adds a somewhat smelly way to skip the provisioner check in tests that don't need to test that functionality.