Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

feat: track resource replacements when claiming a prebuilt workspace#17571

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Merged
dannykopping merged 38 commits intomainfromdk/logreplacements
May 14, 2025

Conversation

dannykopping
Copy link
Contributor

@dannykoppingdannykopping commentedApr 25, 2025
edited
Loading

Closescoder/internal#369

We can't know whether a replacement (i.e. drift of terraform state leading to a resource needing to be deleted/recreated) will take place apriori; we can only detect it atplan time, because the provider decides whether a resource must be replaced and it cannot be inferred through static analysis of the template.

This is likely to be the most common gotcha with using prebuilds, since it requires a slight template modification to use prebuilds effectively, so let's head this off before it's an issue for customers.

Drift details will now be logged in the workspace build logs:

image

Plus a notification will be sent to template admins when this situation arises:

image

A new metric -coderd_prebuilt_workspaces_resource_replacements_total - will also increment each time a workspace encounters replacements.

We only trackthat a resource replacement occurred, not how many. Just one is enough to ruin a prebuild, but we can't know apriori which replacement would cause this.
For example, say we have 2 replacements: adocker_container and anull_resource; we don't know which one might
cause an issue (or indeed if either would), so we just track the replacement.

@github-actionsgithub-actionsbot added the staleThis issue is like stale bread. labelMay 6, 2025
@dannykoppingdannykoppingforce-pushed thedk/logreplacements branch 3 times, most recently from4859410 to634082bCompareMay 7, 2025 21:06
@github-actionsgithub-actionsbot removed the staleThis issue is like stale bread. labelMay 8, 2025
@@ -75,6 +75,7 @@ message CompletedJob {
repeated provisioner.Resource resources = 2;
repeated provisioner.Timing timings = 3;
repeated provisioner.Module modules = 4;
repeated provisioner.ResourceReplacement resourceReplacements = 5;
Copy link
ContributorAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I believe this is backwards-compatible sincerepeated fields are effectivelyoptional if nil length.

// nolint:gocritic // Necessary to query all the required data.
ctx = dbauthz.AsSystemRestricted(ctx)
// Since this may be called in a fire-and-forget fashion, we need to give up at some point.
trackCtx, trackCancel := context.WithTimeout(ctx, time.Minute)
Copy link
ContributorAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

This is a best-effort attempt to warn operators of this situation; it's ok if it times out, we'll get a log to trace this with.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I see this partially covers my earlier comment. I still think it'd be good to take shutdowns into consideration for these and define what that behavior should be (rather than undefined). Right now these routines will be rudely interrupted during shutdown rather than exiting cleanly. Likewise these can be left running even if a CompleteJob is interrupted.

Copy link
ContributorAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Right now these routines will be rudely interrupted during shutdown rather than exiting cleanly.

Good call! I've now passed in theprovisionerdserver's lifecycle context.

Likewise these can be left running even if a CompleteJob is interrupted.

I believe that's already the case?

@@ -258,7 +258,7 @@ func getStateFilePath(workdir string) string {
}

// revive:disable-next-line:flag-parameter
func (e *executor) plan(ctx, killCtx context.Context, env, vars []string, logr logSink,destroy bool) (*proto.PlanComplete, error) {
func (e *executor) plan(ctx, killCtx context.Context, env, vars []string, logr logSink,metadata *proto.Metadata) (*proto.PlanComplete, error) {
Copy link
ContributorAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

This is the meat of the feature; everything else is just plumbing between system and user eyeball.

level := proto.LogLevel_INFO

// Terraform indicates that a resource will be deleted and recreated by showing the change along with this substring.
if bytes.Contains(line, []byte("# forces replacement")) {
Copy link
ContributorAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

A bit flimsy; open to other ideas.
In any case, this is just sugar. The fact that the plan, with all its drift details, are shown will be sufficient. Highlighting the lines is just a courtesy to the user.

@@ -13,43 +16,65 @@ import (
"github.com/coder/coder/v2/coderd/prebuilds"
)

const (
Copy link
ContributorAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Muddies the purpose of the PR a bit, but it was a worthwhile driveby refactoring given that we're adding a new metric (MetricResourceReplacementsCount) and we need to check for its value in a test.

@dannykoppingdannykopping changed the titleWIP! feat: log resource replacements when claiming a prebuilt workspacefeat: track resource replacements when claiming a prebuilt workspaceMay 8, 2025
Signed-off-by: Danny Kopping <dannykopping@gmail.com>
Signed-off-by: Danny Kopping <dannykopping@gmail.com>
Signed-off-by: Danny Kopping <dannykopping@gmail.com>
…replacement(s)Signed-off-by: Danny Kopping <dannykopping@gmail.com>
Signed-off-by: Danny Kopping <dannykopping@gmail.com>
also a test that was broken from an earlier fixSigned-off-by: Danny Kopping <dannykopping@gmail.com>
Signed-off-by: Danny Kopping <dannykopping@gmail.com>
Signed-off-by: Danny Kopping <dannykopping@gmail.com>
Signed-off-by: Danny Kopping <dannykopping@gmail.com>
Signed-off-by: Danny Kopping <dannykopping@gmail.com>
Signed-off-by: Danny Kopping <dannykopping@gmail.com>
Signed-off-by: Danny Kopping <dannykopping@gmail.com>
Signed-off-by: Danny Kopping <dannykopping@gmail.com>
Signed-off-by: Danny Kopping <dannykopping@gmail.com>
@dannykoppingdannykopping marked this pull request as ready for reviewMay 8, 2025 12:59
Copy link
Member

@mafredrimafredri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

No major issues uncovered, my main concern is the goroutine management in CompleteJob, other than that mostly minor nits and suggestions.

dannykopping added a commit that referenced this pull requestMay 12, 2025
…17757)Used in combination withcoder/terraform-provider-coder#396This is required by both#17475 and#17571Operators may need to conditionalize their templates to perform certainoperations once a prebuilt workspace has been claimed. This value will**only** be set once a claim takes place and a subsequent `terraformapply` occurs. Any `terraform apply` runs thereafter will beindistinguishable from a normal run on a workspace.---------Signed-off-by: Danny Kopping <dannykopping@gmail.com>
Signed-off-by: Danny Kopping <dannykopping@gmail.com>
Signed-off-by: Danny Kopping <dannykopping@gmail.com>
Signed-off-by: Danny Kopping <dannykopping@gmail.com>
Signed-off-by: Danny Kopping <dannykopping@gmail.com>
Signed-off-by: Danny Kopping <dannykopping@gmail.com>
Signed-off-by: Danny Kopping <dannykopping@gmail.com>
Signed-off-by: Danny Kopping <dannykopping@gmail.com>
Copy link
Member

@mafredrimafredri left a comment
edited
Loading

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

A few more comments but my main concern is around using theatomic.Pointer for the notifier, I don't really see a reason for it in which case I think the code should be simplified.

The other remaining concern is around goroutine/shutdown which could lead to test flakes (leak detection).

Signed-off-by: Danny Kopping <dannykopping@gmail.com>
… APIavoid all this pointer jugglingSigned-off-by: Danny Kopping <dannykopping@gmail.com>
Signed-off-by: Danny Kopping <dannykopping@gmail.com>
…nywaySigned-off-by: Danny Kopping <dannykopping@gmail.com>
Copy link
Member

@mafredrimafredri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Nice work!

Signed-off-by: Danny Kopping <dannykopping@gmail.com>
Copy link
Contributor

@spikecurtisspikecurtis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

My review limited to the protobuf / version changes.

@dannykoppingdannykoppingenabled auto-merge (squash)May 14, 2025 09:51
@dannykoppingdannykopping merged commit6e96778 intomainMay 14, 2025
39 checks passed
@dannykoppingdannykopping deleted the dk/logreplacements branchMay 14, 2025 12:52
@github-actionsgithub-actionsbot locked and limited conversation to collaboratorsMay 14, 2025
Sign up for freeto subscribe to this conversation on GitHub. Already have an account?Sign in.
Reviewers

@evgeniy-scherbinaevgeniy-scherbinaevgeniy-scherbina left review comments

@github-actionsgithub-actions[bot]github-actions[bot] left review comments

@mafredrimafredrimafredri approved these changes

@spikecurtisspikecurtisspikecurtis approved these changes

@SasSwartSasSwartAwaiting requested review from SasSwart

@johnstcnjohnstcnAwaiting requested review from johnstcnjohnstcn is a code owner

Assignees

@dannykoppingdannykopping

Labels
None yet
Projects
None yet
Milestone
No milestone
Development

Successfully merging this pull request may close these issues.

Inform template admins that resources will be replaced
4 participants
@dannykopping@mafredri@spikecurtis@evgeniy-scherbina

[8]ページ先頭

©2009-2025 Movatter.jp