- Notifications
You must be signed in to change notification settings - Fork1.1k
fix(coderd/prometheusmetrics)!: filter deleted wsbuilds to reduce db load#19197
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.
Already on GitHub?Sign in to your account
Uh oh!
There was an error while loading.Please reload this page.
Conversation
b8a3784 to068dbd6Compare| require.NoError(t,err) | ||
| forrange1+n { | ||
| insertDeleted(t,db,u,org) | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Review: This fails in the old implementation, but I thought it best to formalize the assumption. Considering the old implementation did not test for this, it seems like validation for the new approach.
This change removes the `GetLatestWorkspaceBuilds` query which includesall workspaces for all time (including deleted). This allows us to alsostop using `GetProvisionerJobsByIDs` for said builds as the job statusis included in `GetWorkspaces` called separately.Fixescoder/internal#717
068dbd6 to7b1fc44Comparespikecurtis commentedAug 6, 2025
Soft deleted workspaces have been around forever, as has this metric. I wonder if@kylecarbs can reach back that far in his memory and recall whether it was a deliberate decision to incluce builds from deleted workspaces in the The description of the metric talks about "current" number of workspace builds by status, which makes me think it was just an oversight. Since we track the latest build per workspace, not just all builds, the metric isn't useful for tracking an overall rate of succeeding / failing builds. My take is that the most obvious interpretation is that it's for non-deleted workspaces and this change is fine. |
mafredri commentedAug 6, 2025
That is the conclusion I also came to when I read the descriptions, which is why I went with this approach. If need be, we can include deleted builds with a specialized query that selects the least amount of data needed. Previously we selected all data only to use a single column which is why optimizations were not possible. |
johnstcn left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
This change makes sense to me.
dannykopping left a comment• edited
Loading Uh oh!
There was an error while loading.Please reload this page.
edited
Uh oh!
There was an error while loading.Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Seems reasonable 👍
Please ensure the semantics of this change are documented in the release notes for the next release.
We could clarify the definition of this metric too, which may help operators:
https://coder.com/docs/admin/integrations/prometheus#available-metrics
Currently it's quite brief: "The latest workspace builds with a status."
Uh oh!
There was an error while loading.Please reload this page.
mafredri commentedAug 8, 2025
I've updated the PR to be a "breaking" change as well as describe the changed behavior. I've also updated the helps for the metrics to mention that it's for non-deleted workspaces. |
1b66495 intomainUh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
This change removes the
GetLatestWorkspaceBuildsquery which includesall workspaces for all time (including deleted). This allows us to also
stop using
GetProvisionerJobsByIDsfor said builds as the job statusis included in
GetWorkspacescalled separately.BREAKING CHANGE: The
coderd_api_workspace_latest_buildPrometheusmetric no longer includes builds belonging to deleted workspaces, as
such, this metric will show fewer statuses.
Fixescoder/internal#717