chore: optimize GetPrebuiltWorkspaces query #18717

Merged

Conversation

@johnstcn (Member) commented Jul 2, 2025

Hopefully fixes coder/internal#715 this time.

Second attempt at this.
Previous attempt incorrectly returned all rows for which there existed a prebuild that had previously had a successful start transition at some point: #18588
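
To make the difference concrete, here is a minimal Go sketch of the two behaviors. It is not the actual query: the `build` type and the `everStarted` / `currentlyRunning` helpers are hypothetical and only model "any build was a successful start" versus "the latest build is a successful start".

```go
package main

import "fmt"

// build is a hypothetical, simplified stand-in for a workspace build row.
type build struct {
	WorkspaceID string
	BuildNumber int
	Transition  string // "start", "stop", or "delete"
	JobStatus   string // "succeeded", "failed", "canceled", ...
}

// isSuccessfulStart reports whether a build is a start that succeeded.
func isSuccessfulStart(b build) bool {
	return b.Transition == "start" && b.JobStatus == "succeeded"
}

// everStarted models the over-matching behavior: a workspace matches if any
// of its builds was a successful start, even if it has since been stopped.
func everStarted(builds []build) map[string]bool {
	out := map[string]bool{}
	for _, b := range builds {
		if isSuccessfulStart(b) {
			out[b.WorkspaceID] = true
		}
	}
	return out
}

// currentlyRunning keeps only workspaces whose latest build (highest build
// number) is a successful start.
func currentlyRunning(builds []build) map[string]bool {
	latest := map[string]build{}
	for _, b := range builds {
		if cur, ok := latest[b.WorkspaceID]; !ok || b.BuildNumber > cur.BuildNumber {
			latest[b.WorkspaceID] = b
		}
	}
	out := map[string]bool{}
	for id, b := range latest {
		if isSuccessfulStart(b) {
			out[id] = true
		}
	}
	return out
}

func main() {
	builds := []build{
		{"ws-a", 1, "start", "succeeded"},
		{"ws-b", 1, "start", "succeeded"},
		{"ws-b", 2, "stop", "succeeded"}, // ws-b is no longer running
	}
	fmt.Println("ever started:", len(everStarted(builds)))
	fmt.Println("currently running:", len(currentlyRunning(builds)))
}
```

In this sketch `ws-b` is counted by `everStarted` but not by `currentlyRunning`, which is the kind of extra row the previous attempt returned.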

Explain before (71.1ms): https://explain.dalibo.com/plan/9be18ab833b7a000
Explain after (9.8ms): https://explain.dalibo.com/plan/b4a94742gaha229g (EDIT: not selecting from correct CTE)
Explain after (11.2ms): https://explain.dalibo.com/plan/bea42b563ff7fbe7

Manually verified against dogfood db:

$ psql '<db url>' -f a.sql > a.txt
$ psql '<db url>' -f b.sql > b.txt
$ diff a.txt b.txt
28d27
<  18aa3c11-c1eb-4fc6-ae3b-8cbd01f81f21 | prebuild-hiyghekkb3gv2vq | 99064381-8750-407f-9d30-91f38b7911cc | 5d91f37e-5872-4821-b6ed-90574082628e | bdc27adb-7a8e-4a6f-9922-5a0f4a3885da | f     | 2025-07-02 13:34:58.537656+00
32a32
>  18aa3c11-c1eb-4fc6-ae3b-8cbd01f81f21 | prebuild-hiyghekkb3gv2vq | 99064381-8750-407f-9d30-91f38b7911cc | 5d91f37e-5872-4821-b6ed-90574082628e | bdc27adb-7a8e-4a6f-9922-5a0f4a3885da | f     | 2025-07-02 13:34:58.537656+00
$ diff <(sort a.txt) <(sort b.txt)
$

The diff is only due to unstable row order. I also added additional testing in coderd/database/querier_test.go to validate the changes.

If we want to be super careful about this, I can instead break out the updated query into a new function and diff the old versus the new on each reconcile call.
EDIT: decided to go ahead with this for safety.
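
As a rough illustration of that "run both, trust the old one, report differences" approach, here is a minimal Go sketch. The `queryFn` type and the `shadowCompare` and `equal` helpers are assumptions made for this example and are not the functions added in this PR.

```go
package main

import (
	"errors"
	"fmt"
)

// queryFn is a hypothetical stand-in for a database call returning row IDs.
type queryFn func() ([]string, error)

// equal reports whether two slices hold the same IDs in the same order.
func equal(a, b []string) bool {
	if len(a) != len(b) {
		return false
	}
	for i := range a {
		if a[i] != b[i] {
			return false
		}
	}
	return true
}

// shadowCompare always returns the trusted query's result. The candidate
// runs alongside it, and any failure or mismatch is only passed to report.
func shadowCompare(trusted, candidate queryFn, report func(string)) ([]string, error) {
	want, err := trusted()
	if err != nil {
		return nil, err
	}
	got, err := candidate()
	if err != nil {
		report("candidate query failed: " + err.Error())
		return want, nil
	}
	if !equal(want, got) {
		report(fmt.Sprintf("results differ: trusted=%v candidate=%v", want, got))
	}
	return want, nil
}

func main() {
	trusted := func() ([]string, error) { return []string{"a", "b"}, nil }
	broken := func() ([]string, error) { return nil, errors.New("boom") }

	rows, err := shadowCompare(trusted, broken, func(msg string) {
		fmt.Println("report:", msg)
	})
	fmt.Println(rows, err)
}
```

The point of this shape is that a bug in the candidate can only produce a log line, never a behavior change, until the comparison has been quiet for long enough to switch over.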

@johnstcn self-assigned this Jul 2, 2025
@johnstcn marked this pull request as ready for review July 2, 2025 14:20
@@ -60,7 +111,9 @@ SELECT
 FROM workspace_prebuilds p
 INNER JOIN workspace_latest_builds b ON b.workspace_id = p.id
 WHERE (b.transition = 'start'::workspace_transition
-AND b.job_status = 'succeeded'::provisioner_job_status);
+AND b.job_status = 'succeeded'::provisioner_job_status)
+ORDER BY p.id;
@johnstcn (Member, Author):

review: adding stable ordering for diffing
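
For comparison, when neither query guarantees an order, the same check can be done by sorting copies of both result sets first, which is what `diff <(sort a.txt) <(sort b.txt)` does on the command line. A minimal Go sketch, where `sameRows` is a hypothetical helper and rows are modeled as plain strings:

```go
package main

import (
	"fmt"
	"sort"
)

// sameRows reports whether a and b contain the same rows, ignoring order.
// It sorts copies, so the callers' slices are left untouched.
func sameRows(a, b []string) bool {
	if len(a) != len(b) {
		return false
	}
	as := append([]string(nil), a...)
	bs := append([]string(nil), b...)
	sort.Strings(as)
	sort.Strings(bs)
	for i := range as {
		if as[i] != bs[i] {
			return false
		}
	}
	return true
}

func main() {
	original := []string{"row-1", "row-2", "row-3"}
	optimized := []string{"row-3", "row-1", "row-2"} // same rows, different order
	fmt.Println(sameRows(original, optimized))
}
```

Adding `ORDER BY p.id` to the query itself avoids the need for this step, at the cost of a sort in the database.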

@johnstcn force-pushed the cj/chore/GetRunningPrebuiltWorkspace-Optimize-2 branch from da75520 to 53c3ba5 July 3, 2025 13:08
@johnstcn requested a review from Copilot July 3, 2025 16:06
@Copilot (AI, Contributor) left a comment

Pull Request Overview

This PR introduces and validates an optimized version of the GetRunningPrebuiltWorkspaces query, compares its output against the original for correctness, and wires it through the reconciler, mocks, metrics, and DB layers.

  • Add a new optimized SQL query and corresponding Go method (GetRunningPrebuiltWorkspacesOptimized)
  • Integrate a comparator in the reconciler to log differences between original and optimized results
  • Expand tests and mocks to cover the new optimized method

Reviewed Changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| enterprise/coderd/prebuilds/reconcile_test.go | Added antagonists setup and CompareGetRunningPrebuiltWorkspacesResults tests |
| enterprise/coderd/prebuilds/reconcile.go | Invoke optimized query, compare results, and log diffs |
| coderd/database/queries/prebuilds.sql | Defined GetRunningPrebuiltWorkspacesOptimized CTE and ordered original query |
| coderd/database/queries.sql.go | Added constant, row type, and method for optimized query |
| coderd/database/querier.go | Extended interface with optimized method |
| coderd/database/dbmock/dbmock.go | Mock recorder for optimized method |
| coderd/database/dbmetrics/querymetrics.go | Metrics wrapper for optimized method |
| coderd/database/dbmem/dbmem.go | Stub (panicking) implementation for optimized method |
| coderd/database/dbauthz/setup_test.go | Skipped optimized method in authz tests |
| coderd/database/dbauthz/dbauthz_test.go | Excluded optimized method from recursive authz test |
| coderd/database/dbauthz/dbauthz.go | Added authorization wrapper for optimized method |
Comments suppressed due to low confidence (1)

coderd/database/querier_test.go:5025

  • Consider adding a parallel test for 'GetRunningPrebuiltWorkspacesOptimized' to verify the optimized SQL returns the same results as the original under PostgreSQL.
// TestGetRunningPrebuiltWorkspaces ensures the correct behavior of the

@mafredri (Member) left a comment

Looks mostly alright, mainly just questioning the purpose/point of the workspace_latest_presets CTE.

WHERE workspace_builds.transition = 'start'::workspace_transition
AND workspace_builds.template_version_preset_id IS NOT NULL
ORDER BY latest_prebuilds.id, workspace_builds.build_number DESC
),
Member:

I can't quite grasp the purpose of this CTE; it seems to do pretty much the same as latest_prebuilds. It selects distinct on workspace ID from latest_prebuilds and joins workspace_builds on workspace ID (we already have this data in the previous CTE, no?), ordering by build number DESC. Combined with distinct on, that's equivalent to selecting directly from workspace_latest_builds, right? If we can't select the template_version_preset_id directly from workspace_latest_builds, then it would still be more performant to join on build ID rather than workspace ID and avoid the distinct/order/where-transition clause.

It might make more sense to just join workspace_builds in latest_prebuilds instead to get template_version_preset_id if required, and remove this CTE.

@johnstcn (Member, Author):

When I omitted this CTE it caused different results for one specific prebuild instance, but I'm now not seeing the same discrepancy when I run the queries side-by-side. I'll try the modifications you suggest and see if it still yields the same results.

@johnstcn (Member, Author):

The more I look at that CTE, the more I question it too. On the dogfood database, all of the workspace builds owned by the prebuild user have a non-null template_version_preset_id, whereas almost all subsequent ones have a null preset ID.

Contributor:

AFAIU, whenever a template version is added, new presets are inserted into the database. I don't think we ever delete presets from the database. Except maybe when a template is deleted?
A prebuild should always correlate to a non-null preset. Could this be an issue when the active template version changes? 🤔

@johnstcn (Member, Author), Jul 8, 2025:

After spending some more time looking at this, it appears that the current correct approach is to query the most recent non-null preset ID for a successful start transition. Simply selecting from workspace_latest_builds may not result in the correct preset_id. I discussed with Sas and a future change may result in preset_id not being a thing any more.
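
As an illustration of "most recent non-null preset ID", here is a minimal Go sketch that mirrors the filter in the CTE snippet quoted above (start transition, non-null preset, highest build number per workspace). The `build` type and the `latestPresets` helper are hypothetical, and job status is deliberately not modeled.

```go
package main

import "fmt"

// build is a hypothetical, simplified stand-in for a workspace_builds row.
type build struct {
	WorkspaceID string
	BuildNumber int
	Transition  string
	PresetID    *string // nil models a NULL template_version_preset_id
}

// latestPresets returns, per workspace, the preset ID of the start build
// with the highest build number among builds that have a non-nil preset.
// This plays the role of DISTINCT ON (workspace) ... ORDER BY build_number DESC.
func latestPresets(builds []build) map[string]string {
	best := map[string]build{}
	for _, b := range builds {
		if b.Transition != "start" || b.PresetID == nil {
			continue
		}
		if cur, ok := best[b.WorkspaceID]; !ok || b.BuildNumber > cur.BuildNumber {
			best[b.WorkspaceID] = b
		}
	}
	out := make(map[string]string, len(best))
	for id, b := range best {
		out[id] = *b.PresetID
	}
	return out
}

func main() {
	p1, p2 := "preset-1", "preset-2"
	builds := []build{
		{"ws-a", 1, "start", &p1},
		{"ws-a", 2, "stop", nil},
		{"ws-a", 3, "start", nil}, // latest build has no preset ID
		{"ws-b", 1, "start", &p1},
		{"ws-b", 2, "start", &p2},
	}
	fmt.Println(latestPresets(builds))
}
```

For `ws-a` the latest build carries no preset, so reading only the latest build would lose `preset-1`; looking back to the most recent non-null value recovers it.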

Contributor:

Prebuilt workspaces should ideally never change their preset. This means that the first build should contain a preset id and all subsequent builds should have a null preset ID. We could optimise this by dropping the workspace_latest_presets CTE from the query and instead rely on the preset ID for build 0 in all cases.

This would probably be much more performant.

The issue with this optimisation is the potentially negligent edge case where a human being restarts a prebuilt workspace with a new preset. In the optimized case such a restarted prebuilt workspace would be claimed with the wrong preset.

If we're willing to tolerate this edge case, we can optimize by dropping the CTE. Otherwise, the CTE remains necessary in order to find the last set preset id for a prebuilt workspace.
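
The trade-off can be sketched in a few lines of Go. Both helpers below are hypothetical and operate on a single workspace's build history; `firstBuildPreset` stands for the cheaper "trust the first build" strategy and `lastSetPreset` for what the CTE computes.

```go
package main

import "fmt"

// build is a hypothetical, simplified build record for one workspace.
type build struct {
	BuildNumber int
	PresetID    string // "" models a NULL preset ID
}

// firstBuildPreset returns the preset ID of the lowest-numbered build.
func firstBuildPreset(builds []build) string {
	if len(builds) == 0 {
		return ""
	}
	first := builds[0]
	for _, b := range builds[1:] {
		if b.BuildNumber < first.BuildNumber {
			first = b
		}
	}
	return first.PresetID
}

// lastSetPreset returns the preset ID of the highest-numbered build that
// has a preset ID set, or "" if no build has one.
func lastSetPreset(builds []build) string {
	preset, bestNumber := "", -1
	for _, b := range builds {
		if b.PresetID != "" && b.BuildNumber > bestNumber {
			preset, bestNumber = b.PresetID, b.BuildNumber
		}
	}
	return preset
}

func main() {
	untouched := []build{{1, "preset-a"}, {2, ""}}
	restarted := []build{{1, "preset-a"}, {2, "preset-b"}, {3, ""}}

	fmt.Println(firstBuildPreset(untouched), lastSetPreset(untouched))
	fmt.Println(firstBuildPreset(restarted), lastSetPreset(restarted))
}
```

The two strategies agree on the untouched history and disagree only on the restarted one, which is the edge case being weighed here.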

@johnstcn (Member, Author):

I'll leave the CTE for now. There is an argument to be made that once a human touches a prebuild, it's no longer valid to be claimed, and as such only prebuilds with build_number = 1 should be claimable.

But what do we say to the God of Arguments? Not today.

Contributor:

> Prebuilt workspaces should ideally never change their preset. This means that the first build should contain a preset id and all subsequent builds should have a null preset ID.

Why do we nullify the preset_id in subsequent builds? Conceptually, a prebuilt workspace is always associated with a preset, and clearing the field might imply the association no longer exists, which isn't quite accurate. Wouldn’t it make more sense to retain the preset_id to reflect this persistent relationship?

> The issue with this optimisation is the potentially negligent edge case where a human being restarts a prebuilt workspace with a new preset.

How would this actually happen? Is it really possible for a prebuilt workspace to be restarted with a different preset? If we consider the association to the preset immutable once the prebuilt workspace is created, this edge case shouldn’t occur, right? Or is there a specific workflow that allows this to happen?

Either way, this is not a blocker for the PR, we can definitely move forward and revisit this discussion later if needed.

@SasSwart (Contributor), Jul 9, 2025:

> nullify the preset_id in subsequent builds

We don't explicitly nullify them. They are null on subsequent builds because there's nothing to tell a subsequent build what the preset_id should be. We could transfer the preset id from one build to the next, but that would roughly look like running the CTE that we're discussing here on every new build. Presets are set initially and when they change by specification of the user. For normal workspaces this user is a human being who chooses a preset. For prebuilt workspaces this user is the prebuilds system user and they specify the preset in the reconciliation loop.

> How would this actually happen? Is it really possible for a prebuilt workspace to be restarted with a different preset?

Any workspace, prebuilt or not can be restarted with new parameters as long as those parameters are mutable. The same goes for presets. It is only possible via API right now. Neither the UI nor the CLI support changing a preset.

This means it is theoretically possible for a user with sufficient privileges to send an API call to restart with a new preset, but it is definitely not supported.

// Log the error but continue with original results
c.logger.Error(ctx, "optimized GetRunningPrebuiltWorkspacesOptimized failed", slog.Error(err))
} else {
CompareGetRunningPrebuiltWorkspacesResults(ctx, c.logger, allRunningPrebuilds, optimized)
Member:

This is a fun way to test it out 😄

@SasSwart (Contributor) left a comment

You're going to have to remove the dbmem bit (🥳), but otherwise this looks good. See my comment about potentially further optimising your query at the cost of some robustness. That's non-blocking though, since we're just logging the diffs for now.

@ssncferreira (Contributor) left a comment

LGTM, let's try it 🤞

Comment on lines +5102 to +5105
_ = setupFixture(t, db, "stopped-prebuild", false, database.WorkspaceTransitionStop, database.ProvisionerJobStatusSucceeded)
_ = setupFixture(t, db, "failed-prebuild", false, database.WorkspaceTransitionStart, database.ProvisionerJobStatusFailed)
_ = setupFixture(t, db, "canceled-prebuild", false, database.WorkspaceTransitionStart, database.ProvisionerJobStatusCanceled)
_ = setupFixture(t, db, "deleted-prebuild", true, database.WorkspaceTransitionStart, database.ProvisionerJobStatusSucceeded)
Contributor:

Nice 🌟

@johnstcn merged commit 0367dba into main Jul 9, 2025
30 checks passed
@johnstcn deleted the cj/chore/GetRunningPrebuiltWorkspace-Optimize-2 branch July 9, 2025 10:30
@github-actions (bot) locked and limited conversation to collaborators Jul 9, 2025
Reviewers

@mafredri left review comments
@SasSwart approved these changes
Copilot left review comments
@ssncferreira approved these changes

Assignees

@johnstcn

Development

Successfully merging this pull request may close these issues.

bug: GetRunningPrebuiltWorkspaces creates lots of DB load
4 participants
@johnstcn @mafredri @ssncferreira @SasSwart