NotificationsYou must be signed in to change notification settings
Fork905
Star10k

feat(agent/agentcontainers): recreate devcontainers concurrently#18042

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Jump to bottom

Merged

mafredri merged 2 commits intomainfrommafredri/feat-agent-agentcontainers-recreate-async

May 26, 2025

Merged

feat(agent/agentcontainers): recreate devcontainers concurrently#18042

mafredri merged 2 commits intomainfrommafredri/feat-agent-agentcontainers-recreate-async

May 26, 2025

+414 −157

Conversation

Copy link

Member

mafredri commentedMay 26, 2025

This change introduces a refactor of the devcontainers recreation logic
which is now handled asynchronously rather than being request scoped.
The response was consequently changed from "No Content" to "Accepted" to
reflect this.

A newStatus field was introduced to the devcontainer struct which
replacesRunning (bool). This reflects that the devcontainer can now
be in various states (starting, running, stopped or errored).

The status field also protects against multiple concurrent recrations,
as long as they are initiated via the API.

Updates#16424

github-actionsbot assignedmafredri

May 26, 2025

mafredri force-pushed themafredri/feat-agent-agentcontainers-recreate-async branch 4 times, most recently fromd17bca2 to0ebc262Compare

May 26, 2025 12:02

feat(agent/agentcontainers): recreate devcontainers concurrently

9328228

This change introduces a refactor of the devcontainers recreation logicwhich is now handled asynchronously rather than being request scoped.The response was consequently changed from "No Content" to "Accepted" toreflect this.A new `Status` field was introduced to the devcontainer struct whichreplaces `Running` (bool). This reflects that the devcontainer can nowbe in various states (starting, running, stopped or errored).The status field also protects against multiple concurrent recrations,as long as they are initiated via the API.Updates#16424

mafredri force-pushed themafredri/feat-agent-agentcontainers-recreate-async branch from0ebc262 to9328228Compare

May 26, 2025 12:02

mafredri requested a review fromCopilot

May 26, 2025 12:04

This comment was marked as resolved.

appease linter and standardize slog fields

78b0159

mafredri requested review fromjohnstcn andDanielleMaywood

May 26, 2025 12:24

johnstcn reviewed

May 26, 2025

View reviewed changes

coderd/workspaceagents.go

		// @Param workspaceagent path string true "Workspace agent ID" format(uuid)
		// @Param container path string true "Container ID or name"
		// @Success204
		// @Success202 {object} codersdk.Response

Copy link

Member

johnstcnMay 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Note: the response code change here does not appear to affect any endpoints currently in use by the FE.

Copy link

MemberAuthor

mafredriMay 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Correct, not yet.

codersdk/workspaceagents.goShow resolvedHide resolved

agent/agentcontainers/api_test.go

		err = json.NewDecoder(rec.Body).Decode(&resp)
		require.NoError(t, err, "unmarshal response failed after recreation")
		require.Len(t, resp.Devcontainers, 1, "expected one devcontainer in response after recreation")
		assert.Equal(t, codersdk.WorkspaceAgentDevcontainerStatusRunning, resp.Devcontainers[0].Status, "devcontainer is not stopped after recreation")

Copy link

Member

johnstcnMay 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

nit: this could also be a require?

agent/agentcontainers/api_test.go

		err = json.NewDecoder(rec.Body).Decode(&resp)
		require.NoError(t, err, "unmarshal response failed after error")
		require.Len(t, resp.Devcontainers, 1, "expected one devcontainer in response after error")
		assert.Equal(t, codersdk.WorkspaceAgentDevcontainerStatusError, resp.Devcontainers[0].Status, "devcontainer is not in an error state after up failure")

Copy link

Member

johnstcnMay 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

nit: this could also be a require?

Copy link

MemberAuthor

mafredriMay 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

To what benefit? Assert is useful in that it can allow multiple conditions to fail and give you a larger picture of what's wrong. Require is ofc needed whenever you need that condition to be true, like nil and lengths checks so that the rest of the code doesn't panic.

Copy link

Member

johnstcnMay 26, 2025•
edited
Loading

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

My nit is very much stylistic here; having anassert on the last line below one or morerequire doesn't help us against panics or any of the above you mentioned. Feel free to ignore!

agent/agentcontainers/api.go

		// The devcontainer state must be set to starting and the recreateWg must be
		// incremented before calling this function.
		func (api *API) recreateDevcontainer(dc codersdk.WorkspaceAgentDevcontainer, configPath string) {
		defer api.recreateWg.Done()

Copy link

Member

johnstcnMay 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I'm nervous that a hanging request torecreateDevcontainer could make closing the devcontainers API hang onapi.wg.Wait(). I wonder if it makes sense to have a 'graceful' shutdown context and a 'hard' shutdown context that will forcefully cancel all in-flight recreation waitgroups after a certain period of time elapses?