- Notifications
You must be signed in to change notification settings - Fork948
feat(site): use websocket connection for devcontainer updates#18808
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.
Already on GitHub?Sign in to your account
Uh oh!
There was an error while loading.Please reload this page.
Conversation
Instead of polling every 10 seconds, we instead use a WebSocketconnection for more timely updates.
This comment was marked as outdated.
This comment was marked as outdated.
Sorry, something went wrong.
Uh oh!
There was an error while loading.Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Pull Request Overview
Adds real-time devcontainer updates over WebSocket instead of polling, and updates related client and server code.
- Introduce
useAgentContainers
hook leveraging WebSockets in the frontend - Add WebSocket watch endpoints and SDK methods for container updates across site API, SDK, and agent code
- Replace polling in
AgentRow
with the new hook and update documentation and tests accordingly
Reviewed Changes
Copilot reviewed 14 out of 14 changed files in this pull request and generated no comments.
Show a summary per file
File | Description |
---|---|
site/src/modules/resources/useAgentContainers.ts | New React hook subscribing to container updates via WS |
site/src/modules/resources/useAgentContainers.test.tsx | Tests for the hook’s initial fetch behavior |
site/src/modules/resources/AgentRow.tsx | Switched from polling to WebSocket hook |
site/src/api/api.ts | ExportwatchAgentContainers for WebSocket route |
docs/reference/api/agents.md | Add API reference for the new watch endpoint |
codersdk/workspacesdk/agentconn.go | AddWatchContainers method to agent connection SDK |
codersdk/workspaceagents.go | AddWatchWorkspaceAgentContainers to SDK client |
coderd/workspaceagents.go | Implement WebSocket handler for watch endpoint |
coderd/coderd.go | Register/containers/watch route |
coderd/apidoc/swagger.json & docs.go | Document new watch route in Swagger and docs template |
agent/agentcontainers/api.go | AddwatchContainers WS handler in agent API |
Comments suppressed due to low confidence (5)
docs/reference/api/agents.md:908
- This endpoint upgrades to a WebSocket connection rather than a plain HTTP GET; consider updating the code sample to show a WebSocket client (e.g.,
wscat
or JS) for clarity.
curl -X GET http://coder-server:8080/api/v2/workspaceagents/{workspaceagent}/containers/watch \
site/src/modules/resources/useAgentContainers.test.tsx:85
- Tests currently cover only the initial fetch; add a test that mocks WebSocket messages and verifies that
useAgentContainers
updates when a message arrives.
});
site/src/modules/resources/AgentRow.tsx:75
- The
queryClient
variable is declared but never used; consider removing this import and assignment to reduce clutter.
const queryClient = useQueryClient();
agent/agentcontainers/api.go:585
- The
slices
package is referenced here but not imported, causing a compile error. Addimport "slices"
or reference the correct package.
api.updateChans = slices.DeleteFunc(api.updateChans, func(ch chan struct{}) bool {
site/src/modules/resources/useAgentContainers.ts:28
- Calling
invalidateQueries
aftersetQueryData
will trigger an extra network request on each WebSocket message. You can remove the invalidation to prevent unnecessary refetches.
await queryClient.invalidateQueries({ queryKey });
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Backend changes LGTM 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Pointed out a few gotchas but otherwise looks good, nice work!
agent/agentcontainers/api.go Outdated
updateCh := make(chan struct{}, 1) | ||
defer func() { | ||
api.mu.Lock() | ||
close(updateCh) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
It'd make sense to move this to the defer below to keep the related code close to each other.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Addressed in6ce5c19
agent/agentcontainers/api.go Outdated
return | ||
} | ||
ctx = api.ctx |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
We could do this here instead:
ctx,wsNetConn:=codersdk.WebsocketNetConn(ctx,conn,websocket.MessageBinary)deferwsNetConn.Close()
Note that we're not passingapi.ctx
toWebsocketNetConn
but that's OK because of the final select (see below).
Then a few important changes following:
- Heartbeat doesn't change (needs
Ping
fromconn
) - We can omit
conn.Close
(handled viawsNetConn.Close
) - Encoder uses
wsNetConn
- Final select has one entry for
api.ctx
and another forctx
This follows practice we use elsewhere in the code-base.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Is this possible?wsjson.NewEncoder
requires a*websocket.Conn
butwsNetConn
is anet.Conn
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
wsjson.NewEncoder
is a tiny wrapper, i'll just write the underlying logic usingwsNetConn
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Addressed in04a92a4
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Oh I see, I hadn't actually seen/usedwsjson.NewEncoder
before. TheCloseRead
it calls would be ideal to retain.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Have re-added this in 😄096a85e
agent/agentcontainers/api.go Outdated
@@ -583,8 +647,32 @@ func (api *API) updateContainers(ctx context.Context) error { | |||
api.mu.Lock() | |||
defer api.mu.Unlock() | |||
var knownDevcontainers map[string]codersdk.WorkspaceAgentDevcontainer |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Suggestion: Naming so nobody accidentally updates this map rather thanapi.knownDevcontainers
. PerhapsknownDevcontainersBackup
or something along those lines those lines.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Addressed incd0c2d5
coderd/workspaceagents.go Outdated
return | ||
} | ||
ctx = api.ctx |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
See related comments onapi.go
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Addressed in04a92a4
codersdk/workspacesdk/agentconn.go Outdated
deferres.Body.Close() | ||
} | ||
d:=wsjson.NewDecoder[codersdk.WorkspaceAgentListContainersResponse](conn,websocket.MessageText, slog.Logger{}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
What's the repercussion of passingslog.Logger
here, unstructured output on stdout?
If it's required, perhaps we should require it as aWatchContainers
parameter if we can't put it on theAgentConn
struct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
I completely forgot to update this when I was tinkering 🤦♀️ Will resolve this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Addressed in3e50965
socket.addEventListener("message", (event) => { | ||
if (event.parseError) { | ||
displayError( | ||
"Unable to process latest data from the server. Please try refreshing the page.", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Could this message be more specific? I.e. mention what operation failed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Possibly although I'm just following existing logic here. I'm not sure what context about the error exists to make this more useful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
I know that you are just following the existing logic, but I see this as an opportunity to do better with those messages, having in mind they are displayed to the users.
ThedisplayError
function accepts a message and a description, so to make this error better to the user I think we can make the message shorter and tell to the user what is the impact of it, instead of the techinical reason, and give it an way to possible solve the problem as you already put in the message.
displayError("Failed to update containers","Please try refreshing the page")
Wdyt?
socket.addEventListener("error", () => { | ||
displayError( | ||
"Unable to get workspace containers. Connection has been closed.", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Do we just give up here? Is there precedent for trying to reconnect later?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
For the frontend changes I was following logic I found elsewhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Same as my previous comment.
displayError("Failed to load containers","Please try refreshing the page")
d.WorkspaceFolder == other.WorkspaceFolder && | ||
d.Status == other.Status && | ||
d.Dirty == other.Dirty && | ||
d.Error == other.Error |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
We should also check container and agent, so something like this?
d.Error== other.Error | |
d.Error==other.Error&& | |
(d.Container==nil&&other.Container==nil|| (d.Container!=nil&&other.Continer!=nil&&d.Container.ID==other.Container.ID))&& | |
(d.Agent==nil&&...d.Agent.ID==... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Addressed in001ccda
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
I don't seed.Agent
checked? IMO we should propagate changes to it as well since that's how the frontend associates the sub agent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Oh silly me I only read half the sentence 🤦♀️
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Addressed again in971f9d6 😅
coderd/workspaceagents.go Outdated
// If the agent is unreachable, the request will hang. Assume that if we | ||
// don't get a response after 30s that the agent is unreachable. | ||
ctx, cancel := context.WithTimeout(ctx, 30*time.Second) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Suggestion: call thisdialCtx
to avoid re-use. I believe we will always have a disconnect after 30s currently?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Addressed in04a92a4
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
@DanielleMaywood I'm missing some screenshots in the PR description showing the errors being displayed since we don't have stories in Storybook for that. Could you please add that? 🙏
I left a few comments related to the error messages.
socket.addEventListener("message", (event) => { | ||
if (event.parseError) { | ||
displayError( | ||
"Unable to process latest data from the server. Please try refreshing the page.", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
I know that you are just following the existing logic, but I see this as an opportunity to do better with those messages, having in mind they are displayed to the users.
ThedisplayError
function accepts a message and a description, so to make this error better to the user I think we can make the message shorter and tell to the user what is the impact of it, instead of the techinical reason, and give it an way to possible solve the problem as you already put in the message.
displayError("Failed to update containers","Please try refreshing the page")
Wdyt?
socket.addEventListener("error", () => { | ||
displayError( | ||
"Unable to get workspace containers. Connection has been closed.", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Same as my previous comment.
displayError("Failed to load containers","Please try refreshing the page")
expect(result.current).toBeUndefined(); | ||
}); | ||
}); | ||
}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
I think it would be a plus to have tests to verify:
- Parsing error
- Socket error
I know how annoying it can be to have this setup working properly without causing any flakes, so I wanted to let you know in case you are curious about, but it is not a blocket at all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
I've tried to add those.
I just want to give a fair warning: These tests are mostly written by Claude Sonnet 4 as my frontend testing experience is lacking. Please let me know if it has made any obvious blunders 🙏
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Approving BE with minor suggestions, as per my earlier comment FE XHR might still need some love.
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
DanielleMaywood commentedJul 14, 2025 • edited
Loading Uh oh!
There was an error while loading.Please reload this page.
edited
Uh oh!
There was an error while loading.Please reload this page.
43b0bb7
intomainUh oh!
There was an error while loading.Please reload this page.
Instead of polling every 10 seconds, we instead use a WebSocket connection for more timely updates.