Unhealthy Agent in Stopped Container - shut down script failed#18840


I can't quite seem to figure this one out. I have a workspace running inside a Docker container on an EC2 instance. The EC2 instance persists; it doesn't terminate, it just stops and starts. When I move to a stopped state everything works as expected, but in the UI I get an orange dot saying the agent is unhealthy, even though the workspace has stopped. I had assumed the agent inside the container isn't exiting gracefully, maybe? I can't quite get it to look right in the UI.

When I go back into the instance after restarting from a stopped state, the container logs from when the workspace started stopping look like this:

```
2025-07-12 04:51:15.095 [debu]  net.tailnet.net.wgengine: wg: [v2] [2NeP2] - Receiving keepalive packet
2025-07-12 04:51:19.353 [info]  stdlib: [ERR] yamux: Failed to read header: failed to get reader: failed to read frame header: EOF
2025-07-12 04:51:19.353 [debu]  failed to read from protocol  error="context canceled"
2025-07-12 04:51:19.353 [debu]  net.tailnet: setAllPeersLost marked peer lost  peer_id=b9465b15-3c85-4b71-b4a6-a80532cc0a36  key_id=[2NeP2]
2025-07-12 04:51:19.353 [debu]  net.tailnet: peer lost timeout  peer_id=b9465b15-3c85-4b71-b4a6-a80532cc0a36
2025-07-12 04:51:19.354 [debu]  net.tailnet: timeout triggered for peer but it had handshake in meantime  peer_id=b9465b15-3c85-4b71-b4a6-a80532cc0a36  key_id=[2NeP2]
2025-07-12 04:51:19.354 [debu]  disconnected from derp map RPC
2025-07-12 04:51:19.354 [debu]  routine exited  name="derp map subscriber" ...
    error= recv DERPMap error:
        github.com/coder/coder/v2/agent.(*agent).runDERPMapSubscriber
            /home/runner/work/coder/coder/agent/agent.go:1683
      - context canceled
2025-07-12 04:51:19.354 [debu]  log sender send loop exiting
2025-07-12 04:51:19.354 [debu]  swallowing context canceled  name="send logs"
2025-07-12 04:51:19.354 [debu]  disconnected from coordination RPC
2025-07-12 04:51:19.354 [debu]  swallowing context canceled  name=coordination
2025-07-12 04:51:19.354 [debu]  swallowing context canceled  name="report lifecycle"
2025-07-12 04:51:19.354 [debu]  swallowing context canceled  name="report connections"
2025-07-12 04:51:19.354 [debu]  reportLoop exiting
2025-07-12 04:51:19.354 [debu]  routine exited  name="stats report loop"  error=<nil>
2025-07-12 04:51:19.354 [debu]  swallowing context canceled  name="fetch service banner loop"
2025-07-12 04:51:19.354 [debu]  swallowing context canceled  name="report metadata"
2025-07-12 04:51:19.354 [debu]  routine exited  name="app health reporter"  error=<nil>
2025-07-12 04:51:19.354 [info]  connection manager errored ...
    error= error in routine derp map subscriber:
        github.com/coder/coder/v2/agent.(*apiConnRoutineManager).startTailnetAPI.func1
            /home/runner/work/coder/coder/agent/agent.go:2161
      - recv DERPMap error:
        github.com/coder/coder/v2/agent.(*agent).runDERPMapSubscriber
            /home/runner/work/coder/coder/agent/agent.go:1683
      - context canceled
2025-07-12 04:51:19.354 [warn]  run exited with error ...
    error= error in routine derp map subscriber:
        github.com/coder/coder/v2/agent.(*apiConnRoutineManager).startTailnetAPI.func1
            /home/runner/work/coder/coder/agent/agent.go:2161
      - recv DERPMap error:
        github.com/coder/coder/v2/agent.(*agent).runDERPMapSubscriber
            /home/runner/work/coder/coder/agent/agent.go:1683
      - context canceled
```

followed by this a bunch of times:

```
2025-07-12 04:51:19.517 [info]  connecting to coderd
2025-07-12 04:51:19.531 [warn]  run exited with error ...
    error= GET https://coder.redacted.com/api/v2/workspaceagents/me/rpc?version=2.6: unexpected status code 401: Workspace agent not authorized.: Try logging in using 'coder login'.
        Error: The agent cannot authenticate until the workspace provision job has been completed. If the job is no longer running, this agent is invalid.
```

Then I get this:

```
2025-07-12 04:52:39.342 [info]  agent shutting down  error="context canceled"
2025-07-12 04:52:39.342 [info]  shutting down agent
2025-07-12 04:52:39.342 [debu]  set lifecycle state  current={"state":"shutting_down","changed_at":"2025-07-12T04:52:39.342832Z"}  last={"state":"created","changed_at":"0001-01-01T00:00:00Z"}
2025-07-12 04:52:39.343 [debu]  ssh-server: closing server
2025-07-12 04:52:39.343 [debu]  ssh-server: closing all active listeners  count=0
2025-07-12 04:52:39.343 [debu]  ssh-server: closing all active sessions  count=0
2025-07-12 04:52:39.343 [debu]  ssh-server: closing all active connections  count=0
2025-07-12 04:52:39.343 [debu]  ssh-server: closing SSH server
2025-07-12 04:52:39.343 [debu]  ssh-server: waiting for all goroutines to exit
2025-07-12 04:52:39.343 [debu]  ssh-server: closing server done
2025-07-12 04:52:39.343 [warn]  shutdown script(s) failed ...
    error= execute: not initialized:
        github.com/coder/coder/v2/agent/agentscripts.(*Runner).Execute.func1
            /home/runner/work/coder/coder/agent/agentscripts/agentscripts.go:208
2025-07-12 04:52:39.343 [debu]  set lifecycle state  current={"state":"shutdown_error","changed_at":"2025-07-12T04:52:39.343295Z"}  last={"state":"shutting_down","changed_at":"2025-07-12T04:52:39.342832Z"}
2025-07-12 04:52:39.344 [debu]  containers: closing API
2025-07-12 04:52:39.344 [debu]  containers: closed API
2025-07-12 04:52:39.694 [info]  connecting to coderd
2025-07-12 04:52:39.699 [warn]  run exited with error ...
```

That's then followed by more 401 errors from the agent trying to reconnect. So I guess it tries to exit, fails, and because it keeps pinging until the last moment, the UI goes orange because it never gets a shutdown notification?


Replies: 1 comment

Never mind... figured it out. Sorry, new to Coder. I needed to change the Terraform code so it destroys the agent on stop. Nothing to do with the agent logs...
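For anyone hitting the same thing, here is a minimal sketch of that pattern, assuming a template built on the `coder` and `docker` Terraform providers (the resource names and image are hypothetical, not the actual template from this thread). Gating the agent and its container on the workspace's `start_count` means a stop transition destroys the agent rather than leaving it registered but unreachable:

```hcl
data "coder_workspace" "me" {}

# start_count is 1 while the workspace is started and 0 when it is
# stopped, so count = start_count destroys these resources on stop.
resource "coder_agent" "main" {
  count = data.coder_workspace.me.start_count
  arch  = "amd64"
  os    = "linux"
}

resource "docker_container" "workspace" {
  count = data.coder_workspace.me.start_count
  name  = "coder-${data.coder_workspace.me.name}"
  image = "registry.example.com/workspace:latest" # hypothetical image
  env   = ["CODER_AGENT_TOKEN=${coder_agent.main[0].token}"]
}
```

With this layout the agent resource is gone after a stop, so coderd has nothing to report as unhealthy, and a fresh agent is provisioned on the next start.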

1 participant: @jbnitorum
