Open
Description
Is there an existing issue for this?
- I have searched the existing issues
Current Behavior
One of our users has connection issues with JetBrains Gateway. He can work fine for a while, but at random moments all SSH connections to his workspace drop. Even the web terminal stops working.
I think the coder agent in the workspace exited.
JetBrains Gateway fails with the message: the workspace "xyz" does not have an agent with ID "abc"

I found the following connection-reporting errors in /tmp/coder-agent.log
(addresses have been changed)
Relevant Log Output
2025-10-07 08:38:23.121 [info] ssh-server: started serving ssh connection remote_addr=[0000:0000:0000:0000:0000:0000:0000:0007]:62018 local_addr=[0000:0000:0000:0000:0000:0000:0000:0002]:1 listen_addr={}
2025-10-07 08:38:23.197 [info] ssh-server: handling ssh session remote_addr=[0000:0000:0000:0000:0000:0000:0000:0007]:62018 local_addr=[0000:0000:0000:0000:0000:0000:0000:0002]:1 id=d047e00d-b820-412f-a4ab-925a16173557
2025-10-07 08:38:23.198 [debu] reporting connection payload="connection:{id:\"\\x12\\x34\\x56^\\x78=D\\x90\\x12\\x13\\xb3ȁjGT\" action:CONNECT type:JETBRAINS timestamp:{seconds:1759822985 nanos:540587383} ip:\"localhost\"}"
2025-10-07 08:38:23.208 [debu] routine exited name="report connections" ...
    error= failed to report connection:
        github.com/coder/coder/v2/agent.(*agent).reportConnectionsLoop
            /home/runner/work/coder/coder/agent/agent.go:793
      - export connection log: 1 error occurred:
        * pq: null value in column "ip" of relation "connection_logs" violates not-null constraint
        storj.io/drpc/drpcwire.UnmarshalError:26
        storj.io/drpc/drpcstream.(*Stream).HandlePacket:224
        storj.io/drpc/drpcmanager.(*Manager).manageReader:247
2025-10-07 08:38:23.209 [debu] reportLoop exiting
2025-10-07 08:38:23.209 [debu] routine exited name="stats report loop" error=<nil>
2025-10-07 08:38:23.209 [debu] sent disconnect
2025-10-07 08:38:23.209 [debu] swallowing context canceled name="report lifecycle"
2025-10-07 08:38:23.209 [debu] routine exited name="app health reporter" error=<nil>
2025-10-07 08:38:23.209 [debu] log sender send loop exiting
2025-10-07 08:38:23.209 [debu] swallowing context canceled name="report metadata"
2025-10-07 08:38:23.209 [debu] swallowing context canceled name="fetch service banner loop"
2025-10-07 08:38:23.209 [debu] swallowing context canceled name="send logs"
2025-10-07 08:38:23.209 [debu] disconnected from derp map RPC
2025-10-07 08:38:23.209 [debu] swallowing context canceled name="derp map subscriber"
2025-10-07 08:38:23.211 [debu] failed to read from protocol error=EOF
2025-10-07 08:38:23.211 [debu] net.tailnet: setAllPeersLost marked peer lost peer_id=375ea62d-7689-4194-a3b2-63b94b1bf3a4 key_id=[9LPVE]
2025-10-07 08:38:23.211 [debu] net.tailnet: setAllPeersLost marked peer lost peer_id=32843b5c-d0e8-4bbf-af43-1db9c7bdddb6 key_id=[HjhSH]
2025-10-07 08:38:23.211 [debu] net.tailnet: setAllPeersLost marked peer lost peer_id=177dd712-c147-4e2a-80f6-aee5834bc1ad key_id=[eMQVk]
2025-10-07 08:38:23.211 [debu] net.tailnet: setAllPeersLost marked peer lost peer_id=550d89ca-f589-463b-aae1-537bc8a2a356 key_id=[LBF/Z]
2025-10-07 08:38:23.211 [debu] net.tailnet: setAllPeersLost marked peer lost peer_id=ae503e94-53dd-4b17-9356-6a14a5a3e1f9 key_id=[Z5XuH]
2025-10-07 08:38:23.211 [debu] net.tailnet: peer lost timeout peer_id=ae503e94-53dd-4b17-9356-6a14a5a3e1f9
2025-10-07 08:38:23.211 [debu] net.tailnet: setAllPeersLost marked peer lost peer_id=d164092d-ede3-4184-a9b6-d58141126c48 key_id=[utirz]
2025-10-07 08:38:23.211 [debu] responses closed after disconnect
2025-10-07 08:38:23.211 [debu] disconnected from coordination RPC
2025-10-07 08:38:23.211 [debu] routine exited name=coordination error=<nil>
2025-10-07 08:38:23.211 [debu] net.tailnet: timeout triggered for peer but it had handshake in meantime peer_id=ae503e94-53dd-4b17-9356-6a14a5a3e1f9 key_id=[Z5XuH]
2025-10-07 08:38:23.211 [info] connection manager errored ...
    error= error in routine report connections:
        github.com/coder/coder/v2/agent.(*apiConnRoutineManager).startAgentAPI.func1
            /home/runner/work/coder/coder/agent/agent.go:2119
      - failed to report connection:
        github.com/coder/coder/v2/agent.(*agent).reportConnectionsLoop
            /home/runner/work/coder/coder/agent/agent.go:793
      - export connection log: 1 error occurred:
        * pq: null value in column "ip" of relation "connection_logs" violates not-null constraint
        storj.io/drpc/drpcwire.UnmarshalError:26
        storj.io/drpc/drpcstream.(*Stream).HandlePacket:224
        storj.io/drpc/drpcmanager.(*Manager).manageReader:247
2025-10-07 08:38:23.211 [info] stdlib: [ERR] yamux: Failed to read header: failed to get reader: context canceled
2025-10-07 08:38:23.211 [warn] run exited with error ...
    error= error in routine report connections:
        github.com/coder/coder/v2/agent.(*apiConnRoutineManager).startAgentAPI.func1
            /home/runner/work/coder/coder/agent/agent.go:2119
      - failed to report connection:
        github.com/coder/coder/v2/agent.(*agent).reportConnectionsLoop
            /home/runner/work/coder/coder/agent/agent.go:793
      - export connection log: 1 error occurred:
        * pq: null value in column "ip" of relation "connection_logs" violates not-null constraint
        storj.io/drpc/drpcwire.UnmarshalError:26
        storj.io/drpc/drpcstream.(*Stream).HandlePacket:224
        storj.io/drpc/drpcmanager.(*Manager).manageReader:247
2025-10-07 08:38:23.636 [info] connecting to coderd
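A possible explanation, not confirmed against the Coder source: the reported payload carries ip:"localhost", which is a hostname rather than an IP literal. If the server parses that field as an IP address before writing to connection_logs, the parse would fail and could leave the "ip" column null. The minimal Go sketch below only demonstrates that parsing behavior with the standard library; parseReportedIP is a hypothetical helper and is not taken from Coder's code.

```go
package main

import (
	"fmt"
	"net/netip"
)

// parseReportedIP is a hypothetical helper (not part of Coder).
// It reports whether s is a valid IP literal and returns the parsed address.
func parseReportedIP(s string) (netip.Addr, bool) {
	addr, err := netip.ParseAddr(s)
	if err != nil {
		// Hostnames such as "localhost" are not IP literals and end up here.
		return netip.Addr{}, false
	}
	return addr, true
}

func main() {
	// "localhost" is the value seen in the agent log; the others are for comparison.
	for _, s := range []string{"localhost", "127.0.0.1", "::1"} {
		addr, ok := parseReportedIP(s)
		fmt.Printf("%q -> valid=%v addr=%v\n", s, ok, addr)
	}
}
```

If this assumption holds, resolving or substituting a loopback address before the value is stored would avoid the null, but that needs checking against the actual server code.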
Expected Behavior
Stable SSH connections. No errors when reporting JetBrains connections.
Steps to Reproduce
Not sure if this is reproducible:
- Start the workspace
- Connect with the JetBrains Gateway Coder button (coder module)
- Wait for random SSH disconnects
Environment
- User PC OS: Windows 11
- Workspace OS: Ubuntu 24.04.3 LTS on GCloud VM
- Coder version: v2.26.0, and after update to v2.26.1
Additional Context
- I have tested this on the latest version
- The issue is new (previously worked fine)