Description
Is there an existing issue for this?
- I have searched the existing issues
Current Behavior
We have started seeing the issue below since moving to v2.20.0.
TL;DR: the UI shows the agent as unhealthy after around 30 seconds, but goes back to normal after refreshing the page. Details below.
Relevant Log Output
Logs attached.
Expected Behavior
No response
Steps to Reproduce
Setting CODER_DERP_FORCE_WEBSOCKETS to "true" reduces some of the errors in the agent logs, but it does not fix the unhealthy status shown in the UI.
The issue seems to happen only once: after the error appears and goes away, it does not show up again, at least not within the next 10 minutes or so.
This is the difference in the workspace metadata for the agent when comparing healthy and non-healthy:
Coder Agent and UI version: 2.20.0
What Happens:
When we create a workspace, it is provisioned quite quickly, and all the action buttons appear and work, so the workspace can be accessed easily.
But then, after about 20 seconds, a message appears in the UI saying "1 agent is unhealthy", and the action buttons disappear.
An important note: if you quickly refresh the page or duplicate the tab, everything looks good again (as in the first screenshot), while the original browser tab still shows the unhealthy error.
I did some digging, using the API to compare the workspace's agent health status with its logs, and at some point the agent does seem to be unhealthy for a very short period of time. Two remarks:
- After the agent becomes healthy again, the UI still shows it as unhealthy.
- At the moment the agent becomes unhealthy, there are no new logs in the agent. Please check the image below, where the left-hand side shows a constant poll of the workspace API endpoint, and the right-hand side streams the logs from the agent container:
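For anyone who wants to reproduce the API-side check without the screenshot, here is a minimal sketch of the kind of polling described above. It is not the exact script used for this report. The endpoint path (`/api/v2/workspaces/{id}`), the `Coder-Session-Token` header, and the JSON field names (`latest_build.resources[].agents[].health.healthy`) are assumptions based on my reading of the Coder REST API and should be checked against the docs for your version. The function names are hypothetical.

```python
import json
import time
import urllib.request
from typing import Dict, Iterable, Iterator, Tuple


def agent_health(workspace: dict) -> Dict[str, bool]:
    """Map agent name -> healthy flag from one workspace JSON payload.

    Assumed layout: latest_build.resources[].agents[].health.healthy
    """
    result: Dict[str, bool] = {}
    build = workspace.get("latest_build") or {}
    for resource in build.get("resources") or []:
        for agent in resource.get("agents") or []:
            health = agent.get("health") or {}
            result[agent.get("name", "?")] = bool(health.get("healthy", False))
    return result


def health_transitions(
    snapshots: Iterable[dict],
) -> Iterator[Tuple[int, str, bool, bool]]:
    """Yield (snapshot_index, agent_name, old, new) whenever an agent's
    healthy flag differs from the previous snapshot."""
    previous: Dict[str, bool] = {}
    for index, snapshot in enumerate(snapshots):
        current = agent_health(snapshot)
        for name, healthy in current.items():
            if name in previous and previous[name] != healthy:
                yield (index, name, previous[name], healthy)
        previous = current


def poll_workspace(
    base_url: str, workspace_id: str, token: str,
    interval: float = 1.0, count: int = 60,
) -> Iterator[dict]:
    """Fetch the workspace `count` times, sleeping `interval` seconds between
    requests. Endpoint and header name are assumptions, see above."""
    url = f"{base_url}/api/v2/workspaces/{workspace_id}"
    for _ in range(count):
        request = urllib.request.Request(
            url, headers={"Coder-Session-Token": token}
        )
        with urllib.request.urlopen(request) as response:
            yield json.load(response)
        time.sleep(interval)
```

Feeding `poll_workspace(...)` into `health_transitions(...)` and printing each tuple with a timestamp makes the short unhealthy window visible as one healthy-to-unhealthy transition followed by one back to healthy, which can then be lined up against the agent log stream.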
Environment
- Host OS: Linux
- Coder version: v2.20.0
Additional Context
The issue is new (previously worked fine)