Problem Description
Coder agents sometimes fail to connect to the Coder server due to a variety of issues, including network restrictions (e.g., DNS issues, firewalls), missing permissions (e.g., CAP_NET_ADMIN), OS or architecture mismatches, and missing tools for downloading the agent binary. Currently, there’s limited guidance in the UI to help users diagnose and resolve these issues effectively, leading to delays in troubleshooting.
For example, failures in the agent bootstrap script can result in non-connecting agents without a clear indication of the root cause. When checking the workspace logs (e.g., with `docker logs <container name or container id>`), a typical DNS failure log might look like this:

    + trap waitonexit EXIT
    + mktemp -d -t coder.XXXXXX
    + BINARY_DIR=/tmp/coder.1uZgEp
    + BINARY_NAME=coder
    + BINARY_URL=https://coder.example.com:3000/bin/coder-linux-amd64
    + cd /tmp/coder.1uZgEp
    + :
    + status=
    + command -v curl
    + curl -fsSL --compressed http://coder.example.com:3000/bin/coder-linux-amd64 -o coder
    curl: (6) Could not resolve host: coder.example.com
    + status=6
    + echo error: failed to download coder agent
    + echo command returned: 6
    + echo Trying again in 30 seconds...
    + sleep 30
    error: failed to download coder agent
    command returned: 6
    Trying again in 30 seconds...

Desired Solution
Implement enhanced diagnostics and UI hints that provide actionable guidance to users based on the detected issue. By giving users specific suggestions directly in the UI, they can resolve connectivity issues faster and with less frustration. This includes:
Enhanced Error Logging and Diagnostics
- Log detailed error messages for each failure point, covering:
  - Network/DNS issues, with suggestions to verify DNS configuration or consult network administrators.
  - Download tool availability (e.g., `curl` or `wget`), with instructions on how to install the required tool.
  - OS/architecture mismatches, with a link to supported environments in the documentation.
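
As a rough sketch of the network/DNS diagnostics described above, the bootstrap script could translate curl's exit status into a categorized hint before it retries. The function name `diagnose_curl_status`, the category prefixes, and the hint wording are hypothetical and not part of the existing script; the exit codes themselves (6, 7, 22, 28, 35, 60) come from curl's documented exit-status list.

```shell
#!/bin/sh
# Hypothetical helper (not in the current bootstrap script): map a curl exit
# status to a "category: hint" line. The hint wording is illustrative only.
diagnose_curl_status() {
	case "$1" in
		0)  echo "ok: download succeeded" ;;
		6)  echo "dns: could not resolve the Coder server host; verify DNS configuration or consult your network administrator" ;;
		7)  echo "connect: could not connect to the Coder server; check firewalls and that the access URL is reachable from the workspace" ;;
		22) echo "http: the server returned an HTTP error (400 or above); check that the binary URL is correct" ;;
		28) echo "timeout: the download timed out; check for network restrictions between the workspace and the server" ;;
		35|60) echo "tls: TLS handshake or certificate verification failed; check the certificates trusted by the workspace" ;;
		*)  echo "unknown: curl exited with status $1" ;;
	esac
}
```

Called with the status from the log above, `diagnose_curl_status 6` prints a line starting with `dns:`, which a log viewer or the UI could match on instead of parsing raw curl output.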
UI Hints for Diagnosed Issues
- Network/DNS Issue: If a DNS or network error is detected, show a UI message like:
  “It appears there’s a DNS or firewall issue preventing the agent from connecting to the server. Learn more about network configuration.”
- Download Tool Missing: If the required download tools (`curl`, `wget`) are unavailable, show a hint like:
  “Required download tool not found. Please install either `curl` or `wget`.”
- Unsupported OS/Architecture: If OS or architecture compatibility issues arise, prompt users to check supported platforms:
  “This environment may be unsupported. Review supported OS and architectures.”
- Download Logs: If the agent doesn't connect, link to docs that show how to fetch agent logs outside of Coder.
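
The download-tool and platform hints above depend on the bootstrap script detecting those conditions in the first place. A minimal sketch of such preflight checks follows. All three function names are hypothetical, and the architecture table is an assumption for illustration: only `amd64` appears in this issue (in `coder-linux-amd64`).

```shell
#!/bin/sh
# Hypothetical preflight helpers; none of these exist in the current script.

# Print the first available download tool, or fail with a hint on stderr.
detect_download_tool() {
	for tool in curl wget; do
		if command -v "$tool" >/dev/null 2>&1; then
			echo "$tool"
			return 0
		fi
	done
	echo "error: required download tool not found; please install either curl or wget" >&2
	return 1
}

# Normalize a `uname -m` value to the arch suffix used in binary names.
# The mapping is an illustrative assumption, not a list of supported platforms.
normalize_arch() {
	case "$1" in
		x86_64|amd64)  echo "amd64" ;;
		aarch64|arm64) echo "arm64" ;;
		*) return 1 ;;
	esac
}

# Compare the host architecture with the one the agent binary targets.
# $1: expected arch (e.g. amd64); $2: machine name, defaults to `uname -m`.
check_arch() {
	expected="$1"
	machine="${2:-$(uname -m)}"
	actual="$(normalize_arch "$machine")" || {
		echo "error: unrecognized architecture: $machine" >&2
		return 1
	}
	if [ "$actual" != "$expected" ]; then
		echo "error: agent binary targets $expected but this host is $actual" >&2
		return 1
	fi
}
```

Running `detect_download_tool` and `check_arch amd64` before the download loop would let the script report a missing tool or a mismatched binary once, rather than retrying every 30 seconds with the same failure.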
Proposed Implementation
- Backend Logging: Improve diagnostic logging in the agent bootstrap script to provide clearer insights into why each specific failure occurs.
- UI Updates: Implement conditional pop-ups or error messages in the Coder UI that guide users based on diagnosed connectivity issues.
- Documentation Update: Expand documentation with a troubleshooting section that covers all major connectivity blockers, including example configurations.