feat(coderd): add provisioner_daemons to /debug/health endpoint #11393
1b2bc9a to 53ed901
53ed901 to b83013b
Do you understand why this keeps getting marked as changed?
I think it's the timestamp on the files changing; I haven't yet figured out a way to ignore this.
		continue
	}
	// Daemon has gone away, skip.
	if now.Sub(daemon.LastSeenAt.Time) > (opts.StaleInterval) {
Thinking out loud, is it possible that a daemon reports an error, but after that, the last seen isn't updated. Then after stale interval the error disappears, and perhaps nobody ever notices it?
Also, don't we want to apply the same rules to r.ProvisionerDaemons?
> is it possible that a daemon reports an error, but after that, the last seen isn't updated. Then after stale interval the error disappears, and perhaps nobody ever notices it?
Yes, and this is by design. If a provisioner daemon connects at some point, has a transient error, and then never heartbeats, I don't think it makes sense to warn about this.
That's fair. I was entertaining the possibility of some error state resulting in the provisioner exiting, crashing, or simply stopping communication with the server. But I didn't have anything concrete in mind, so this is fine.
	ProvisionerDaemons []codersdk.ProvisionerDaemon `json:"provisioner_daemons"`
}

type ProvisionerDaemonsReportDeps struct {
This feels more like Config instead of Deps, but I'll leave it up to you.
62d42e8 to 19e896a
@johnstcn Nice contribution! Could you please link it with the umbrella issue and fill in the PR description?
Adds a healthcheck for provisioner daemons to /debug/health endpoint.
Part of #10676