fix: reduce excessive logging when database is unreachable #17363
Signed-off-by: Danny Kopping <dannykopping@gmail.com>
provisionerd/provisionerd.go Outdated
- p.acquireAndRunOne(client)
+ err := p.acquireAndRunOne(client)
+ if err != nil && ctx.Err() == nil { // Only log if context is not done.
+ 	p.opts.Logger.Debug(ctx, "retrying to acquire job", slog.F("retry_in_ms", retrier.Delay.Milliseconds()), slog.Error(err))
Self-review: `acquireAndRunOne` already logs its own warning (specifically, the `provisionerd was unable to acquire job` one is logged when the db is unreachable), so `Debug` is what felt most appropriate to me.
…ailnet control protocol dialer
Signed-off-by: Danny Kopping <dannykopping@gmail.com>
coderd/workspaceagents_test.go Outdated
// This needs to be done *after* the server "starts" otherwise it'll fail straight away when trying to initialize.
pdb.MarkUnhealthy()
// Then: the tailnet controller will continually try to dial the coordination endpoint, exceeding its context timeout.
This comment is wrong: we don't continually retry, because `DialAgent` only waits until we hit a dial error. Once the first error is returned, the test is complete and we tear down the context.
Furthermore, I don't think the SDK `DialAgent` is really the thing that you care about testing here. It doesn't handle the retries anyway; `tailnet` does. Maybe simplify this and just use the `WebsocketDialer` and ensure it returns an error.
This has a downside of losing the details of the received error, but in this case it seems justified since we need to conditionalize responses based on `codersdk.ErrDatabaseNotReachable`
Signed-off-by: Danny Kopping <dannykopping@gmail.com>
Merged commit 0b18e45 into main
/cherry-pick release/2.21
/cherry-pick release/2.20
Fixes #17045