- Notifications
You must be signed in to change notification settings - Fork928
fix: fix TestPendingUpdatesMetric flaky assertion#14534
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.
Already on GitHub?Sign in to your account
Uh oh!
There was an error while loading.Please reload this page.
Conversation
This stack of pull requests is managed by Graphite.Learn more about stacking. Join@spikecurtis and the rest of your teammates on |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
LGTM, thanks for fixing this@spikecurtis!
There's a minor change regardingdbtime
to be made before merging.
func newNotifier(cfg codersdk.NotificationsConfig, id uuid.UUID, log slog.Logger, db Store, | ||
hr map[database.NotificationMethod]Handler, metrics *Metrics, clock quartz.Clock, | ||
) *notifier { | ||
tick := clock.NewTicker(cfg.FetchInterval.Value(), "notifier", "fetchInterval") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
I love these ticker tags ❤️
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
0747bb8
to3ae3fbb
Compare0eca1fc
intomainUh oh!
There was an error while loading.Please reload this page.
Merge activity
|
Uh oh!
There was an error while loading.Please reload this page.
Fixes flake seen here:
https://github.com/coder/coder/actions/runs/10677025332/job/29591518118
The original test waits for DB calls to update success and failure of notifications, and blocks on a
pause
channel. However, it uses a single pause channel for both DB calls, and so implicitly assumes that both success and failure updates occur during an update sync.However, the update sync timer isn't synchronized to anything, and so there is a race where the update sync only has either the success or the failure result but not both. This blocks the test, which waits for both, and deadlocks before unpausing.
It appears the unpause mechanism is there to test that the
PendingUpdates
metric updates accordingly.This fixes the flake by:
Manager
syncs updates.PendingUpdates
metric to reach 2, so that we know that both success and failure have been queued upThis has the nice property that a "pause" function is no longer required: we know the manager is in our desired state before we trigger the update, and can assert on the Metrics before and after.