feat: fetch prebuilds metrics state in background #17792
Signed-off-by: Danny Kopping <dannykopping@gmail.com>
5da546e to e73dae6
Signed-off-by: Danny Kopping <dannykopping@gmail.com>
e73dae6 to fcbfb7f
@@ -55,20 +57,34 @@ var (
		labels,
		nil,
	)
	lastUpdateDesc = prometheus.NewDesc(
		"coderd_prebuilt_workspaces_metrics_last_updated",
		"The unix timestamp when the metrics related to prebuilt workspaces were last updated; these metrics are cached.",
Is unix timestamp easy to alert on? Like can you do something like `unix_now() - metric_value > 1000` or something in grafana and co? If not, it might be better if this was a duration since the last successful fetch instead.
+1 from me for duration since last successful fetch
The idiomatic approach is to use unix timestamps, see `prometheus_config_last_reload_success_timestamp_seconds`.
So I guess we have an existing metric for the coder server start timestamp?
dannykopping, May 13, 2025 (edited)
I don't think so (or at least not one we export), but I think as long as this metric is updated relative to itself and `up` is taken into consideration, it should be useful.
Signed-off-by: Danny Kopping <dannykopping@gmail.com>
b2a1de9 into main
`Collect()` is called whenever the `/metrics` endpoint is hit to retrieve metrics. The queries used in prebuilds metrics collection are quite heavy, and we want to avoid having them running concurrently / too often to keep db load down.

Here I'm moving towards a background retrieval of the state required to set the metrics, which gets invalidated every interval.

Also introduces `coderd_prebuilt_workspaces_metrics_last_updated` which operators can use to determine when these metrics go stale. See #17789 as well.