Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

feat: add workspace build timing metrics#15771

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Conversation

kevinh-canva
Copy link
Contributor

@kevinh-canvakevinh-canva commentedDec 6, 2024
edited
Loading

Context

We want to place a tight SLO around coder workspace build times, so we can detect regression. However, buffy GPU instances often take a much longer time to start/stop than general instance, which frequently triggered our SLO alerts, even though it's only because of a few (expected) slow GPU builds. This is caused by the metrics we are usingcoderd_provisionerd_job_timing_seconds not having a dimension for template name (as we have a separate template for GPU and another for general instances).

Looking closer at the code, this metrics is also not the correct one to use either, because a Job can actually be many different things, not just a workspace build.

Intent

This PR introduces a new prometheus metrics forworkspace_build_timing_seconds, which specifically reports workspace build times. To reduce cardinality, this metrics excludesworkspace_name andworkspace_owner that are present on theworkspace_builds_total metrics.

This'd allow us to have different (and tight) SLOs for each of our template (GPU vs non-GPU) by filtering on thetemplate_name (optionallytemplate_version tag) as well as the workspace transition (as we noticedstop is often slower thanstart, but users don't care a lot aboutstop transitions).

@cdr-botcdr-botbot added the communityPull Requests and issues created by the community. labelDec 6, 2024
@kevinh-canvakevinh-canva changed the titleAdd workspace build timing metricsfeat(metrics): Add workspace build timing metricsDec 6, 2024
@kevinh-canvakevinh-canva changed the titlefeat(metrics): Add workspace build timing metricsfeat(metrics): add workspace build timing metricsDec 6, 2024
@kevinh-canvakevinh-canva changed the titlefeat(metrics): add workspace build timing metricsfeat(runner): add workspace build timing metricsDec 6, 2024
@kevinh-canvakevinh-canva changed the titlefeat(runner): add workspace build timing metricsfeat: add workspace build timing metricsDec 6, 2024
@kevinh-canva
Copy link
ContributorAuthor

I have read the CLA Document and I hereby sign the CLA

@dannykoppingdannykopping self-requested a reviewDecember 6, 2024 08:37
Copy link
Contributor

@dannykoppingdannykopping left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Thanks for your contribution@kevinh-canva!

Let's see if we can find a solution to the potential cardinality explosion.

Copy link
Contributor

@dannykoppingdannykopping left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

LGTM

@kylecarbs could you please force-merge this PR?
The two failing CI jobs are both related to forks not being able to access secrets.

@dannykopping
Copy link
Contributor

dannykopping commentedDec 9, 2024
edited
Loading

I have read the CLA Document and I hereby sign the CLA

@kevinh-canva would you mind commenting this again? The CLA step should now be fixed.
Please also rebase onmain before doing so.

@kevinh-canva
Copy link
ContributorAuthor

I have read the CLA Document and I hereby sign the CLA

@kevinh-canva
Copy link
ContributorAuthor

@dannykopping Looks like it's still failing for me somehow

@dannykoppingdannykoppingenabled auto-merge (squash)December 11, 2024 05:16
deansheather added a commit to coder/cla that referenced this pull requestDec 11, 2024
@deansheather
Copy link
Member

Sorry about the issues with the CLA, we've been having some troubles lately with secrets in our GitHub actions workflows

dannykopping added a commit to coder/cla that referenced this pull requestDec 11, 2024
Manually adding this since our CLA bot is broken
@dannykoppingdannykopping merged commitc528791 intocoder:mainDec 11, 2024
29 checks passed
@github-actionsgithub-actionsbot locked and limited conversation to collaboratorsDec 11, 2024
Sign up for freeto subscribe to this conversation on GitHub. Already have an account?Sign in.
Reviewers

@dannykoppingdannykoppingdannykopping approved these changes

Assignees

@kevinh-canvakevinh-canva

Labels
communityPull Requests and issues created by the community.
Projects
None yet
Milestone
No milestone
Development

Successfully merging this pull request may close these issues.

4 participants
@kevinh-canva@dannykopping@deansheather@matifali

[8]ページ先頭

©2009-2025 Movatter.jp