- Notifications
You must be signed in to change notification settings - Fork927
chore: track terraform modules in telemetry#15450
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.
Already on GitHub?Sign in to your account
Uh oh!
There was an error while loading.Please reload this page.
Conversation
d406ef6
to67795fc
Comparewooo!!! |
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
provisioner/terraform/provision.go Outdated
if err != nil { | ||
return provisionersdk.PlanErrorf("get modules: %s", err) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
So, if we fail to get the list of modules for the build that weonly need for telemetry purposes, we fail to build?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Good catch, it'd probably be better to only log an error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
On second thought, after the discussion regarding using modules in places other than telemetry, it may make sense to return an error rather than just log. Otherwise any code using module info would have to take into account that it may be missing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Just so we're clear, this means that any failure to fetch the modules in the working directory will result in a failure of the whole plan stage.
Right now, as far as I can tell, the only scenario that will trigger this is a malformed modules file. Are there any other possible scenarios I'm missing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Pair review (blocking): as this is currently only for telemetry, we shouldnot fail to plan if anything goes wrong with extracting modules.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Are there any other possible scenarios I'm missing?
I don't think there are any other scenarios - barring disk read failures and such. In normal operation this should only happen if terraform changes how it stores modules on disk and we upgrade to that version.
CREATE TABLE workspace_modules ( | ||
id uuid NOT NULL, | ||
job_id uuid NOT NULL REFERENCES provisioner_jobs (id) ON DELETE CASCADE, | ||
transition workspace_transition NOT NULL, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Do we need to capture the module for all transitions? What's the value of this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Imirroredworkspace_resources
. I think for template import jobs you get resources that reference the same provisioner job but have different workspace transitions. My thinking was if you'd like to associate a specific resource from a template import job with its module, you'd need to know which transition it came from.
Uh oh!
There was an error while loading.Please reload this page.
@@ -0,0 +1,64 @@ | |||
package terraform |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
I'm curious about this.
We extract all other resources from the template by executingterraform graph
(seeprovisioner/terraform/executor.go
->graph()
) and then pass that toprovisioner/terraform/resources.go
->ConvertState()
. From there we retrieve all the template's resources, and they're then persisted.
Did this approach not work for modules?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
I couldn't find either module versions or their sources inside the state. This was the first approach I tried, and I switched to reading files only when I couldn't make it work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
This is the output from aterraform graph
run locally with a module specified:
$ terraform graphdigraph G { ..."module.jetbrains_gateway.coder_app.gateway" ->"coder_agent.main";"module.jetbrains_gateway.coder_app.gateway" ->"module.jetbrains_gateway.data.coder_parameter.jetbrains_ide";"module.jetbrains_gateway.coder_app.gateway" ->"module.jetbrains_gateway.data.coder_workspace.me";"module.jetbrains_gateway.coder_app.gateway" ->"module.jetbrains_gateway.data.coder_workspace_owner.me";"module.jetbrains_gateway.coder_app.gateway" ->"module.jetbrains_gateway.data.http.jetbrains_ide_versions";}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Yeah, the graph only displays the module names specified by the user in the template. It doesn’t show the source, likeregistry.coder.com/modules/jetbrains-gateway/coder
, or the version, like1.0.23
. That information is available in the files on disk.
Uh oh!
There was an error while loading.Please reload this page.
Thought: I can imagine a valid use-case for storing module data separate to telemetry -- showing in the UI which module a resource came from is probably useful from a template editor / admin perspective when trying to answer the question "Which module did this resource come from?" |
Uh oh!
There was an error while loading.Please reload this page.
@johnstcn I addressed your comments:
|
175c10e
tob72b8ef
CompareThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
I have a few more comments and issues I'd like to see clarified before merging this.
provisionerd/runner/runner.go Outdated
@@ -1033,6 +1038,7 @@ func (r *Runner) runWorkspaceBuild(ctx context.Context) (*proto.CompletedJob, *p | |||
State: applyComplete.State, | |||
Resources: applyComplete.Resources, | |||
Timings: applyComplete.Timings, | |||
Modules: planComplete.Modules, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Why are we using modules from the plan response here? Will these be different from the apply response?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
The module files on disk are created byterraform init
.init
is called only byplan
here, so modules at the time of apply should never be different from those at the time of plan. I admit this is highly non-obvious. I'll add a comment about it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Thanks, a comment would definitely help. From someone else's perspective, this simply looks like a typo.
provisioner/terraform/resources.go Outdated
// the database, a null value in WorkspaceResource's ModulePath | ||
// indicates "this resource was created before module paths | ||
// were tracked." | ||
modulePath = "FAILED_TO_PARSE_TERRAFORM_ADDRESS" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Could we extract this to a constant we can refer to elsewhere?
EDIT: it might actually be even better as a sentinel errorErrInvalidTerraformAddr
Files: map[string]string{ | ||
"main.tf": `module "hello" { | ||
source = "./module" | ||
}`, | ||
"module/module.tf": ` | ||
resource "null_resource" "example" {} | ||
module "there" { | ||
source = "./inner_module" | ||
} | ||
`, | ||
"module/inner_module/inner_module.tf": ` | ||
resource "null_resource" "inner_example" {} | ||
`, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Can we add a test with a malformed or otherwise invalid module?
Uh oh!
There was an error while loading.Please reload this page.
provisioner/terraform/provision.go Outdated
if err != nil { | ||
return provisionersdk.PlanErrorf("get modules: %s", err) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Just so we're clear, this means that any failure to fetch the modules in the working directory will result in a failure of the whole plan stage.
Right now, as far as I can tell, the only scenario that will trigger this is a malformed modules file. Are there any other possible scenarios I'm missing?
Uh oh!
There was an error while loading.Please reload this page.
err = InsertWorkspaceModule(ctx, s.Database, jobID, transition, module, telemetrySnapshot) | ||
if err != nil { | ||
return nil, xerrors.Errorf("insert module: %w", err) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
nit: you can combine assignment and conditional here:
err=InsertWorkspaceModule(ctx,s.Database,jobID,transition,module,telemetrySnapshot) | |
iferr!=nil { | |
returnnil,xerrors.Errorf("insert module: %w",err) | |
} | |
iferr:=InsertWorkspaceModule(ctx,s.Database,jobID,transition,module,telemetrySnapshot);err!=nil { | |
returnnil,xerrors.Errorf("insert module: %w",err) | |
} |
Uh oh!
There was an error while loading.Please reload this page.
@@ -2666,6 +2666,20 @@ func (q *querier) GetWorkspaceByWorkspaceAppID(ctx context.Context, workspaceApp | |||
return fetch(q.log, q.auth, q.db.GetWorkspaceByWorkspaceAppID)(ctx, workspaceAppID) | |||
} | |||
func (q *querier) GetWorkspaceModulesByJobID(ctx context.Context, jobID uuid.UUID) ([]database.WorkspaceModule, error) { | |||
if err := q.authorizeContext(ctx, policy.ActionRead, rbac.ResourceSystem); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
note:rbac.ResourceSystem
is fine just for telemetry, but we'll probably want to create a separate RBAC resource in future for some of the other use-cases here. Can you create a follow-up issue for this and add a comment referencing it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Requesting changes based on pair review with@dannykopping.
The major worry we have is that there is thepossibility that a failure to read the modules on-disk will result in a failure toPlan()
. While a malformed or otherwise problematicmodules.json
willprobably cause Terraform to bail, we shouldn't.
provisioner/terraform/provision.go Outdated
if err != nil { | ||
return provisionersdk.PlanErrorf("get modules: %s", err) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Pair review (blocking): as this is currently only for telemetry, we shouldnot fail to plan if anything goes wrong with extracting modules.
coderd/telemetry/telemetry_test.go Outdated
@@ -119,6 +123,28 @@ func TestTelemetry(t *testing.T) { | |||
require.Len(t, snapshot.Users, 1) | |||
require.Equal(t, snapshot.Users[0].EmailHashed, "bb44bf07cf9a2db0554bba63a03d822c927deae77df101874496df5a6a3e896d@coder.com") | |||
}) | |||
t.Run("HashedModule", func(t *testing.T) { | |||
t.Parallel() | |||
db := dbmem.New() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Can we useddbtestutil.NewDB
here instead? Also, less work for you later when dbmem goes away ;-)
coderd/telemetry/telemetry.go Outdated
@@ -661,6 +676,26 @@ func ConvertWorkspaceResourceMetadata(metadata database.WorkspaceResourceMetadat | |||
} | |||
} | |||
func ConvertWorkspaceModule(module database.WorkspaceModule) WorkspaceModule { | |||
isCoderModule := strings.Contains(module.Source, "registry.coder.com") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
We should extract this to a top-level function. If we have multiple registries in the future we may need to add to thisexclusion inclusion list.
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
job_id = $1; | ||
-- name: GetWorkspaceModulesCreatedAfter :many | ||
SELECT * FROM workspace_modules WHERE created_at > $1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
non-blocking: If we're going to be querying oncreated_at
, do we need an index?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Won't hurt, I'll add it. 😄
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
b72b8ef
to780730f
Compare@johnstcn@dannykopping I made changes based on your feedback. List of changesEach point references the source comment.
|
writer := tar.NewWriter(&buffer) | ||
addedDirs := make(map[string]bool) | ||
for name, content := range files { | ||
// Add parent directories if they don't exist | ||
dir := filepath.Dir(name) | ||
if dir != "." && !addedDirs[dir] { | ||
err := writer.WriteHeader(&tar.Header{ | ||
Name: dir + "/", // Directory names must end with / | ||
Mode: 0o755, | ||
Typeflag: tar.TypeDir, | ||
}) | ||
require.NoError(t, err) | ||
addedDirs[dir] = true | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
I found many instances of this function sprinkled around our codebase so I added a helper function for this in#15540
I'll wait for this PR to be merged and then update these tests to usetestutil.CreateTar
👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Thanks for working on this!
aa0dc2d
intomainUh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
Addresseshttps://github.com/coder/nexus/issues/35.
This PR:
workspace_modules
table to track modules used by the Terraform provisioner in provisioner jobs.module_path
column to theworkspace_resources
table, allowing to identify which module a resource originates from.For the person reviewing this PR, do not fret about the 1,500 new lines - ~1,000 of them are auto-generated.