- Notifications
You must be signed in to change notification settings - Fork72
chore: add validation for module source URLs#406
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.
Already on GitHub?Sign in to your account
base:main
Are you sure you want to change the base?
Conversation
diff --git a/.github/workflows/ci.yaml b/.github/workflows/ci.yaml index53b912b..eb3cf8b 100644 --- a/.github/workflows/ci.yaml +++b/.github/workflows/ci.yaml @@ -63,8 +63,8 @@ jobs: - name: Set up Gouses: actions/setup-go@v5 with: - go-version: "1.23.2" - - name:Validate contributors + go-version: "1.25.0" + - name: Validate Reademderun: go build ./cmd/readmevalidation && ./readmevalidation - name:Remove build file artifact run: rm ./readmevalidationSigned-off-by: Muhammad Atif Ali <me@matifali.dev>
98efa49
to898e773
Compare- Downgrade Go version in CI to 1.24 for consistency.- Fix naming and path issues in `readmevalidation` code.- Improve regex validation for module and namespace names.- Correct typos and improve comments for clarity.
898e773
to3b6b1ba
Comparewith: | ||
go-version:"1.23.2" | ||
-name:Validatecontributors | ||
go-version:"1.24" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
I was having weird issues with running Golang CI locally, and had to match the Go version.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Tests generated by gemini CLI
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Assuming we still want to process all the blocks in a file, I think it'd be really good to have some test cases that include 2+ blocks
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
This is close. I think there's a couple of changes we can make the code a bit better, though
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
cmd/readmevalidation/codermodules.go Outdated
returnnamespace,moduleName,nil | ||
} | ||
funcvalidateModuleSourceURL(bodystring,filePathstring) []error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
I wish I knew more about Coder templates. Do templates have asource
field (or anything equivalent) that might need to be validated, too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
no its not applicable to templates
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
- Make regex more specific for registry.coder.com patterns only- Refactor to add namespace and resourceName fields to coderResourceReadme struct- Inline path parsing logic into parseCoderResourceReadme- Update validateModuleSourceURL to use struct fields instead of filePath parameter- Simplify Terraform block detection logic- Reduce nesting with early continue statements- Add comment explaining regex pattern- Extract registry.coder.com into a constant- Improve test readability with extracted variables- Remove redundant checks in tests- Replace custom contains function with strings.ContainsCo-authored-by: matifali <matifali@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
We're looking better than before, but I'm still a confused by where the code is at now, and whether the current logic is what we want
cmd/readmevalidation/codermodules.go Outdated
var ( | ||
// Matches Terraform source lines with registry.coder.com URLs | ||
// Pattern: source = "registry.coder.com/namespace/module/coder" | ||
terraformSourceRe=regexp.MustCompile(`^\s*source\s*=\s*"`+registryDomain+`/([^/]+)/([^/]+)/coder"`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Do we want to swap the[^/]+
patterns for something even more specific? Right now, they allow any non-slash character, but don't we want to make sure the URLs don't have certain characters like underscores?
Uh oh!
There was an error while loading.Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Assuming we still want to process all the blocks in a file, I think it'd be really good to have some test cases that include 2+ blocks
- Use more specific regex pattern [a-zA-Z0-9-]+ instead of [^/]+ for namespace/module names- Process all Terraform blocks instead of just the first one- Report correct source if found in any block, only report incorrect sources if no correct source exists- Add comprehensive test cases for multiple Terraform blocksCo-authored-by: matifali <10648092+matifali@users.noreply.github.com>
@Parkreiner I just want to poke you on this, to see if there is anything I can help unblock here. |
@DevelopmentCats Sorry, I had just gotten back from vacation last week, so this got a bit lost in the shuffle. I'll do an extra pass, and let you know |
It's all good. I know things have been crazy busy 😃 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
@DevelopmentCats I think I'm still wrapping my head around the code a little bit, but I feel pretty sure the validation logic is still off, since we're silently removing errors in some cases
ifstrings.HasPrefix(nextLine,"```tf") { | ||
isInsideTerraform=true | ||
continue | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
This is an edge case, but I feel like we want to handle cases where someone accidentally nests Terraform snippets inside each other like this:
```tf```tf
expectedSource:=registryDomain+"/"+rm.namespace+"/"+rm.resourceName+"/coder" | ||
trimmed:=strings.TrimSpace(rm.body) | ||
foundCorrectSource:=false |
ParkreinerOct 6, 2025 • edited
Loading Uh oh!
There was an error while loading.Please reload this page.
edited
Uh oh!
There was an error while loading.Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Edit: I think there's something we can salvage from this comment, but I realized in the second comment in the chain that it doesn't do everything we want
I don't know if this variable makes sense. With how it's set up right now, if we have at least one correct source in the README file, we'll automatically ignore all otherincorrect sources. And from a state modeling perspective, the variable also feels redundant, when a single slice should be able to give us all the info we need
What makes more sense to me is to:
- Have
incorrectSources
as the main validation state - Instead of having the
actualSource == expectedSource
check, only have aactualSource != expectedSource
check. If that triggers, push the incorrect source to theincorrectSources
slice. If things match, we'll do nothing, and let the code fall through to the rest of the loop - Once the main loop is done and we're done processing the lines, check if the slice is not empty. If it's not, return an error with a list of all sources that are incorrect
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Actually, now that I'm thinking over the code more, I think I understand what the old code was trying to do, and why my suggestion doesn't actually work
I'm not super up to speed on our Terraform conventions, but if a module has a name likemodule "blah"
, would it be expected that the source URL should always have"blah"
in it, too? If so, I think we'd need to check the name of each module block, and compare that against the source URL line
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
but if a module has a name like module "blah", would it be expected that the source URL should always have "blah" in it, too?
No it can be different. We just use the same name for convenience.
}) | ||
} | ||
funcTestValidateModuleSourceURL(t*testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
I think the test setup is pretty good. Just to account for the other comment I added, though, I feel like we should have one more test that validates what happens if you have a block with an incorrect body, and 2+ other blocks with correct bodies
var ( | ||
validModuleBody=`# Test Module | ||
`+"```tf\n"+`module "test-module" { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Nit: I'm just now realizing how the sample MD contents are formatted, and I think I'd prefer for each one to be a single raw string. All the+
signs are a bit hard to read in the GitHub UI, and this is a case where all the spacing and lines do matter
// If we found incorrect sources but no correct one, report the first incorrect source | ||
iflen(incorrectSources)>0 { | ||
errs=append(errs,xerrors.Errorf("incorrect source URL format: found %q, expected %q",incorrectSources[0],expectedSource)) | ||
returnerrs | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
I'm not sure why we wouldn't want to report all incorrect sources?
return []error{xerrors.Errorf("invalid module path format: %s",rm.filePath)} | ||
} | ||
expectedSource:=registryDomain+"/"+rm.namespace+"/"+rm.resourceName+"/coder" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Nit: since this variable isn't used for a while, we can move it down, closer to where it's used
Uh oh!
There was an error while loading.Please reload this page.
Would prevents issues like the one I fixed in#404
This change should have caught that error as