
Add performance tests #1868


Draft

alexeyr-ci2 wants to merge 38 commits into master from alexeyr/performance-tests
Conversation

@alexeyr-ci2 (Collaborator) commented Oct 20, 2025 · edited by justin808

Summary

Add performance tests. Fixes #1777.

Pull Request checklist

  • [ ] Add/update test to cover these changes
  • [ ] Update documentation
  • [ ] Update CHANGELOG file


@coderabbitai (Contributor)

coderabbitai bot commented Oct 20, 2025 · edited

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status option to false in the CodeRabbit configuration file.


@alexeyr-ci2 changed the title from "Alexeyr/performance tests" to "Add performance tests" on Oct 20, 2025
@alexeyr-ci force-pushed the alexeyr/performance-tests branch from 980a424 to 4a01be4 on October 20, 2025 19:01
@alexeyr-ci2 force-pushed the alexeyr/performance-tests branch 5 times, most recently from ec26e73 to a616cd1 on November 5, 2025 11:54
@claude

claude bot commented Nov 5, 2025

Code Review - PR #1868: Add Performance Tests

Thanks for this contribution! This is a solid implementation of performance testing infrastructure.


✅ Strengths

  1. Comprehensive Tool Coverage: Using three industry-standard tools (Fortio, Vegeta, k6) provides good cross-validation
  2. Well-Structured Workflow: Clear separation of concerns with logical step grouping
  3. Good Error Handling: Proper use of set -euo pipefail and validation steps
  4. Caching Strategy: Smart use of GitHub Actions cache for benchmark tool binaries
  5. Security Conscious: SSH access properly gated with warnings and actor-only access
  6. Flexible Configuration: Parameterized inputs allow customization

🐛 Critical Issues

  1. Node.js Version Mismatch (benchmark.yml:155)

  2. Missing Trailing Newlines (CRITICAL per CLAUDE.md)

    • Files spec/dummy/bin/prod and spec/dummy/bin/prod-assets missing final newlines
    • These will cause CI failures; run rake autofix to fix them
  3. Race Condition in Server Startup (benchmark.yml:220)

    • Background server might not be fully initialized when curl check succeeds
    • Recommend adding health check endpoint or verifying multiple successful responses
  4. Incomplete PR Checklist

    • Documentation and CHANGELOG boxes unchecked
    • Per CLAUDE.md: Should update CHANGELOG for user-visible features
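For issue 3, the "multiple successful responses" idea could be sketched as a bash helper like the one below (the function name, counts, and URL are illustrative, not taken from the PR):

```shell
# wait_until_ready CHECK_CMD NEEDED MAX_TRIES
# Succeeds (returns 0) once CHECK_CMD has passed NEEDED times in a row.
wait_until_ready() {
  local check_cmd="$1" needed="$2" max_tries="$3" ok=0 i
  for ((i = 0; i < max_tries; i++)); do
    if $check_cmd; then
      ok=$((ok + 1))
      if ((ok >= needed)); then return 0; fi
    else
      ok=0  # reset on any failure: we want consecutive successes
    fi
    sleep 0.2
  done
  return 1
}

# In the workflow this might be invoked as (URL illustrative):
#   wait_until_ready "curl -sf -o /dev/null http://localhost:3000/" 3 60
```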

⚠️ Performance & Quality Issues

  1. bench.sh:66 - Unrealistic iteration count assumes 10 iterations/sec per VU
  2. Fortio Optimization Missing - Referenced guide not implemented
  3. No Resource Monitoring - Missing CPU/memory metrics
  4. Missing Test Validation - No thresholds to detect performance regressions
  5. Single Route Testing - Only tests server_side_hello_world_hooks
  6. Complex Validation Logic - Lines 17-27 hard to read
  7. Inconsistent Quoting - Variables not consistently quoted
  8. Hard-coded Timeouts - 30s and 60s timeouts not configurable

📚 Documentation Issues

  1. Missing Documentation - No README for running benchmarks locally
  2. Workflow Triggers - Runs on ALL PRs (expensive, should use labels/manual only)
  3. No Historical Tracking - Results not compared over time

🎯 Before Merging

  1. Fix Node.js version (22 → 20)
  2. Add trailing newlines
  3. Run bundle exec rubocop
  4. Update CHANGELOG.md
  5. Add performance thresholds
  6. Update workflow triggers

📊 Summary

Assessment: Good foundation but needs critical fixes before merge

Effort: Critical issues ~30 mins, High priority ~1-2 hours

Main concerns: Node.js version mismatch and missing trailing newlines will cause CI failures.

Let me know if you need help with any of these!

@alexeyr-ci2 force-pushed the alexeyr/performance-tests branch 5 times, most recently from 040c0a5 to ad56e14 on November 5, 2025 15:49
@claude

claude bot commented Nov 5, 2025

PR Review: Add Performance Tests (#1868)

Summary

This PR adds comprehensive performance testing infrastructure using three industry-standard benchmarking tools (Fortio, Vegeta, and k6). It is a well-structured implementation addressing issue #1777.

Critical Issues

1. Missing Trailing Newlines

CRITICAL per CLAUDE.md: Files MUST end with a newline character or CI will fail:

  • spec/dummy/bin/prod (line 4)
  • spec/dummy/bin/prod-assets (line 9)
  • spec/performance/bench.sh (line 202)

2. RuboCop Required

CRITICAL per CLAUDE.md: Must run bundle exec rubocop and fix ALL violations before commit.

Potential Bugs

K6 JSON Parsing (bench.sh:187-192)

The jq expression uses add, which returns null on an empty array and breaks the downstream arithmetic. Use add // 0 for safety.
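The empty-array behavior is easy to confirm with jq directly:

```shell
# `add` on an empty array yields null, which breaks later arithmetic;
# `add // 0` substitutes a numeric default.
echo '[]'      | jq 'add'        # null
echo '[]'      | jq 'add // 0'   # 0
echo '[1,2,3]' | jq 'add // 0'   # 6
```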

Server Startup Race (benchmark.yml:254)

Uses exit 0 inside the loop, which exits the entire step. Should use break instead.
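A minimal sketch of the corrected pattern: break only leaves the polling loop, so the rest of the step still runs (the health-check condition below is a stand-in, not the PR's actual curl call).

```shell
# Poll until ready, then fall through to the remaining steps.
ready=false
for i in 1 2 3 4 5; do
  if [ "$i" -ge 3 ]; then  # stand-in for "curl health check succeeded"
    ready=true
    break                  # `break` leaves the loop; `exit 0` would end the whole step
  fi
  sleep 0.1
done

if ! $ready; then
  echo "server never became ready" >&2
  exit 1
fi
echo "continuing with benchmark steps"
```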

npm vs yarn (benchmark.yml:200)

Uses npm install --global yalc, but CLAUDE.md requires yarn-only commands.

Code Quality Strengths

  1. Excellent tool selection - three benchmarking tools for cross-validation
  2. Well-structured workflow with clear sections
  3. Good error handling with set -euo pipefail
  4. Comprehensive parameterization via workflow inputs
  5. Smart binary caching for faster CI
  6. Security-conscious SSH access gating

Suggestions

  1. Shell portability: Replace bc with awk for float comparisons
  2. Hardcoded route: Make server_side_hello_world_hooks configurable
  3. Duration: 30s may be too short, consider 60s for stable measurements
  4. Add documentation for local benchmark execution
  5. Consider performance regression detection vs master
  6. Add spec/performance/README.md with interpretation guide

Test Coverage

Good: Infrastructure in place, runs on PR/push, results preserved as artifacts
Missing: No bash script tests, no performance thresholds, no local testing docs

Verdict

Solid PR with high code quality. Performance testing infrastructure will be valuable.

Required Before Merge:

  1. Fix trailing newlines (will cause CI failure)
  2. Run bundle exec rubocop and fix violations
  3. Fix K6 empty array bug

Recommended:

  1. Add basic documentation
  2. Verify workflow passes on CI
  3. Consider CHANGELOG entry

Great work overall!

@alexeyr-ci2 force-pushed the alexeyr/performance-tests branch 2 times, most recently from e350a34 to 7b9d621 on November 6, 2025 10:25
@claude

claude bot commented Nov 6, 2025

PR Review: Add Performance Tests

Thank you for this comprehensive performance testing implementation! This is a well-structured addition to the repository. Here's my detailed feedback:


Strengths

  1. Comprehensive Tool Coverage: Great choice using three different load testing tools (Fortio, Vegeta, k6) to get diverse perspectives on performance
  2. Excellent Error Handling: The bash script has robust parameter validation and error handling throughout
  3. Smart Caching: GitHub Actions workflow efficiently caches binaries to speed up future runs
  4. Good Documentation: Clear comments, parameter descriptions, and helpful echo statements
  5. Configurable Parameters: Extensive workflow_dispatch inputs allow flexible testing scenarios
  6. Production-Ready Setup: Proper Puma configuration with workers and preloading for production benchmarks

🐛 Potential Issues

1. Script Portability - bc Dependency (bench.sh:21,33)

The script uses bc for floating-point comparisons but bc is not installed in the workflow.

Fix: Add bc to the apt package installation step in the workflow, or use bash-native arithmetic for integer comparisons.
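One possible awk-based replacement for the bc float comparison (variable names and threshold values are illustrative):

```shell
# awk exits 0 when the comparison holds, so it can drive `if` directly.
P99="1.87"
THRESHOLD="2.0"
if awk -v a="$P99" -v b="$THRESHOLD" 'BEGIN { exit !(a < b) }'; then
  echo "p99 within threshold"
else
  echo "p99 exceeds threshold"
fi
```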

2. Race Condition in Server Startup (benchmark.yml:248)

The server process is backgrounded but there is no guarantee it has started before the polling loop begins. If the server fails to start immediately, the loop might miss error messages.

Suggestion: Add a brief sleep 2 before the polling loop, or capture the PID and check if the process is still running during the timeout loop.
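The PID-checking suggestion could be sketched as a helper like this (the function name and the curl command are illustrative, not from the PR):

```shell
# wait_for_server PID CHECK_CMD MAX_TRIES
# Fails immediately if the server process dies; otherwise polls CHECK_CMD.
wait_for_server() {
  local pid="$1" check_cmd="$2" max_tries="$3" i
  for ((i = 0; i < max_tries; i++)); do
    if ! kill -0 "$pid" 2>/dev/null; then
      echo "server process $pid died during startup" >&2
      return 1
    fi
    $check_cmd && return 0
    sleep 0.2
  done
  return 1
}

# Workflow usage (illustrative):
#   bin/prod & SERVER_PID=$!
#   sleep 2   # let the process boot or crash before polling
#   wait_for_server "$SERVER_PID" "curl -sf -o /dev/null http://localhost:3000/" 60
```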

3. k6 Status Reporting Edge Case (bench.sh:199-204)

If .root_group.checks is empty, add returns null, causing K6_REQS_OTHER calculation to fail.

Fix: Add null handling by changing the jq expression to use | add // 0

4. Missing Newline at End of Files

Per CLAUDE.md requirements: ALWAYS ensure files end with a newline character

Files needing newlines:

  • spec/dummy/bin/prod (line 4)
  • spec/dummy/bin/prod-assets (line 9)
  • spec/performance/bench.sh (line 214)

Fix: Run rake autofix or add trailing newlines manually.

5. Benchmark Workflow Runs on Every PR (benchmark.yml:57-58)

Running 30-second benchmarks with 3 tools on every PR will consume significant CI minutes and slow down PR feedback.

Suggestion:

  • Remove pull_request trigger and only keep workflow_dispatch + push to master
  • Or add a condition to only run on PRs with a specific label (e.g., performance)
  • Or significantly reduce default duration (e.g., 10s) for PR runs

🔍 Code Quality & Best Practices

Security

  • ✅ SSH access is properly gated behind workflow_dispatch input with clear warnings
  • ✅ limit-access-to-actor: true prevents unauthorized access
  • ⚠️ Consider adding a comment warning maintainers never to merge code while SSH session is active

Performance

  • ✅ Binary caching will significantly speed up repeated runs
  • ✅ Server warm-up phase (10 requests) is good practice
  • ⚠️ REQUEST_TIMEOUT=60s is very generous - consider 30s default to catch timeout issues faster

Maintainability

  • ✅ Clear section comments in YAML
  • ✅ Parameterized configuration
  • ⚠️ Consider extracting the jq parsing logic (lines 167-209) into a separate function for readability

📊 Test Coverage

Missing:

  • ❌ No automated tests for the bench.sh script itself
  • ❌ No validation that the summary.txt format is correct
  • ❌ No performance regression detection (just data collection)

Suggestions:

  1. Add a unit test for bench.sh parameter validation
  2. Consider storing benchmark results over time and comparing against baselines
  3. Add a step to post benchmark results as a PR comment for visibility

🔧 Configuration Issues

Puma Configuration (spec/dummy/config/puma.rb:38)

Workflow sets default to 4 workers (WEB_CONCURRENCY: 4), but Puma defaults to 2 if not set. The workflow default should match.

Consistency check needed: Verify that WEB_CONCURRENCY=4 is intentional for CI runners with 2 CPU cores.


📝 Documentation & Changelog

Per CLAUDE.md: Update CHANGELOG.md for user-visible changes only (features, bug fixes, breaking changes, deprecations, performance improvements)

This PR qualifies as it adds a performance testing feature. Needs:

  • ✅ Changelog entry format: PR 1868 by alexeyr-ci2: Add performance benchmarking workflow with Fortio, Vegeta, and k6.

Missing:

  • ❌ No README/docs update explaining how to run benchmarks locally
  • ❌ No documentation on interpreting benchmark results

🎯 Recommendations

Before Merge (Critical):

  1. ✅ Add trailing newlines to all new files
  2. ✅ Run bundle exec rubocop and fix any violations
  3. ✅ Add CHANGELOG entry
  4. ✅ Fix k6 null handling edge case
  5. ✅ Reconsider pull_request trigger or reduce duration

Nice to Have:
6. Add bc to dependencies or remove floating-point support
7. Add brief README section on running benchmarks
8. Handle server startup race condition
9. Consider performance regression detection


🎓 Overall Assessment

This is a solid, production-ready implementation of performance testing. The code quality is high, with excellent error handling and configurability. The main concerns are:

  1. CI cost - running on every PR may be expensive
  2. Missing documentation - users won't know how to use this
  3. File formatting - needs trailing newlines per repo standards

Recommendation: Request changes for the critical items (newlines, changelog, PR trigger), then approve.


Score: 8/10 - Great implementation, minor issues to address.

@alexeyr-ci2 force-pushed the alexeyr/performance-tests branch from 7b9d621 to f2b9ef8 on November 7, 2025 13:13
@claude

claude bot commented Nov 7, 2025

Pull Request Review: Add Performance Tests

Thank you for this comprehensive performance testing implementation! This is a valuable addition to the project. Below is my detailed review:


Strengths

  1. Multi-tool approach: Using Fortio, Vegeta, and K6 provides cross-validation of results and flexibility for different testing scenarios.

  2. Well-structured workflow: The GitHub Actions workflow is logically organized with clear steps and helpful comments.

  3. Configurable parameters: Extensive configurability via workflow inputs allows fine-tuning for different performance scenarios.

  4. Good error handling: The bench.sh script includes comprehensive input validation and error messages.

  5. Caching: Binary caching for tools reduces workflow execution time.

  6. Production mode testing: Tests against production-like configuration (Puma workers, precompiled assets) for realistic results.


🔍 Issues and Concerns

Critical Issues

  1. Missing file newlines (spec/dummy/bin/prod, spec/dummy/bin/prod-assets)

    • Location: spec/dummy/bin/prod:4, spec/dummy/bin/prod-assets:9
    • Impact: CI will fail per CLAUDE.md requirements
    • Fix: Ensure both files end with a newline character
    • Reference: CLAUDE.md states "ALWAYS ensure files end with a newline character"
  2. RuboCop not run

    • Impact: MANDATORY requirement before commits
    • Required action: Run bundle exec rubocop and fix ALL violations
    • Reference: CLAUDE.md: "BEFORE EVERY COMMIT/PUSH: ALWAYS run bundle exec rubocop and fix ALL violations"

High Priority Issues

  1. Workflow runs on every PR/push to master (benchmark.yml:55-58)

    • Impact: Performance tests are resource-intensive and will slow down CI significantly
    • Recommendation: Only run on manual workflow_dispatch, specific labels (e.g., run-benchmarks), or scheduled runs (e.g., nightly)
  2. Server process not properly cleaned up (benchmark.yml:248)

    • Issue: Server started in background with bin/prod & but never explicitly killed
    • Impact: May leave orphaned processes, especially if workflow fails
    • Fix: Add cleanup step or use a trap/signal handler
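A trap registered right after backgrounding the server covers every exit path, including failures under set -e. In this self-contained sketch, sleep 30 stands in for the real bin/prod command:

```shell
# Register an EXIT trap immediately after backgrounding the server so it is
# killed even when a later benchmark step fails.
sleep 30 &                      # stand-in for `bin/prod &`
SERVER_PID=$!
trap 'kill "$SERVER_PID" 2>/dev/null || true' EXIT

# ... benchmark steps run here; the trap fires on success, failure, or signal.
```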

Medium Priority Issues

  1. Hardcoded route in bench.sh (line 6)

    • Issue: The default route (ROUTE defaults to server_side_hello_world_hooks) may not exist or may not be representative
    • Recommendation: Verify this route exists in the dummy app, consider testing multiple routes, and document what this route tests
  2. No baseline comparison

    • Issue: Results are collected but not compared against previous runs
    • Enhancement: Consider using benchmark-action to track performance over time
  3. Missing documentation

    • Issue: No README or documentation for how to run benchmarks locally, interpret results, or understand what "good" performance looks like
    • Recommendation: Add spec/performance/README.md

🔒 Security Considerations

  1. SSH access via tmate (benchmark.yml:108-113)
    • Status: Properly gated behind workflow input (default: false)
    • Good: Includes security warnings and limit-access-to-actor

🧪 Test Coverage

Missing:

  • No unit tests for the bash script logic
  • No tests verifying the benchmark workflow succeeds
  • No validation that summary output format is correct

Recommendations:

  • Add a dry-run mode to bench.sh for testing
  • Consider shellcheck for bash script validation

📝 Suggested Changes

Must Fix (Blocks Merge)

  • Add trailing newlines to spec/dummy/bin/prod and spec/dummy/bin/prod-assets
  • Run bundle exec rubocop and fix all violations
  • Change workflow triggers to avoid running on every PR/push

Should Fix (High Priority)

  • Add server cleanup/shutdown in workflow
  • Verify the benchmark route exists and is representative
  • Add basic documentation (spec/performance/README.md)

Nice to Have

  • Add baseline comparison/tracking
  • Add shellcheck to CI for bash scripts
  • Consider shorter default duration for PR checks
  • Extract K6 test to separate file

Changelog

Per CLAUDE.md, this should be added to CHANGELOG.md as it's a new feature.


🎯 Overall Assessment

This is a solid foundation for performance testing with good multi-tool coverage and configurability. The main blockers are CLAUDE.md compliance issues (trailing newlines, RuboCop). Once those are fixed and the workflow triggers are adjusted to avoid running on every PR, this will be ready to merge.

Estimated effort to address critical issues: ~30 minutes

Great work overall! 🚀

@alexeyr-ci2 force-pushed the alexeyr/performance-tests branch from c58f480 to 43f8858 on November 7, 2025 13:36
@claude

claude bot commented Nov 7, 2025

Code Review: Add Performance Tests

Thank you for this comprehensive performance testing addition! This is a well-structured PR that addresses issue #1777.


✅ Strengths

1. Excellent Tool Coverage

  • Multiple benchmarking tools (Fortio, Vegeta, k6) provide good cross-validation
  • Smart caching strategy for tool binaries reduces CI time
  • Flexible configuration via environment variables

2. Robust Error Handling

  • Good use of set -euo pipefail in bash scripts
  • Comprehensive input validation with clear error messages
  • Server health checks with timeout protection

3. Production-Ready Configuration

  • Proper Puma clustering setup for production benchmarks
  • Preloading and worker configuration aligned with best practices
  • Appropriate separation of prod/dev configurations

🔒 Security Concerns - CRITICAL

SSH Access Feature (lines 96-119 in .github/workflows/benchmark.yml)

This poses significant security risks:

Issues:

  1. Detached mode risk: Using detached: true leaves the SSH session open indefinitely
  2. Repository exposure: The entire repository is accessible via SSH
  3. No audit trail: Limited visibility into commands executed during SSH sessions
  4. Secrets exposure risk: If secrets are added later, they could be exposed

Recommendations:

  • Consider removing this feature entirely for production workflows
  • If kept: Remove detached: true, add timeout-minutes: 15, and restrict to non-master branches
  • Document that NO secrets should ever be added to this workflow

🐛 Potential Bugs

1. Division by Zero Risk (spec/performance/bench.sh:195-197)

If K6_REQS_KNOWN_STATUS is null/empty, this could fail. Add validation.

2. Race Condition in Server Startup (spec/dummy/bin/prod)

The rails s command starts but there is no guarantee assets are fully loaded. Consider adding a readiness check that validates asset serving.

3. Missing Error Context

When benchmarks fail, the error message does not indicate which tool failed. Consider wrapping each tool execution with proper error context.

4. Hardcoded Paths (spec/performance/bench.sh:87)

If run from wrong directory, this could create directories in unexpected locations. Consider using SCRIPT_DIR to make paths relative to the script location.
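The standard idiom is to anchor paths to the script's own directory (the results directory name below is illustrative):

```shell
# Resolve the script's directory so output paths do not depend on the
# caller's working directory.
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]:-$0}")" && pwd)"
RESULTS_DIR="${SCRIPT_DIR}/results"   # illustrative output directory
mkdir -p "$RESULTS_DIR"
echo "writing results to $RESULTS_DIR"
```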


⚡ Performance Considerations

  1. Workflow runs on every push/PR: For a 20-minute benchmark suite, consider only running on workflow_dispatch and push to master, or add path filters to skip documentation changes.

  2. Sequential warmup (spec/performance/bench.sh:73-78): Consider parallel warmup for efficiency.

  3. Triple tool execution: Running all 3 tools provides validation but triples execution time. Consider defaulting to k6 only.


🧪 Test Coverage - Missing

  1. Unit tests for benchmark script: Input validation and JSON parsing logic should be tested
  2. Workflow validation: Consider adding actionlint to CI
  3. Integration test: Should verify benchmark artifacts are generated correctly

📝 Documentation - REQUIRED

Missing per CLAUDE.md requirements:

  1. CHANGELOG.md update - This is a user-visible feature that developers will use. Action needed: Update CHANGELOG.md

  2. Inline documentation: spec/performance/bench.sh needs header comment explaining:

    • Purpose
    • Usage examples
    • Required dependencies
    • Expected output format
  3. Workflow documentation: Add comment block in .github/workflows/benchmark.yml explaining when to run benchmarks and how to interpret results


🎨 Code Quality Suggestions

Good Practices:

  • Consistent error handling
  • Input validation with clear messages
  • Proper use of environment variables with defaults
  • JSON output for machine-readable results

Suggestions:

  1. YAML Anchors: Lines 7-14 define anchors but only use them once - consider simplifying
  2. DRY Principle: Lines 165-208 have very similar blocks for each tool - consider refactoring
  3. Magic Numbers: Several hardcoded values should be configurable (TIMEOUT_SEC=60, warmup count, sleep durations)
  4. Puma Configuration: Default of 2 workers may be too low - document why this was chosen
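For the magic-numbers point, the hard-coded values could become environment overrides with defaults via parameter expansion (variable names are illustrative):

```shell
# Callers override these via the environment; otherwise the previous
# hard-coded values remain the defaults.
TIMEOUT_SEC="${TIMEOUT_SEC:-60}"
WARMUP_REQUESTS="${WARMUP_REQUESTS:-10}"
STARTUP_SLEEP="${STARTUP_SLEEP:-2}"
echo "timeout=${TIMEOUT_SEC}s warmup=${WARMUP_REQUESTS} startup-sleep=${STARTUP_SLEEP}s"
```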

📋 Action Items Summary

🔴 High Priority (Before Merge):

  1. Address SSH security concerns - Remove or significantly restrict
  2. Add CHANGELOG.md entry (required per project guidelines)
  3. Fix potential null reference in K6 status parsing
  4. Add workflow path filters to prevent unnecessary runs

🟡 Medium Priority (Before Merge):

  1. Add header documentation to bench.sh
  2. Add error context to benchmark failures
  3. Fix hardcoded paths in bash script

🟢 Low Priority (Future Enhancement):

  1. Add unit tests for benchmark script
  2. Refactor duplicate code in parsing logic
  3. Make magic numbers configurable

✨ Overall Assessment

This is a solid foundation for performance testing! The multi-tool approach and comprehensive workflow setup show good engineering practices.

Main concerns:

  1. Security: SSH feature needs significant changes or removal
  2. Documentation: Missing CHANGELOG and inline docs (required per CLAUDE.md)
  3. Robustness: A few edge cases in error handling

Recommendation: Request changes for security and documentation issues, then approve after those are addressed.

Great work on this comprehensive testing infrastructure!

@alexeyr-ci2 force-pushed the alexeyr/performance-tests branch 8 times, most recently from be4af9e to 564b8b6 on November 7, 2025 20:20