ci: Create issue to fix master if it fails for more than 24h #22672
MartinZikmund wants to merge 3 commits into master
Pull request overview
This PR enhances the CI master health check script to automatically create GitHub issues when the master branch has been failing for more than 24 hours. The change aims to improve visibility and tracking of prolonged CI failures on the master branch.
Changes:
- Modified the PowerShell script to create GitHub issues with detailed failure information (commits, timestamps, pipeline links)
- Added duplicate detection to prevent multiple issues for the same failure window
- Updated the Azure DevOps YAML to pass GitHub authentication token and repository information
- Changed error handling to be non-blocking (always exits with 0)
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 10 comments.
| File | Description |
|---|---|
| build/ci/setup/.azure-devops-master-health.yml | Added GitHubRepo parameter, ForceFailure flag, and GH_TOKEN environment variable to enable GitHub issue creation |
| build/ci/scripts/check-master-status.ps1 | Enhanced script to create GitHub issues when master fails for >24h, with duplicate detection and detailed failure reporting |
```powershell
# Do not block the PR even if the script fails
exit 0
```
Copilot AI, Feb 17, 2026
The error handling at the end of the script now exits with code 0 (success) even when the script fails with an exception. This means that if the script encounters an unexpected error (network issues, API failures, etc.), the CI pipeline will continue as if everything succeeded, potentially allowing PRs to merge when the master health check couldn't be performed.
While the comment mentions not blocking PRs, this could hide genuine issues. Consider logging a warning or using a different approach to distinguish between "master is healthy" and "couldn't check master health".
```diff
-# Do not block the PR even if the script fails
-exit 0
+# Use a non-zero exit code to indicate that master health could not be checked
+exit 2
```
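One way to separate "master is healthy" from "the check could not run" without failing the job is to surface a pipeline warning. The sketch below is an assumption, not part of the PR: `Format-PipelineWarning` is a hypothetical helper, and it relies on the Azure Pipelines `##vso[task.logissue]` logging command, which the agent reads from standard output.

```powershell
# Hypothetical helper (not in the PR): builds an Azure Pipelines logging command
# that records a warning on the run without changing the script's exit code.
function Format-PipelineWarning {
    param([Parameter(Mandatory)][string]$Message)
    # Logging commands are single-line, so collapse any line breaks in the message.
    $singleLine = $Message -replace '\r?\n', ' '
    return "##vso[task.logissue type=warning]$singleLine"
}

# Sketch of how the catch block could use it:
# catch {
#     Write-Host (Format-PipelineWarning "Could not check master health: $($_.Exception.Message)")
#     exit 0
# }
```

With this approach the exit code stays 0, so PRs are not blocked, but the run shows a visible warning when the health check itself failed.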
```diff
 param(
 	[string]$Project,
-	[string]$DefinitionId
+	[string]$DefinitionId,
+	[string]$GitHubRepo = "unoplatform/uno",
+	[switch]$ForceFailure
 )

 $url = "$($env:SYSTEM_COLLECTIONURI)$Project/_apis/build/builds?definitions=$DefinitionId&branchName=refs/heads/master&statusFilter=completed&queryOrder=finishTimeDescending&api-version=6.0"

 try {
 	$response = Invoke-RestMethod -Uri $url -Method Get
 	$builds = $response.value

 	if (-not $builds -or $builds.Count -eq 0) {
 		Write-Host "No builds found on master."
 		exit 0
 	}

 	function Test-BuildSucceeded($build) {
 		if ($build.result -eq "succeeded" -or $build.result -eq "partiallySucceeded") {
 			return $true
 		}

 		$publishIssues = $build.issues |
 			Where-Object { $_.type -eq "error" -and $_.message -match "Publish to nuget\.org" }

 		if ($publishIssues.Count -gt 0) {
 			Write-Host "Build $($build.buildNumber) failed only due to 'Publish to nuget.org'. Treating as succeeded."
 			return $true
 		}

 		return $false
 	}

 	$latestBuild = $builds[0]
-	if (Test-BuildSucceeded $latestBuild) {
+	if (-not $ForceFailure -and (Test-BuildSucceeded $latestBuild)) {
 		Write-Host "Latest master build $($latestBuild.buildNumber) succeeded."
 		exit 0
 	}

+	if ($ForceFailure) {
+		Write-Host "ForceFailure is set. Simulating a master failure scenario."
+	}
+
 	Write-Host "Latest master build $($latestBuild.buildNumber) failed."

 	$lastSuccess = $builds |
 		Where-Object { Test-BuildSucceeded $_ } |
 		Select-Object -First 1

 	if ($null -eq $lastSuccess) {
 		Write-Host "No successful builds found in history."
 		exit 0
 	}

 	$lastSuccessTime = [DateTime]$lastSuccess.finishTime
 	$now = [DateTime]::UtcNow
 	$diff = $now - $lastSuccessTime

 	Write-Host "Last successful build was $($lastSuccess.buildNumber) at $lastSuccessTime ($([Math]::Round($diff.TotalHours, 2)) hours ago)."

-	if ($diff.TotalHours -gt 24) {
-		Write-Host "Master has been failing for more than 24 hours. Last success was at $lastSuccessTime. Blocking PR merge."
-		exit 1
-	}
-
-	Write-Host "Master is failing, but within the 24h grace period."
-	exit 0
+	if ($ForceFailure -or $diff.TotalHours -gt 24) {
+		Write-Host "Master has been failing for more than 24 hours. Last success was at $lastSuccessTime."
+
+		# Find the first failing build after the last successful one
+		$lastSuccessIndex = [Array]::IndexOf($builds, $lastSuccess)
+		$firstFailure = $builds[$lastSuccessIndex - 1]
+
+		$lastSuccessCommit = $lastSuccess.sourceVersion
+		$firstFailureCommit = $firstFailure.sourceVersion
+		$lastSuccessCommitShort = $lastSuccessCommit.Substring(0, 7)
+		$firstFailureCommitShort = $firstFailureCommit.Substring(0, 7)
+		$hoursDown = [Math]::Round($diff.TotalHours, 1)
+
+		Write-Host "Last successful commit: $lastSuccessCommit (build $($lastSuccess.buildNumber))"
+		Write-Host "First failing commit: $firstFailureCommit (build $($firstFailure.buildNumber))"
+
+		# Check for existing open issue to avoid duplicates
+		$searchLabel = "master-ci-failure"
+		$existingIssues = gh issue list --repo $GitHubRepo --label $searchLabel --state open --json body --jq ".[].body" 2>&1
+		$alreadyReported = $false
+		if ($LASTEXITCODE -eq 0 -and $existingIssues) {
+			foreach ($body in $existingIssues) {
+				if ($body -match $firstFailureCommitShort) {
+					$alreadyReported = $true
+					break
+				}
+			}
+		}
+
+		if ($alreadyReported) {
+			Write-Host "An open issue already exists for this failure window (first failing commit $firstFailureCommitShort). Skipping issue creation."
+		}
+		else {
+			$pipelineUrl = "$($env:SYSTEM_COLLECTIONURI)$Project/_build?definitionId=$DefinitionId&branchName=refs%2Fheads%2Fmaster"
+			$issueBody = @"
+## Master CI has been failing for more than 24 hours
+
+| | Build | Commit | Time (UTC) |
+|---|---|---|---|
+| **Last success** | $($lastSuccess.buildNumber) | [$lastSuccessCommitShort](https://github.com/$GitHubRepo/commit/$lastSuccessCommit) | $($lastSuccess.finishTime) |
+| **First failure** | $($firstFailure.buildNumber) | [$firstFailureCommitShort](https://github.com/$GitHubRepo/commit/$firstFailureCommit) | $($firstFailure.finishTime) |
+
+**Hours failing:** $hoursDown hours
+
+[View pipeline runs]($pipelineUrl)
+"@
+
+			Write-Host "Creating GitHub issue in $GitHubRepo..."
+			gh issue create `
+				--repo $GitHubRepo `
+				--title "Master CI has been failing for more than 24 hours" `
+				--body $issueBody `
+				--label "master-ci-failure"
+
+			if ($LASTEXITCODE -eq 0) {
+				Write-Host "GitHub issue created successfully."
+			}
+			else {
+				Write-Warning "Failed to create GitHub issue. gh exit code: $LASTEXITCODE"
+			}
+		}
+	}
+	else {
+		Write-Host "Master is failing, but within the 24h grace period."
+	}
+
+	exit 0
 }
 catch {
 	Write-Error "Failed to check master status: $($_.Exception.Message)"
-	exit 1
+	# Do not block the PR even if the script fails
+	exit 0
 }
```
Copilot AI, Feb 17, 2026
The script uses tabs for indentation while the repository's other PowerShell script (determine-test-scope.ps1) appears to use spaces. For consistency with the existing codebase conventions, consider using the same indentation style throughout the repository.
```diff
 param(
 	[string]$Project,
-	[string]$DefinitionId
+	[string]$DefinitionId,
+	[string]$GitHubRepo = "unoplatform/uno",
+	[switch]$ForceFailure
 )
```
Copilot AI, Feb 17, 2026
The script is missing PowerShell documentation comments (synopsis, description, parameter descriptions) that are present in other scripts in the repository, such as `determine-test-scope.ps1`. Adding these would help future maintainers understand the script's purpose, parameters, and behavior.
Consider adding a comment-based help block at the top of the script, following the pattern used in `determine-test-scope.ps1`.
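As an illustration, a comment-based help block for this script might look like the sketch below. The descriptions are assumptions drawn from the PR overview, and the layout used in `determine-test-scope.ps1` is not shown on this page, so it may differ.

```powershell
<#
.SYNOPSIS
    Checks the health of the master branch CI pipeline.

.DESCRIPTION
    Queries completed master builds and, when master has been failing for more
    than 24 hours, creates a GitHub issue describing the failure window.
    (Illustrative wording, based on the PR overview.)

.PARAMETER Project
    Azure DevOps project name.

.PARAMETER DefinitionId
    Build definition (pipeline) ID to query.

.PARAMETER GitHubRepo
    Repository in owner/name form where the issue is created.

.PARAMETER ForceFailure
    Simulates a master failure scenario, for testing.
#>
param(
    [string]$Project,
    [string]$DefinitionId,
    [string]$GitHubRepo = "unoplatform/uno",
    [switch]$ForceFailure
)
```

With the block in place, `Get-Help .\check-master-status.ps1 -Full` would show the synopsis and parameter descriptions.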
```powershell
$existingIssues = gh issue list --repo $GitHubRepo --label $searchLabel --state open --json body --jq ".[].body" 2>&1
$alreadyReported = $false
if ($LASTEXITCODE -eq 0 -and $existingIssues) {
	foreach ($body in $existingIssues) {
```
Copilot AI, Feb 17, 2026
When `gh issue list` returns multiple issues, `--jq ".[].body"` prints every body to standard output. PowerShell captures multi-line native output as an array with one element per output line, so a multi-line issue body is split across several elements and `$existingIssues` no longer holds one element per issue. The `foreach` loop at line 84 then iterates over individual lines rather than issue bodies.
To handle multiple issues properly, consider parsing the JSON instead: `$existingIssues = (gh issue list --repo $GitHubRepo --label $searchLabel --state open --json body | ConvertFrom-Json)` and then accessing `$existingIssues.body` in the loop.
```diff
-$existingIssues = gh issue list --repo $GitHubRepo --label $searchLabel --state open --json body --jq ".[].body" 2>&1
-$alreadyReported = $false
-if ($LASTEXITCODE -eq 0 -and $existingIssues) {
-	foreach ($body in $existingIssues) {
+$existingIssues = gh issue list --repo $GitHubRepo --label $searchLabel --state open --json body | ConvertFrom-Json
+$alreadyReported = $false
+if ($LASTEXITCODE -eq 0 -and $existingIssues) {
+	foreach ($body in $existingIssues.body) {
```
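The JSON-based duplicate check can be tried in isolation, without calling `gh`. The sketch below is a hypothetical helper, not code from the PR: `Test-FailureAlreadyReported` and the sample JSON are assumptions, with the JSON standing in for the output of `gh issue list ... --json body`.

```powershell
# Hypothetical helper (not in the PR): decides whether any open issue body
# already mentions the short commit hash of the first failing build.
function Test-FailureAlreadyReported {
    param(
        [string]$IssuesJson,    # raw output of: gh issue list ... --json body
        [string]$CommitShort
    )
    if ([string]::IsNullOrWhiteSpace($IssuesJson)) { return $false }

    # @() keeps the result an array whether the JSON holds zero, one, or many issues.
    $issues = @($IssuesJson | ConvertFrom-Json)
    foreach ($issue in $issues) {
        # Plain substring match; the cast turns a null body into an empty string.
        if (([string]$issue.body).Contains($CommitShort)) { return $true }
    }
    return $false
}

# Sample data standing in for real gh output; note the multi-line first body.
$sample = '[{"body":"First failure\nCommit abc1234"},{"body":"Unrelated issue"}]'
Test-FailureAlreadyReported -IssuesJson $sample -CommitShort 'abc1234'
Test-FailureAlreadyReported -IssuesJson $sample -CommitShort 'def5678'
```

Parsing the JSON keeps each issue body intact as one string, so a body with line breaks is still matched as a single unit.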
```yaml
arguments: >
  -Project "$(System.TeamProject)"
  -DefinitionId "$(System.DefinitionId)"
  -GitHubRepo "unoplatform/uno"
```
Copilot AI, Feb 17, 2026
The PR description is missing a required GitHub issue link. According to the repository's contribution guidelines, all PRs must reference a proper GitHub issue using "Fixes", "Closes", or "Related to" with the fully qualified issue URL (e.g., `Fixes https://github.com/unoplatform/uno/issues/#<ISSUE_NUMBER>`).
The description currently shows "closes #" without an actual issue number. Please add the appropriate issue number or create a GitHub issue for this change if one doesn't exist.
```powershell
Write-Host "Last successful commit: $lastSuccessCommit (build $($lastSuccess.buildNumber))"
Write-Host "First failing commit: $firstFailureCommit (build $($firstFailure.buildNumber))"
```
Copilot AI, Feb 17, 2026
The script should verify that the GitHub CLI (`gh`) is installed and available before attempting to use it. If `gh` is not in the PATH or not installed, the commands will fail silently (due to the error handling that exits with 0), and no GitHub issue will be created even when master has been failing for more than 24 hours.
Consider adding a check like: `if (-not (Get-Command gh -ErrorAction SilentlyContinue)) { Write-Warning "GitHub CLI (gh) is not installed. Cannot create issue."; exit 0 }`
```powershell
# Ensure GitHub CLI is available before attempting to query or create issues
if (-not (Get-Command gh -ErrorAction SilentlyContinue)) {
	Write-Warning "GitHub CLI (gh) is not installed or not available in PATH. Cannot query or create GitHub issues."
	exit 0
}
```
```powershell
$lastSuccessCommitShort = $lastSuccessCommit.Substring(0, 7)
$firstFailureCommitShort = $firstFailureCommit.Substring(0, 7)
```
Copilot AI, Feb 17, 2026
The `Substring` operation will throw an exception if the commit hash is shorter than 7 characters. While git commit hashes are typically 40 characters, it's safer to add bounds checking or use a safer substring operation that handles shorter strings gracefully.
Consider using: `$lastSuccessCommitShort = $lastSuccessCommit.Substring(0, [Math]::Min(7, $lastSuccessCommit.Length))`
```diff
-$lastSuccessCommitShort = $lastSuccessCommit.Substring(0, 7)
-$firstFailureCommitShort = $firstFailureCommit.Substring(0, 7)
+$lastSuccessCommitShort = $lastSuccessCommit.Substring(0, [Math]::Min(7, $lastSuccessCommit.Length))
+$firstFailureCommitShort = $firstFailureCommit.Substring(0, [Math]::Min(7, $firstFailureCommit.Length))
```
```powershell
	Write-Host "Master is failing, but within the 24h grace period."
}
exit 0
```
Copilot AI, Feb 17, 2026
The script now always exits with code 0, even when master has been failing for more than 24 hours. The original behavior was to exit with code 1 (at old line 61: `exit 1`) to block PR merges when master was broken for too long.
This change means that PRs will no longer be blocked from merging even when master is unhealthy for extended periods. While a GitHub issue will be created, the PR pipeline will still succeed. This is a significant behavioral change that should be intentional. If the goal is to only notify (not block), this should be documented in the PR description. If PRs should still be blocked, the exit code should remain 1 for the failure case.
```diff
-exit 0
+if ($hoursDown -ge 24 -or $ForceFailure.IsPresent) {
+	# Block PR merges when master has been failing for more than 24 hours
+	exit 1
+}
+else {
+	# Allow PRs to proceed when master is healthy or within the grace period
+	exit 0
+}
```
```yaml
  -Project "$(System.TeamProject)"
  -DefinitionId "$(System.DefinitionId)"
  -GitHubRepo "unoplatform/uno"
  -ForceFailure
```
Copilot AI, Feb 17, 2026
The `-ForceFailure` parameter should not be included in production code. This appears to be a testing/debugging flag that will force the script to always simulate a master failure scenario, even when master is actually healthy. This will cause unnecessary GitHub issues to be created on every run.
If this is intended for testing purposes, it should be removed before merging or made conditional based on an environment variable or configuration setting.
```diff
-  -ForceFailure
```
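If the flag is kept for testing, one option is to make it opt-in through an environment variable rather than a hard-coded argument. The sketch below is only an assumption about how that could look: `Test-ForceFailureRequested` is a hypothetical helper and `MASTER_HEALTH_FORCE_FAILURE` is an invented variable name that the pipeline does not define today.

```powershell
# Hypothetical helper (not in the PR): interprets an opt-in value such as an
# environment variable, so the simulation stays off unless explicitly requested.
function Test-ForceFailureRequested {
    param([string]$Value)
    # Only the strings "true" or "1" (case-insensitive, trimmed) enable it.
    return ($Value -and ($Value.Trim() -in @('true', '1')))
}

# MASTER_HEALTH_FORCE_FAILURE is an assumed variable name used for illustration.
$forceFailure = Test-ForceFailureRequested $env:MASTER_HEALTH_FORCE_FAILURE
Write-Host "Force failure requested: $forceFailure"
```

With this shape, a normal run never simulates a failure, and a maintainer can still exercise the issue-creation path by setting the variable on a manual run.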
```powershell
Write-Host "Master has been failing for more than 24 hours. Last success was at $lastSuccessTime."
# Find the first failing build after the last successful one
$lastSuccessIndex = [Array]::IndexOf($builds, $lastSuccess)
```
Copilot AI, Feb 17, 2026
The script doesn't handle the case where `$lastSuccessIndex` is 0 (meaning the last successful build is the most recent build). In this case, accessing `$builds[$lastSuccessIndex - 1]` would attempt to access `$builds[-1]`, which in PowerShell returns the last element of the array, not an error. This would incorrectly identify the oldest build in the history as the "first failure" rather than recognizing that the latest build is actually successful.
This edge case should be handled before the array access to prevent incorrect issue creation.
```diff
-$lastSuccessIndex = [Array]::IndexOf($builds, $lastSuccess)
+$lastSuccessIndex = [Array]::IndexOf($builds, $lastSuccess)
+if ($lastSuccessIndex -le 0) {
+	Write-Host "Last successful build is the most recent. There are no failing builds after it; skipping issue creation."
+	exit 0
+}
```
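An alternative that avoids index arithmetic altogether is to walk the builds from newest to oldest and remember the last failing build seen before the first success. The sketch below is a hypothetical helper, not code from the PR; it assumes the builds are ordered newest first (as the script's `queryOrder=finishTimeDescending` query suggests) and uses a simplified success check in place of the script's `Test-BuildSucceeded`.

```powershell
# Hypothetical helper (not in the PR): returns the oldest failing build that is
# newer than the most recent success, or $null when the newest build succeeded.
# If the history contains no success at all, it returns the oldest build.
function Get-FirstFailureSinceLastSuccess {
    param(
        [object[]]$Builds,
        # Simplified stand-in for the script's Test-BuildSucceeded function.
        [scriptblock]$IsSuccess = { param($b) $b.result -eq 'succeeded' }
    )
    $firstFailure = $null
    foreach ($build in $Builds) {
        if (& $IsSuccess $build) { break }   # reached the last success; stop
        $firstFailure = $build               # keep moving toward older failures
    }
    return $firstFailure
}

# Sample data, newest first; build numbers are made up for illustration.
$builds = @(
    [pscustomobject]@{ buildNumber = '103'; result = 'failed' },
    [pscustomobject]@{ buildNumber = '102'; result = 'failed' },
    [pscustomobject]@{ buildNumber = '101'; result = 'succeeded' },
    [pscustomobject]@{ buildNumber = '100'; result = 'failed' }
)
(Get-FirstFailureSinceLastSuccess -Builds $builds).buildNumber   # 102
```

Because the loop never computes `$index - 1`, there is no negative index that could wrap around to the oldest build; a `$null` result tells the caller to skip issue creation.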
unodevops commented Feb 17, 2026
🤖 Your WebAssembly Skia Sample App stage site is ready! Visit it here: https://unowasmprstaging.z20.web.core.windows.net/pr-22672/wasm-skia-net9/index.html
GitHub Issue: closes #
PR Type:
What is the current behavior? 🤔
What is the new behavior? 🚀
PR Checklist ✅
Please check if your PR fulfills the following requirements:
Screenshots Compare Test Run results.
Other information ℹ️