fix: crash child task runs when parent flow run crashes #19600
Draft
zzstoatzz wants to merge 2 commits into main from fix/crash-child-task-runs-19594
+388 −1
When a flow run transitions to CRASHED state (e.g., OOM killed in Kubernetes), automatically transition all non-terminal child task runs to CRASHED as well. This ensures task runs don't remain stuck in RUNNING state when their parent flow crashes.

Adds `CrashChildTaskRuns` transform to `GlobalFlowPolicy`, following the same pattern as `UpdateSubflowParentTask`.

Fixes #19594

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
codspeed-hq bot commented Dec 2, 2025
CodSpeed Performance Report
Merging #19600 will not alter performance
- Add proper docstrings explaining why API-level testing is appropriate
- Add test for preserving already-completed tasks
- Add test for crashing pending tasks
- Clean up test structure and imports

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Summary
When a flow run transitions to CRASHED state (e.g., OOM killed in Kubernetes), automatically transition all non-terminal child task runs to CRASHED as well.
- Adds `CrashChildTaskRuns` transform to `GlobalFlowPolicy`
- Follows the same pattern as `UpdateSubflowParentTask`

Scope and behavior
What this handles
- Nested tasks (tasks calling tasks): All tasks within a flow share the same `flow_run_id` regardless of nesting depth. When `task_a` calls `task_b`, which calls `task_c`, they all have the same `flow_run_id`. The fix correctly crashes all of them when the parent flow crashes.
- All non-terminal task states: PENDING, RUNNING, SCHEDULED, PAUSED. Any task that hasn't reached a terminal state will be transitioned to CRASHED.
- Terminal tasks preserved: Tasks that have already COMPLETED, FAILED, CANCELLED, or CRASHED are not modified.
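
The handled cases above can be illustrated with a small in-memory model. This is only a sketch of the intended behavior, not the `CrashChildTaskRuns` implementation from this PR: `StateType`, `TaskRun`, and `crash_child_task_runs` are hypothetical stand-ins defined here, and the state list covers only the states named in this description.

```python
from dataclasses import dataclass
from enum import Enum


class StateType(str, Enum):
    """Hypothetical state enum limited to the states named in this PR."""
    SCHEDULED = "SCHEDULED"
    PENDING = "PENDING"
    RUNNING = "RUNNING"
    PAUSED = "PAUSED"
    COMPLETED = "COMPLETED"
    FAILED = "FAILED"
    CANCELLED = "CANCELLED"
    CRASHED = "CRASHED"


# Task runs already in one of these states are left untouched.
TERMINAL_STATES = {
    StateType.COMPLETED,
    StateType.FAILED,
    StateType.CANCELLED,
    StateType.CRASHED,
}


@dataclass
class TaskRun:
    """Hypothetical task run record; nesting depth is irrelevant here."""
    name: str
    flow_run_id: str
    state: StateType


def crash_child_task_runs(task_runs, flow_run_id):
    """Crash every non-terminal task run of one flow run; return their names."""
    crashed = []
    for task_run in task_runs:
        if task_run.flow_run_id != flow_run_id:
            continue  # belongs to a different flow run
        if task_run.state in TERMINAL_STATES:
            continue  # terminal task runs are preserved
        task_run.state = StateType.CRASHED
        crashed.append(task_run.name)
    return crashed


runs = [
    TaskRun("task_a", "flow-1", StateType.RUNNING),
    TaskRun("task_b", "flow-1", StateType.PENDING),    # nested under task_a
    TaskRun("task_c", "flow-1", StateType.COMPLETED),  # already terminal
    TaskRun("other", "flow-2", StateType.RUNNING),     # different flow run
]
crashed = crash_child_task_runs(runs, "flow-1")
print(crashed)
```

Because nested tasks share the parent's `flow_run_id`, a single filter on that id is enough to reach every task at every depth in this model.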
What this does NOT handle (separate concern)
- Subflows: When a flow calls another flow as a subflow, that subflow is a separate flow run with its own `flow_run_id`. The subflow's "parent task run" (in the parent flow) WILL be crashed by this fix, but the subflow itself won't be automatically cascaded to CRASHED. This is arguably a separate feature request, as subflows are architecturally independent flow runs that can be retried independently.

Test plan

`tests/server/orchestration/test_global_policy.py`:
- `test_crash_child_task_runs_when_flow_crashes`
- `test_does_not_crash_already_terminal_task_runs`
- `test_does_not_crash_tasks_for_non_crashed_transitions`

`integration-tests/test_crash_child_task_runs.py`:
- `test_child_task_runs_crash_when_flow_crashes`
- `test_does_not_crash_already_completed_tasks`
- `test_crash_propagates_to_pending_tasks`

Fixes #19594
🤖 Generated with Claude Code