Pipeline state dump and load #352


Draft
NathalieCharbel wants to merge 21 commits into neo4j:main from NathalieCharbel:pipeline-state-dump-and-load

NathalieCharbel (Contributor) commented Jun 4, 2025 (edited)

Description

This PR introduces state management capabilities to the Pipeline class, enabling:

  • Checkpointing pipeline execution at specific components using pipeline.run(..., until='component_x', ...)
  • Resuming pipeline execution from saved states using pipeline.run(..., from_='component_y', ...)
  • Dumping/loading pipeline state using dump_state(pipeline_run_id) and load_state(state)

The state includes results from previous runs. This feature is particularly useful for:

  • Debugging long-running pipelines
  • Recovering from failures
  • Comparing component implementations with deterministic inputs
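The checkpoint, dump, load, and resume workflow described above can be sketched with a toy sequential pipeline. This is only an illustration under stated assumptions: MiniPipeline, its internal storage, the explicit run_id argument, and the shape of the state dict are all hypothetical and are not the Pipeline API from this PR. Only the parameter names until and from_ and the method names dump_state and load_state are taken from the description.

```python
import uuid
from typing import Any, Callable, Optional

Component = Callable[[Any], Any]


class MiniPipeline:
    """Toy sequential pipeline (hypothetical; not the library's Pipeline)."""

    def __init__(self, components: dict[str, Component]) -> None:
        # Insertion order of the dict defines the execution order.
        self.components = components
        # run_id -> {component name: output}
        self._results: dict[str, dict[str, Any]] = {}

    def run(
        self,
        data: Any,
        until: Optional[str] = None,
        from_: Optional[str] = None,
        run_id: Optional[str] = None,
    ) -> str:
        run_id = run_id or uuid.uuid4().hex
        results = self._results.setdefault(run_id, {})
        names = list(self.components)
        start = names.index(from_) if from_ else 0
        # When resuming, the input is the saved output of the previous component.
        value = results[names[start - 1]] if start > 0 else data
        for name in names[start:]:
            value = self.components[name](value)
            results[name] = value
            if name == until:
                break  # checkpoint: stop after this component
        return run_id

    def dump_state(self, run_id: str) -> dict[str, Any]:
        return {"run_id": run_id, "results": dict(self._results[run_id])}

    def load_state(self, state: dict[str, Any]) -> None:
        self._results[state["run_id"]] = dict(state["results"])


components: dict[str, Component] = {
    "split": str.split,
    "upper": lambda words: [w.upper() for w in words],
    "join": " ".join,
}

# First run stops after "upper" and its state is dumped.
first = MiniPipeline(components)
run_id = first.run("hello pipeline state", until="upper")
state = first.dump_state(run_id)

# A fresh pipeline loads the state and resumes from "join".
second = MiniPipeline(components)
second.load_state(state)
second.run(None, from_="join", run_id=run_id)
final = second.dump_state(run_id)["results"]["join"]
```

Because the resumed run reads its input from the saved results, two implementations of a later component can be compared against the same upstream output.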

Type of Change

  • New feature
  • Bug fix
  • Breaking change
  • Documentation update
  • Project configuration change

Complexity

Complexity: high

How Has This Been Tested?

  • Unit tests
  • E2E tests
  • Manual tests

Checklist

The following requirements should have been met (depending on the changes in the branch):

  • Documentation has been updated
  • Unit tests have been updated
  • E2E tests have been updated
  • Examples have been updated
  • New files have copyright header
  • CLA (https://neo4j.com/developer/cla/) has been signed
  • CHANGELOG.md updated if appropriate

@NathalieCharbel requested a review from a team as a code owner June 4, 2025 10:18
keys_to_remove = [
    key for key in self._data.keys() if key.startswith(run_id_prefix)
]
for key in keys_to_remove:
Contributor

So here we are removing all results from a previous run with this run_id, right?

Contributor (Author)

yes!
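The cleanup discussed in this thread can be sketched as follows. The diff excerpt ends at the for loop, so the deletion in the loop body is an assumption, as are the ResultStore class, the "run_id:component" key layout, and the clear_run name. Only _data, run_id_prefix, and keys_to_remove come from the excerpt.

```python
from typing import Any


class ResultStore:
    """Hypothetical in-memory result store keyed by "<run_id>:<component>"."""

    def __init__(self) -> None:
        self._data: dict[str, Any] = {}

    def add(self, run_id: str, component: str, value: Any) -> None:
        self._data[f"{run_id}:{component}"] = value

    def clear_run(self, run_id: str) -> int:
        # Including the separator in the prefix keeps "run1" from matching "run10".
        run_id_prefix = f"{run_id}:"
        # Collect keys first: a dict cannot be mutated while it is iterated.
        keys_to_remove = [
            key for key in self._data.keys() if key.startswith(run_id_prefix)
        ]
        for key in keys_to_remove:
            del self._data[key]  # assumed loop body
        return len(keys_to_remove)


store = ResultStore()
store.add("run1", "splitter", "chunks")
store.add("run1", "embedder", "vectors")
store.add("run10", "splitter", "other chunks")
removed = store.clear_run("run1")
```

If the prefix were the bare run_id without a separator, results from an unrelated run whose id merely starts with the same characters would be removed as well.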

@@ -140,6 +140,7 @@ def __init__(
}
"""
self.missing_inputs: dict[str, list[str]] = defaultdict()
self._current_run_id: Optional[str] = None
Contributor

This cannot be saved on the Pipeline instance, since concurrent runs will overwrite it.

NathalieCharbel (Contributor, Author) commented Jun 12, 2025 (edited)

I will move it back to the dump() function. I think we should keep creating a new run_id for each run, even when resuming the same pipeline, and dump the state based on the previous ones. We could keep track of the run_ids of the same pipeline in the state. This should resolve the concurrency issue, right?
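The concurrency concern raised in this thread can be reproduced with a small sketch. It assumes runs are coroutines scheduled concurrently on one asyncio event loop. Both classes are hypothetical stand-ins rather than the PR's code, and only the attribute name _current_run_id comes from the diff.

```python
import asyncio
import uuid


class SharedIdPipeline:
    """Keeps the run id on the instance, as in the reviewed diff line."""

    def __init__(self) -> None:
        self._current_run_id = None

    async def run(self) -> tuple[str, str]:
        self._current_run_id = uuid.uuid4().hex
        started_as = self._current_run_id
        # Yield to the event loop, as a component awaiting I/O would.
        await asyncio.sleep(0)
        # Another run may have replaced the attribute in the meantime.
        return started_as, self._current_run_id


class LocalIdPipeline:
    """Keeps the run id in a local variable, private to each run."""

    async def run(self) -> tuple[str, str]:
        run_id = uuid.uuid4().hex
        started_as = run_id
        await asyncio.sleep(0)
        return started_as, run_id


async def main():
    shared = SharedIdPipeline()
    shared_runs = await asyncio.gather(shared.run(), shared.run())
    local = LocalIdPipeline()
    local_runs = await asyncio.gather(local.run(), local.run())
    return shared_runs, local_runs


shared_runs, local_runs = asyncio.run(main())
# In shared_runs, the first run ends up seeing the second run's id.
# In local_runs, each run still sees its own id.
```

Keeping the id local to each run (or passing it explicitly) avoids the overwrite. Recording earlier run_ids inside the dumped state, as proposed in the reply, would be a separate addition on top of that.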

@NathalieCharbel marked this pull request as draft June 16, 2025 14:22
Reviewers

@stellasia left review comments

At least 1 approving review is required to merge this pull request.

Assignees
No one assigned
Labels
None yet
Projects
None yet
Milestone
No milestone
2 participants: @NathalieCharbel, @stellasia