Adds new working dir upload protocol PLASMA, and uses it in job submission. #45880

Draft
rynewang wants to merge 1 commit into ray-project:master from rynewang:plasma-protocol

Conversation

rynewang
Contributor

When a user initializes Ray with a working_dir, under the hood we package the user's local directory or zip file and upload it to the Ray cluster. However, we upload it to the GCS internal KV, which puts an unnecessary burden on the GCS, a global process that can already be very busy on large clusters.

This PR introduces a new "remote protocol", plasma. It spins up a global singleton detached actor, DataHolder, which stores bytes in the Object Store. The package uploader invokes a Ray remote method to store the package; the package downloader just does a regular ray.get to download it.
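To make the actor idea concrete, here is a minimal sketch of a named, detached singleton that holds package bytes keyed by URI. It is not the PR's implementation: the constants `ACTOR_NAME` and `ACTOR_NAMESPACE`, the method names `put`/`get`/`exists`, and the helper `get_or_create_data_holder` are all assumptions made for illustration. As a simplification, the sketch keeps the bytes in a dict inside the actor process, whereas the PR describes storing them in the Object Store.

```python
from typing import Dict

# Hypothetical names, chosen for this sketch only (not taken from the PR).
ACTOR_NAME = "DataHolder"
ACTOR_NAMESPACE = "runtime_env_packages"


class DataHolder:
    """Holds package bytes keyed by URI.

    Simplification: the bytes live in a plain dict inside the actor process.
    The PR describes the real actor as storing them in the Object Store.
    """

    def __init__(self) -> None:
        self._packages: Dict[str, bytes] = {}

    def put(self, uri: str, data: bytes) -> int:
        """Store a package and return its size in bytes."""
        self._packages[uri] = data
        return len(data)

    def get(self, uri: str) -> bytes:
        """Return the stored bytes; raises KeyError for an unknown URI."""
        return self._packages[uri]

    def exists(self, uri: str) -> bool:
        return uri in self._packages


def get_or_create_data_holder():
    """Return a handle to the cluster-wide singleton, creating it on first use."""
    # Imported lazily: this needs Ray installed and ray.init() already called,
    # which is exactly the precondition discussed in the PR description.
    import ray

    return (
        ray.remote(DataHolder)
        .options(
            name=ACTOR_NAME,
            namespace=ACTOR_NAMESPACE,
            lifetime="detached",   # the actor outlives the driver that created it
            get_if_exists=True,    # reuse the existing actor instead of failing
        )
        .remote()
    )
```

On a running cluster, an uploader would call something like `ray.get(holder.put.remote(uri, data))` and a downloader `ray.get(holder.get.remote(uri))`, where `holder` comes from `get_or_create_data_holder()`.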

Problem: this requires Ray to be initialized in the first place. So when you call ray.init(runtime_env={"working_dir": "./"}), you introduce a circular dependency between Ray and DataHolder, which fails. For similar reasons, Ray Client can hardly do this.

Fortunately, Jobs can do this just fine. When you run ray job submit, it actually makes an HTTP PUT call to the dashboard JobAgent, which invokes whatever is needed to save the package. There, DataHolder can work.

In this PR:

  • A singleton DataHolder actor.
  • A new Protocol.PLASMA type.
  • All code that previously defaulted to GCS now requires a Protocol type, and works with DataHolder when the protocol is PLASMA.
  • JobSubmissionClient and ServeSubmissionClient now use the new plasma protocol.
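The "require a Protocol type" change can be pictured as dispatching on the URI scheme instead of assuming GCS. The following is a standalone sketch under assumptions, not Ray's code: the two-member enum, the `parse_uri` helper, the `storage_backend` function, and the `_ray_pkg_abc123.zip` package name are all illustrative.

```python
from enum import Enum
from typing import Tuple
from urllib.parse import urlparse


class Protocol(Enum):
    """Illustrative subset of storage protocols; values are URI schemes."""

    GCS = "gcs"
    PLASMA = "plasma"


def parse_uri(uri: str) -> Tuple[Protocol, str]:
    """Split a package URI such as 'plasma://pkg.zip' into (protocol, name)."""
    parsed = urlparse(uri)
    try:
        protocol = Protocol(parsed.scheme)
    except ValueError:
        raise ValueError(f"Unsupported protocol in URI {uri!r}") from None
    return protocol, parsed.netloc


def storage_backend(protocol: Protocol) -> str:
    """Name the backend explicitly; there is no silent default to GCS."""
    if protocol is Protocol.PLASMA:
        return "DataHolder actor"
    if protocol is Protocol.GCS:
        return "GCS internal KV"
    raise ValueError(f"Unhandled protocol: {protocol}")


protocol, package_name = parse_uri("plasma://_ray_pkg_abc123.zip")
print(protocol.name, package_name, storage_backend(protocol))
```

Making the protocol an explicit argument means a caller that forgets to choose one gets an error rather than quietly writing to the GCS.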

What's not changed:

  • Ray driver script: ray.init() still uses GCS.
  • Ray Client still uses GCS.

In the long run, we can extend the Ray driver script and Ray Client cases to use HTTP PUT as well, and remove the GCS code path.

…sions. Signed-off-by: Ruiyang Wang <rywang014@gmail.com>
@rkooo567
Contributor

Hmm, if we cannot make it work in all cases (e.g., ray.init()), I feel like it may be better to just allow an HTTP interface (or allow S3).

But I feel like there may be ways to make it work with ray.init(), because only workers are going to need the runtime env.

stalebot commented Feb 25, 2025

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 14 days if no further activity occurs. Thank you for your contributions.

  • If you'd like to keep this open, just leave any comment, and the stale label will be removed.

stalebot added the stale label Feb 25, 2025
Reviewers: No reviews
Assignees: @jjyao
Labels: stale
Projects: None yet
Milestone: No milestone
3 participants: @rynewang, @rkooo567, @jjyao