New vSphere provider supporting Supervisor (k8s) cluster. #49881

Open
roshankathawate wants to merge 16 commits into ray-project:master from roshankathawate:feature/vmray-provider

roshankathawate commented Jan 16, 2025 (edited)

Why are these changes needed?

Before these changes, the vSphere provider used the vSphere SDK to create a Ray cluster on vSphere. However, now that a K8s control plane is available within the vSphere hypervisor (Supervisor), services can be deployed using a k8s operator. These changes make Ray a Supervisor service on vSphere. With them, a Ray cluster is exposed as a k8s custom resource (CR). Instead of using the old vSphere SDK to create and manage the cluster, the provider now creates, updates, and deletes Ray cluster CRs through the exposed k8s API.
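To illustrate the create/update/delete flow described above, here is a minimal sketch rather than the code in this PR. The `apiVersion`, `kind`, and `spec` fields are hypothetical placeholders, and `InMemoryCRStore` is an in-memory stand-in for the k8s API server so the example runs without a cluster.

```python
from typing import Any, Dict, Optional, Tuple

# Hypothetical identifiers: the real group, version, and kind are defined by
# the operator and are not stated in this PR description.
API_VERSION = "example.vmray.io/v1alpha1"
KIND = "RayCluster"


def build_cluster_cr(name: str, namespace: str, worker_count: int) -> Dict[str, Any]:
    """Build a custom resource manifest for a Ray cluster (illustrative schema)."""
    return {
        "apiVersion": API_VERSION,
        "kind": KIND,
        "metadata": {"name": name, "namespace": namespace},
        "spec": {"workers": worker_count},
    }


class InMemoryCRStore:
    """Stand-in for the k8s API server, keyed by (namespace, name)."""

    def __init__(self) -> None:
        self._objects: Dict[Tuple[str, str], Dict[str, Any]] = {}

    def create(self, body: Dict[str, Any]) -> None:
        key = (body["metadata"]["namespace"], body["metadata"]["name"])
        if key in self._objects:
            raise ValueError(f"resource already exists: {key}")
        self._objects[key] = body

    def patch_spec(self, namespace: str, name: str, changes: Dict[str, Any]) -> None:
        # Merge the changed fields into the stored spec, like a simple patch.
        self._objects[(namespace, name)]["spec"].update(changes)

    def delete(self, namespace: str, name: str) -> None:
        del self._objects[(namespace, name)]

    def get(self, namespace: str, name: str) -> Optional[Dict[str, Any]]:
        return self._objects.get((namespace, name))
```

In a real deployment the store would be replaced by calls to the Supervisor's k8s API; the point of the sketch is only that the provider manipulates declarative CR objects instead of calling the vSphere SDK directly.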

  1. The old provider (vSphere provider) is replaced by the new vmray provider (updated node_provider.py).
  2. Added cluster_operator_client.py: a wrapper for calling K8s APIs to manage a Ray cluster. (The autoscaler, through the node provider, calls functions in this file to manage a Ray cluster.)
  3. Ray nodes are created through a VM CR rather than through pyvmomi or the vSphere SDK (deleted pyvmomi_sdk_provider.py, vsphere_sdk_provider.py, and the associated test files).
  4. Ray nodes are created through a VM CR instead of from a frozen VM template (changed config.py to support the new configuration; deleted the redundant gpu_utils.py and utils.py).
  5. Updated default.yaml and other yaml files to support:
    • Supervisor (K8s) Namespace where the Ray cluster should be deployed
    • The Supervisor control plane VM IP (k8s API server)
    • VM image to be used to create Ray nodes
    • VM Storage class to be used for Ray nodes creation.
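The four settings listed above could be checked before the provider talks to the Supervisor. The following is a minimal sketch under stated assumptions: the key names (`namespace`, `api_server_ip`, `vm_image`, `storage_class`) are hypothetical and may not match the keys actually used in default.yaml.

```python
import ipaddress
from typing import Any, Dict, List

# Hypothetical key names for the four settings; the real YAML keys in this PR
# may differ.
REQUIRED_KEYS = ("namespace", "api_server_ip", "vm_image", "storage_class")


def validate_provider_config(provider: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable problems; an empty list means valid."""
    errors: List[str] = []
    for key in REQUIRED_KEYS:
        value = provider.get(key)
        if not isinstance(value, str) or not value.strip():
            errors.append(f"missing or empty provider field: {key}")

    # The control plane address is described as an IP, so check that it parses.
    ip = provider.get("api_server_ip")
    if isinstance(ip, str) and ip.strip():
        try:
            ipaddress.ip_address(ip.strip())
        except ValueError:
            errors.append(f"api_server_ip is not a valid IP address: {ip}")
    return errors
```

Returning a list of problems instead of raising on the first one lets a user fix every configuration mistake in a single pass.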

Related issue number

Checks

  • I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
    • I've added any new APIs to the API Reference. For example, if I added a
      method in Tune, I've added it in doc/source/tune/api/ under the
      corresponding .rst file.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

@roshankathawate marked this pull request as ready for review January 18, 2025 04:08

kevin85421 (Member) left a comment

Hi @roshankathawate,

  • Would you mind adding a PR description? I have no context about vSphere, so a PR description would help me better understand this PR.

  • Does it make sense to split this into smaller PRs? I haven't reviewed it yet, but I'm wondering if we can make this large PR easier to review.

  • Do you have any colleagues who could review this PR before I take a look?

Thanks

@kevin85421 self-assigned this Jan 31, 2025

roshankathawate (Author) commented

Hi @kevin85421,

Thanks for reviewing the PR. I have added details to the PR description so the changes are easier to understand; sorry for not adding them earlier. I could split this into smaller PRs, but I have only added one new file, and most of the other files are either redundant (and deleted) or are config files. If you still find it difficult to review, let me know and I'll try to make smaller PRs. Once again, thanks for reviewing it.

kevin85421 reacted with thumbs up emoji

kevin85421 (Member) commented

Btw, it is not necessary to sync with the master branch if there are no conflicts.

VamshikShetty (Contributor) commented

Hey @roshankathawate and @kevin85421,
I have broken down this MR and raised the first part, #50936, i.e. cleanup of the old provider.
Could you please take a look? Thanks!

@jcotant1 added the core (Issues that should be addressed in Ray Core) label Mar 26, 2025
@hainesmichaelc added the community-contribution (Contributed by the community) label Apr 4, 2025
Reviewers

@kevin85421 left review comments

@hongchaodeng: awaiting requested review

At least 1 approving review is required to merge this pull request.

Assignees

@kevin85421

Labels
community-contribution (Contributed by the community), core (Issues that should be addressed in Ray Core)
6 participants
@roshankathawate @kevin85421 @VamshikShetty @hainesmichaelc @jcotant1 @ankitasonawane30