Insights: ray-project/ray
Overview
2 Releases published by 2 people
- ray-2.44.0 Ray-2.44.0
published
Mar 21, 2025 - ray-2.44.1 Ray-2.44.1
published
Mar 27, 2025
87 Pull requests merged by 33 people
- [data] refine backpressure info on progress bar
#51697 merged
Mar 27, 2025 - [core][state] Fix false alarm in `get_logs` when a server chunk splits into multiple client chunks
#51750 merged
Mar 27, 2025 - Revert "[doc] Add hpu resource description in ray train related docs"
#51754 merged
Mar 27, 2025 - Make @edoakes the czar of `_common/` dir for now
#51753 merged
Mar 27, 2025 - [Feat][Core/Dashboard] Convert ReportHead to subprocess module
#51733 merged
Mar 27, 2025 - [core] Fix windows build with no cython -Wno-shadow
#51730 merged
Mar 27, 2025 - [data] add getdaft to compiled versions
#51723 merged
Mar 27, 2025 - [serve] Remove RAY_SERVE_EAGERLY_START_REPLACEMENT_REPLICAS flag
#51722 merged
Mar 26, 2025 - Revert "[serve] Log rejected requests at router side (#51346)"
#51698 merged
Mar 26, 2025 - Run basic Python 3.13 tests
#51688 merged
Mar 26, 2025 - [ci] add misc and untested files in skipping
#51715 merged
Mar 26, 2025 - [doc] Add hpu resource description in ray train related docs
#47241 merged
Mar 26, 2025 - [core] `test_job_isolation` passes even when exceptions are thrown
#51694 merged
Mar 26, 2025 - [core][kuberay] Trigger kuberay release pipeline from rayci
#51539 merged
Mar 26, 2025 - [data] Integrate Ray Dataset with Daft Dataframe
#51531 merged
Mar 26, 2025 - [Feat][Core/Dashboard] Convert StateHead to subprocess module
#51676 merged
Mar 26, 2025 - [core] Fix all gcs variable shadowing
#51704 merged
Mar 26, 2025 - [Autoscaler] Update CoordinatorNodeProvider example
#51293 merged
Mar 26, 2025 - Fix Ray Client when 'uv run' runtime environment is used
#51683 merged
Mar 26, 2025 - [core] Fix all raylet variable shadowing
#51689 merged
Mar 26, 2025 - [docs] Update usage_lib.py guide link
#51681 merged
Mar 26, 2025 - [core][autoscaler][v2] do not removing nodes for upcoming resource requests
#51570 merged
Mar 26, 2025 - [CI] Update LLM dependencies list and make the uv compile test job hard fail
#51693 merged
Mar 26, 2025 - [core] Fix incorrect comment
#51575 merged
Mar 25, 2025 - [tests] Reassign dashboard tests to core team
#51691 merged
Mar 25, 2025 - [core] Introduce ConcurrentFlatMap and use for InMemoryStoreClient
#50375 merged
Mar 25, 2025 - [Serve.llm] fix loading model from remote storage and add docs
#51617 merged
Mar 25, 2025 - [core] Record dashboard metrics with oneshot
#51627 merged
Mar 25, 2025 - [Feat][Core/Dashboard] Convert JobHead to subprocess module
#51553 merged
Mar 25, 2025 - [core] Avoid resize in GetAndPinArgsForExecutor
#51543 merged
Mar 25, 2025 - [release] Fix perf metrics compare
#51655 merged
Mar 25, 2025 - [Data][LLM] trust remote code
#51680 merged
Mar 25, 2025 - [core] Fix all variable shadowing for core worker
#51672 merged
Mar 25, 2025 - [core] Threaded actors get stuck forever if they receive two exit signals
#51582 merged
Mar 25, 2025 - [core] [easy] Mark cgroup tests exclusive
#51654 merged
Mar 25, 2025 - [Data] Support async callable classes in flat_map()
#51180 merged
Mar 25, 2025 - [Data] fix RandomAccessDataset.multiget returning unexpected values for missing keys
#44769 merged
Mar 25, 2025 - [data] fix lance ut failed
#51421 merged
Mar 25, 2025 - [core] Correct the wording in the OnNodeDead logs to avoid confusion
#51668 merged
Mar 25, 2025 - [ci] add an always tag for cond testing
#51662 merged
Mar 25, 2025 - [deps] Use UV to compile LLM dependencies
#51323 merged
Mar 25, 2025 - [data] Update repartition on target_num_rows_per_block documentation
#51433 merged
Mar 25, 2025 - [data] Fix Databricks host URL handling in Ray Data
#49926 merged
Mar 25, 2025 - [serve] Remove RAY_SERVE_ENABLE_QUEUE_LENGTH_CACHE flag
#51649 merged
Mar 24, 2025 - [ray.llm] Refactor model download utilities
#51604 merged
Mar 24, 2025 - refactor replica _handle_errors_and_metrics
#51644 merged
Mar 24, 2025 - [ci] Enable Cgroup support in CI for core
#51454 merged
Mar 24, 2025 - [Test][KubeRay] Add doctest for RayCluster Quickstart doc
#51249 merged
Mar 24, 2025 - Skip multiplex metrics and proxy status code is error tests on windows
#51645 merged
Mar 24, 2025 - [serve] update deployment status docs
#51610 merged
Mar 24, 2025 - [RLlib] Make min/max env steps per evaluation sample call configurable for duration="auto".
#51637 merged
Mar 24, 2025 - Fix Ray Train release test
#51624 merged
Mar 24, 2025 - [serve] don't stop retrying replicas when a deployment is scaling back up from zero
#51600 merged
Mar 24, 2025 - [core] Fix test_threaded_actor flaky on mac
#51602 merged
Mar 24, 2025 - Fix syntax errors in Ray Tune example pbt_ppo_example.ipynb
#51626 merged
Mar 23, 2025 - [core] [easy] [noop] Add comments on client call
#51614 merged
Mar 22, 2025 - [Doc][KubeRay] Add a doc to explain why some worker Pods are not ready in RayService
#51095 merged
Mar 22, 2025 - [Feat][Core/Dashboard] Convert EventHead to subprocess module
#51587 merged
Mar 22, 2025 - [Core] Cover cpplint for /src/ray/core_worker (excluding transport)
#51557 merged
Mar 22, 2025 - [Docs][Core] Update system logs doc for dashboard subprocess module
#50984 merged
Mar 22, 2025 - [Feat][Core/Dashboard] Convert DataHead to subprocess module
#51507 merged
Mar 22, 2025 - Add TorchDataLoader to Train Benchmark
#51456 merged
Mar 22, 2025 - [core] [easy] [no-op] Fix rotation comment
#51606 merged
Mar 21, 2025 - [release-automation] Add option to add build tag when uploading wheels to pypi
#51517 merged
Mar 21, 2025 - [serve][tests] Add a timeout for resnet app image request
#51569 merged
Mar 21, 2025 - [serve][test] Change the response_time_s to response_time_ms
#51566 merged
Mar 21, 2025 - Fix broken doctest build
#51594 merged
Mar 21, 2025 - [Serve.llm] Add gen config related doc
#51572 merged
Mar 21, 2025 - [CI] Upgrade pytest-aiohttp to 1.1.0
#51556 merged
Mar 21, 2025 - [Data] Removing usages of the deprecated `use_legacy_format` param
#51563 merged
Mar 21, 2025 - [llm] ray.llm support custom accelerators
#51359 merged
Mar 21, 2025 - [Doc] Clarify the relation between 'uv run' and 'uv pip' support
#51599 merged
Mar 21, 2025 - [Data] Adding more ops to `BlockColumnAccessor`
#51571 merged
Mar 21, 2025 - [ray.data.llm] Propose log_input_column_names()
#51441 merged
Mar 21, 2025 - Move experimental and OOM tests to core builds
#51525 merged
Mar 21, 2025 - Add perf metrics for 2.44.0
#51427 merged
Mar 21, 2025 - [core] Make testable stream redirection
#51191 merged
Mar 21, 2025 - [Feat][Core/Dashboard] Remove ReportEventService and replace with HTTP API
#51555 merged
Mar 21, 2025 - [docker] Update latest Docker dependencies for 2.44.0 release
#51581 merged
Mar 21, 2025 - [docker] Update latest Docker dependencies for 2.44.0 release
#51580 merged
Mar 21, 2025 - [Feat][Core/Dashboard] Redirect child process stdout and stderr to dashboard_[module_name].err
#51545 merged
Mar 21, 2025 - Give better error message if 'uv run' is combined with incompatible plugins
#51565 merged
Mar 21, 2025 - [core] Separate thread_pool into a new bazel target
#51549 merged
Mar 20, 2025 - [core] Fix launch_and_verify_clusters
#51438 merged
Mar 20, 2025
67 Pull requests opened by 37 people
- [CI] Add ability to configure release tests with a matrix
#51562 opened
Mar 20, 2025 - Update observability.md
#51567 opened
Mar 20, 2025 - [WIP] [core] Rotate log monitor
#51573 opened
Mar 21, 2025 - Bump gradio from 3.50.2 to 5.22.0 in /python/requirements
#51577 opened
Mar 21, 2025 - Bump vllm from 0.7.2 to 0.8.1 in /python
#51603 opened
Mar 21, 2025 - [docs] Feature: adopt llms.txt convention
#51605 opened
Mar 21, 2025 - Bump pytorch-lightning from 1.8.6 to 2.4.0 in /python
#51607 opened
Mar 21, 2025 - Bump flask-cors from 4.0.0 to 4.0.2 in /python
#51609 opened
Mar 21, 2025 - [serve] reorg
#51611 opened
Mar 21, 2025 - Bump torch from 2.0.1 to 2.4.0 in /doc/source/templates/testing/docker/03_serving_stable_diffusion
#51612 opened
Mar 21, 2025 - Bump mlflow from 2.9.2 to 2.20.3 in /python/requirements/ml
#51613 opened
Mar 22, 2025 - Bump gunicorn from 20.1.0 to 23.0.0 in /python
#51615 opened
Mar 22, 2025 - [core] grpc stub manager
#51616 opened
Mar 22, 2025 - correct the error msg for invalid env registering
#51621 opened
Mar 22, 2025 - [core] Large objects release test
#51625 opened
Mar 22, 2025 - [py_modules] Don't install the wheel package if it's already installed
#51629 opened
Mar 23, 2025 - [data] add hive catalog
#51638 opened
Mar 24, 2025 - [RLlib] MetricsLogger + Stats overhaul
#51639 opened
Mar 24, 2025 - [core] split scheduler into smaller targets to improve build performance
#51641 opened
Mar 24, 2025 - [Core][Prototype] Prototype Code for Event Buffer
#51648 opened
Mar 24, 2025 - Unify request cancellation errors to RequestCancelledError
#51650 opened
Mar 24, 2025 - [core] Fix actor reconstruction that depends on plasma object
#51653 opened
Mar 24, 2025 - Add image datasets to ray train benchmark
#51657 opened
Mar 24, 2025 - fix status codes on http proxy
#51658 opened
Mar 25, 2025 - this commit adds use of specific python3.9 version for development se…
#51663 opened
Mar 25, 2025 - [VMware][WCP provider][Part 3/3] Architecture documentation & uts for vsphere wcp provider
#51666 opened
Mar 25, 2025 - [WIP] [core] Force compilation error on variable shadow
#51669 opened
Mar 25, 2025 - update to protbuf-28.2, absl-20240722, grpc-1.67 and patch for windows
#51673 opened
Mar 25, 2025 - Add uv to Docker image
#51675 opened
Mar 25, 2025 - windows dev setup
#51678 opened
Mar 25, 2025 - [docs] Tune toc
#51684 opened
Mar 25, 2025 - [core] add actor labels to export events
#51687 opened
Mar 25, 2025 - [core] Avoid task spec copy for in order actor task submission
#51692 opened
Mar 25, 2025 - Revert "[cg] Move default device logic into channel utils (#51305)"
#51699 opened
Mar 26, 2025 - [core] Add worker process into application cgroup
#51701 opened
Mar 26, 2025 - [core][dashboard-agent] Fail fast if the dashboard agent fails to launch the HTTP server
#51705 opened
Mar 26, 2025 - Update `--labels` and add `--labels-from-file` options for Label Selector API
#51706 opened
Mar 26, 2025 - Add `label_selector` option to remote functions
#51707 opened
Mar 26, 2025 - [Docs][KubeRay] Add guide for writing KubeRay doctests
#51708 opened
Mar 26, 2025 - [Test][KubeRay] Add a deliberate failure test to ensure doctests fail on error
#51709 opened
Mar 26, 2025 - [data] Make sql cursor buffered
#51712 opened
Mar 26, 2025 - [ci] upgrade rayci version
#51713 opened
Mar 26, 2025 - [core] Lazily subscribe to node changes from workers
#51718 opened
Mar 26, 2025 - [Core] Native CPU affinity support for accelerators
#51719 opened
Mar 26, 2025 - [CI] Add ability to configure release tests with a matrix
#51721 opened
Mar 26, 2025 - [Data][LLM] Bump vLLM version to support new models
#51726 opened
Mar 26, 2025 - [WIP]
#51727 opened
Mar 26, 2025 - [train] differentiate between train v1 and v2 export data
#51728 opened
Mar 26, 2025 - [WIP] [core] Log rotation for dashboard
#51729 opened
Mar 27, 2025 - [WIP] [core] Log rotate monitor
#51731 opened
Mar 27, 2025 - [core][cgraph] Fix illegal memory access of cgraph when used in PP
#51734 opened
Mar 27, 2025 - [core] Set actor creation task's `num_returns` to 0 instead of 1
#51735 opened
Mar 27, 2025 - [Chore][Core/Dashboard] Remove TrainHead's dependency on DataOrganizer
#51739 opened
Mar 27, 2025 - [Core] Runtime env working_dir validation #51380
#51741 opened
Mar 27, 2025 - [core] Fix GCS target compilation
#51742 opened
Mar 27, 2025 - [WIP] [core] Fix root variable shadow
#51743 opened
Mar 27, 2025 - [data] support new pyiceberg version
#51744 opened
Mar 27, 2025 - [core] Avoid ray_common as dependency
#51745 opened
Mar 27, 2025 - [Chore][Autoscaler] Clarify disable_launch_config_check comment in StandardAutoscaler
#51751 opened
Mar 27, 2025 - [Test][Dashboard] Add API tests for MetricsHead module
#51752 opened
Mar 27, 2025 - [Test][KubeRay] Add doctest for RayJob Quickstart doc
#51756 opened
Mar 27, 2025 - [ci] Allow dependency failure for kuberay ci kickoff
#51760 opened
Mar 27, 2025 - [core] Remove unnecessary exporter depencencies
#51763 opened
Mar 27, 2025 - [Data] Remove lazy fixture
#51764 opened
Mar 27, 2025
51 Issues closed by 17 people
- CI test linux://python/ray/tests:test_state_api_2 is flaky
#51736 closed
Mar 27, 2025 - CI test linux://python/ray/tests:test_job is flaky
#51738 closed
Mar 27, 2025 - CI test linux://python/ray/tests:test_asyncio_client_mode is flaky
#51659 closed
Mar 27, 2025 - CI test linux://python/ray/tests:test_asyncio is flaky
#51660 closed
Mar 27, 2025 - CI test linux://python/ray/tune:test_train_v2_integration is flaky
#49930 closed
Mar 27, 2025 - CI test linux://doc:doctest[train-gpu][gpu] is consistently_failing
#51740 closed
Mar 27, 2025 - AssertionError: Session name mismatch after restarting Ray cluster with cleanup steps
#51737 closed
Mar 27, 2025 - [Ray debugger] Ray dubbger does not work
#51670 closed
Mar 27, 2025 - CI test linux://python/ray/air:test_resource_manager_placement_group is consistently_failing
#51725 closed
Mar 27, 2025 - 【bug】Ray.data.write_parquet will write twice when use fsspec local filesystem
#49741 closed
Mar 26, 2025 - [data] iter_batches needs streaming operation
#49072 closed
Mar 26, 2025 - [data] Documentation is formatted incorrectly
#48974 closed
Mar 26, 2025 - CI test linux://python/ray/tests:test_array_asan is consistently_failing
#51714 closed
Mar 26, 2025 - CI test windows://python/ray/tests:test_actor_retry2 is flaky
#47415 closed
Mar 26, 2025 - [Ray Core] The node storing the actor will be kill unexpectedly when autoscaler is turned on
#46172 closed
Mar 26, 2025 - [Core] Problems with uv run and remote cluster
#51368 closed
Mar 26, 2025 - [core][autoscaler][v2] do not removing nodes for upcoming resource requests
#51321 closed
Mar 26, 2025 - CI test linux://rllib:learning_tests_multi_agent_pendulum_sac_multi_cpu is flaky
#47264 closed
Mar 26, 2025 - [data] async `flat_map`
#50329 closed
Mar 26, 2025 - [data] Async map_batches return empty result when execution_options.preserve_order = True
#51188 closed
Mar 26, 2025 - CI test linux://python/ray/dashboard:test_node is consistently_failing
#51618 closed
Mar 25, 2025 - CI test linux://python/ray/dashboard:test_dashboard is consistently_failing
#44917 closed
Mar 25, 2025 - CI test darwin://python/ray/tests:test_gcs_fault_tolerance is consistently_failing
#43777 closed
Mar 25, 2025 - CI test darwin://python/ray/tests:test_scheduling_performance is flaky
#44238 closed
Mar 25, 2025 - [Serve] Elastic Autoscaling Based on Cluster Resources with Customizable Scaling Logic
#49151 closed
Mar 25, 2025 - [Data] RandomAccessDataset.multiget return unexpected values for missing keys.
#44768 closed
Mar 25, 2025 - Release test training_ingest_benchmark-task=image_classification.skip_training failed
#51622 closed
Mar 25, 2025 - CI test windows://python/ray/tests:test_actor_client_mode is flaky
#51651 closed
Mar 25, 2025 - CI test linux://python/ray/data:test_huggingface is consistently_failing
#44516 closed
Mar 25, 2025 - [<Ray component: Data] - inconsistent URL handling in Ray's Databricks integration
#49925 closed
Mar 25, 2025 - [client] Documentation Python version behavior
#45339 closed
Mar 24, 2025 - Release test aws_cluster_launcher_minimal failed
#51443 closed
Mar 24, 2025 - Release test aws_cluster_launcher failed
#51437 closed
Mar 24, 2025 - [Serve] Serve no longer retries deployments after 3 failures
#50710 closed
Mar 24, 2025 - [Data/preprocessors] Allow preprocessors to be append operations
#48133 closed
Mar 24, 2025 - CI test darwin://python/ray/tests:test_task_metrics is flaky
#48278 closed
Mar 22, 2025 - [Core] Cover cpplint for `ray/core_worker` (excluding transport)
#51510 closed
Mar 22, 2025 - CI test linux://python/ray/serve/tests/unit:test_pow_2_replica_scheduler is flaky
#48736 closed
Mar 22, 2025 - [core] Place unit test alongside with the implementation
#51152 closed
Mar 21, 2025 - CI test linux://python/ray/data:test_parquet is flaky
#48152 closed
Mar 21, 2025 - CI test linux://python/ray/data:test_metadata_provider is flaky
#51436 closed
Mar 21, 2025 - [Core] API Reference: uv
#51195 closed
Mar 21, 2025 - CI test linux://python/ray/tests:test_object_spilling_2_debug_mode is consistently_failing
#49143 closed
Mar 21, 2025 - [core] Compiled Graphs has a dependence on pyarrow
#51595 closed
Mar 21, 2025 - Ray client connection timeout on ray.init
#51591 closed
Mar 21, 2025 - [Core] Failed to use uv
#51196 closed
Mar 21, 2025
45 Issues opened by 30 people
- [core][dashboard] don't use the first byte to determine whether a chunk succeeds or not in `get_log`
#51762 opened
Mar 27, 2025 - [Ray Serve] request timeout sec for grpc
#51761 opened
Mar 27, 2025 - Release test stress_test_state_api_scale.aws failed
#51759 opened
Mar 27, 2025 - [Data] Found Bug `add_column`
#51758 opened
Mar 27, 2025 - [Ray Data: Preprocessors] Support flattening vector features in concatenator
#51757 opened
Mar 27, 2025 - [Core] `uv sync` fails when running fork of Ray
#51755 opened
Mar 27, 2025 - CI test darwin://python/ray/tests:test_state_api_2 is consistently_failing
#51749 opened
Mar 27, 2025 - [RLLIB] PPO Gradient Removed for Value Estimation
#51748 opened
Mar 27, 2025 - [RLLIB] Offline Training
#51747 opened
Mar 27, 2025 - CI test darwin://python/ray/tests:test_state_api_log is consistently_failing
#51746 opened
Mar 27, 2025 - [<Ray component: Core>,C++] Ray worker process number keep increasing if calling actor from workers
#51711 opened
Mar 26, 2025 - CI test darwin://python/ray/dashboard:test_cli_integration is flaky
#51710 opened
Mar 26, 2025 - Outdated docs
#51703 opened
Mar 26, 2025 - DQN training with 2.44.0 with gymnasium.env and action mask giving error
#51700 opened
Mar 26, 2025 - [core] Ray status doesn't work for all ostream-printable objects
#51695 opened
Mar 25, 2025 - [core][gpu-objects] intra-process communication
#51685 opened
Mar 25, 2025 - AttributeError observed in sample_func when executed from a different python source file
#51679 opened
Mar 25, 2025 - [core/scheduler] replace :ray_common dep with sub-dependencies
#51677 opened
Mar 25, 2025 - [core] Add more compilation options and link options
#51671 opened
Mar 25, 2025 - [core] Ray status doesn't show source location where the error happens
#51667 opened
Mar 25, 2025 - [build] Avoid ODR issues
#51647 opened
Mar 24, 2025 - [Core] Expose `tags` parameter for tasks/actors to be propagated to metrics
#51646 opened
Mar 24, 2025 - [core][gpu-objects] Support streaming to overlap computation / communication
#51643 opened
Mar 24, 2025 - [core] Unify `CoreWorker::Exit` and `CoreWorker::Shutdown`
#51642 opened
Mar 24, 2025 - [core/scheduler] Split giant ray core C++ target into small ones
#51634 opened
Mar 24, 2025 - [Serve] Ray Serve Autoscaling supports the configuration of custom-metrics and policy
#51632 opened
Mar 24, 2025 - RLlib new API stack false deprecation warning / MultiRLModuleSpec
#51630 opened
Mar 23, 2025 - [Core] Unable to build Ray wheel on Windows using Docker due to private image access issues
#51628 opened
Mar 23, 2025 - [RLlib] Incorrect error message for improper registering of custom env
#51620 opened
Mar 22, 2025 - [core] Split giant ray core C++ targets into small ones(plasma store)
#51619 opened
Mar 22, 2025 - [Cluster] Ray job submit/logs sporadically stops following logs
#51601 opened
Mar 21, 2025 - [Ray serve] StopAsyncIteration error thrown by ray when the client cancels the request
#51598 opened
Mar 21, 2025 - [CG, Core] Illegal memory access with Ray 2.44 and vLLM v1 pipeline parallelism
#51596 opened
Mar 21, 2025 - [cgraph] Support function nodes
#51593 opened
Mar 21, 2025 - [Cluster] Add uv to base images
#51592 opened
Mar 21, 2025 - [core] Combine multiple grpc connections into one
#51590 opened
Mar 21, 2025 - [core] Replace opencensus with opentelemetry (C++)
#51589 opened
Mar 21, 2025 - [Cluster] Split up monitor.log
#51586 opened
Mar 21, 2025 - [Cluster] Autoscaler frequently fails to scale down workers
#51585 opened
Mar 21, 2025 - [CG, Core] Add Ascend NPU Support for RCCL and CG
#51574 opened
Mar 21, 2025 - [Core] Ray Label Selector API Implementation Tracker
#51564 opened
Mar 20, 2025
787 Unresolved conversations
Sometimes conversations happen on old items that aren’t yet closed. Here is a list of all the Issues and Pull Requests with unresolved conversations.
- [Core] Making Object Store Fallback Directory Configurable
#51189 commented on
Mar 27, 2025 • 13 new comments - [Compiled Graph] Enhance Compile Graph with Multi-Device Support
#51032 commented on
Mar 27, 2025 • 9 new comments - [data] make random_sample() reproducible
#51401 commented on
Mar 26, 2025 • 8 new comments - [LLM Batch] SGLang engine stage and processor
#51409 commented on
Mar 27, 2025 • 7 new comments - [Core][Bug fix] Trigger local task scheduling after deleting bundle.
#51125 commented on
Mar 26, 2025 • 5 new comments - [core] support dir includes env for working dir
#50066 commented on
Mar 26, 2025 • 5 new comments - [Grafana] Enable `includeAll` for Grafana cluster variable
#51396 commented on
Mar 27, 2025 • 5 new comments - [data] adding snowflake connectors
#51429 commented on
Mar 26, 2025 • 5 new comments - [Core] Cover cpplint for ray/src/ray/common
#51551 commented on
Mar 26, 2025 • 5 new comments - [doc] minor/patch version update
#48626 commented on
Mar 26, 2025 • 2 new comments - [core] Always create a default executor
#51058 commented on
Mar 26, 2025 • 2 new comments - [core][compiled graphs] Support reduce scatter and all gather collective for GPU communicator in compiled graph
#50624 commented on
Mar 26, 2025 • 2 new comments - [Autoscaler][V2] Check IM instance_status before terminating nodes
#50707 commented on
Mar 26, 2025 • 2 new comments - [core] Use cord for sending objects
#51397 commented on
Mar 26, 2025 • 1 new comment - [core] Implement utils class to setup and cleanup cgroup folder
#49941 commented on
Mar 26, 2025 • 1 new comment - [data] Iceberg datasource read with pyiceberg 0.9 fix
#51453 commented on
Mar 26, 2025 • 1 new comment - [Core]Fix the issue of actor tasks hanging during resubmission
#46539 commented on
Mar 27, 2025 • 0 new comments - [RLlib] Deprecate algo config (python) dicts; must be `AlgorithmConfig` objects.
#46896 commented on
Mar 26, 2025 • 0 new comments - [Data] Enable streaming json read
#46550 commented on
Mar 26, 2025 • 0 new comments - Fix mlflow artifact logging
#46570 commented on
Mar 26, 2025 • 0 new comments - [ADAG] Fix DAG input
#46604 commented on
Mar 26, 2025 • 0 new comments - [dashboard] Place the submit job on a separate page
#46613 commented on
Mar 26, 2025 • 0 new comments - [core] GcsPublisher bindings
#47062 commented on
Mar 26, 2025 • 0 new comments - [Core] If possible, force flush the trace when the worker ends.
#46654 commented on
Mar 26, 2025 • 0 new comments - Introducing StaleTaskError
#46705 commented on
Mar 26, 2025 • 0 new comments - Update rllib-env.rst
#46750 commented on
Mar 26, 2025 • 0 new comments - [core][dashboard] Change the StateDataSourceClient from using gRPC stub -> NewGcsClient.
#47056 commented on
Mar 26, 2025 • 0 new comments - [Core]Support Merge code search path from env variable
#46771 commented on
Mar 26, 2025 • 0 new comments - Add docs link to Serve page of Ray Dashboard
#46812 commented on
Mar 26, 2025 • 0 new comments - [POC] A Reactor style GCS. #1: GcsNodeManager
#46891 commented on
Mar 26, 2025 • 0 new comments - Updated LogVirtualView component removed react window
#46835 commented on
Mar 26, 2025 • 0 new comments - [ci][core] GCS FT Chaos test
#46996 commented on
Mar 26, 2025 • 0 new comments - Add generic item support for queue
#46849 commented on
Mar 26, 2025 • 0 new comments - [Dashboard]. Map logic resource data to row node
#46914 commented on
Mar 26, 2025 • 0 new comments - Verification to move PyG data to device
#46839 commented on
Mar 26, 2025 • 0 new comments - Added support for multiple callbacks for GcsSubscriber
#46958 commented on
Mar 26, 2025 • 0 new comments - [RLlib] - `"Synchronized"` sampling for multi-agent buffers.
#46083 commented on
Mar 26, 2025 • 0 new comments - [Data] Make the seed take effect in Dataset.random_sample()
#46088 commented on
Mar 26, 2025 • 0 new comments - Revert "[doc]Make vllm example works with latest vllm version"
#46094 commented on
Mar 26, 2025 • 0 new comments - Add PyFlyt waypoints example to documentation
#46145 commented on
Mar 26, 2025 • 0 new comments - [ADAG] Detect if ADAG is at capacity for execution
#46158 commented on
Mar 26, 2025 • 0 new comments - [Dashboard] Add cleanup of `job_table` in `delete_job`
#46173 commented on
Mar 26, 2025 • 0 new comments - [Dashboard] Add GPU component usage
#46188 commented on
Mar 26, 2025 • 0 new comments - [core] add ray.util.concurrent.futures.RayExecutor
#46249 commented on
Mar 26, 2025 • 0 new comments - [WIP] CI: jemalloc & mimalloc
#46271 commented on
Mar 26, 2025 • 0 new comments - [Dashboard] Display accelerators info on demand, add Huawei Ascend NPU monitoring.
#46287 commented on
Mar 26, 2025 • 0 new comments - [data] fix np.array crash the allocate mem error when souce include short an…
#46298 commented on
Mar 26, 2025 • 0 new comments - Update py_modules.py AttributeError: module has no attribute '__path__'
#46302 commented on
Mar 26, 2025 • 0 new comments - [Doc] Update directory path for installation
#46318 commented on
Mar 26, 2025 • 0 new comments - Enable RAY_DATA_ENABLE_TENSOR_EXTENSION_CASTING environment variable
#46344 commented on
Mar 26, 2025 • 0 new comments - fixed a typo in ValueError message for contains_tensor
#46348 commented on
Mar 26, 2025 • 0 new comments - [Docker] Upgrade base deps docker python env to 3.9.7
#46353 commented on
Mar 26, 2025 • 0 new comments - [test] cpp20
#46380 commented on
Mar 26, 2025 • 0 new comments - [Core] Add ray-start option 'session-name'
#46404 commented on
Mar 26, 2025 • 0 new comments - avoid merge errors when blocks contain different type in DelegatingBl…
#46407 commented on
Mar 26, 2025 • 0 new comments - [Core] Use real CPU count available to a Ray process
#46424 commented on
Mar 26, 2025 • 0 new comments - fix performance bug in arrow to numpy transform
#46433 commented on
Mar 26, 2025 • 0 new comments - [Doc][KubeRay] Add KubeRay image resize example to Ray doc page
#46447 commented on
Mar 26, 2025 • 0 new comments - python/ray/autoscaler/gcp/*.yaml: change scheduling from dict to list
#46500 commented on
Mar 26, 2025 • 0 new comments - [RLlib] Optimize rnn_sequencing performance
#46502 commented on
Mar 26, 2025 • 0 new comments - [autoscaler][aws] Fix replace cloudwatch alarm config
#46537 commented on
Mar 26, 2025 • 0 new comments - Add Runhouse to Ecosystem
#47150 commented on
Mar 26, 2025 • 0 new comments - [RLlib|New API|Inconsistency] LSTM Encoder lacks the output Linear, but stated in the docstring (#47625)
#47626 commented on
Mar 26, 2025 • 0 new comments - [ADAG]Enable NPU (hccl) communication for CG
#47658 commented on
Mar 26, 2025 • 0 new comments - [wip] revive zero copy torch tensor serialization
#47665 commented on
Mar 26, 2025 • 0 new comments - [Data] Fix parallelism deriving heuristic to ensure parallelism stays w/in min/max bounds
#47695 commented on
Mar 26, 2025 • 0 new comments - [RLlib] PPO enhancement: Send samples as refs to n Learners (speedup for multi-node/multi-GPU learning).
#47707 commented on
Mar 26, 2025 • 0 new comments - Convert export events to proto and flush from background thread
#47713 commented on
Mar 26, 2025 • 0 new comments - [core]Make GCS InternalKV workload configurable to the Policy.
#47736 commented on
Mar 26, 2025 • 0 new comments - [Docs][hotfix] Correct the desc of nums of blocks
#47741 commented on
Mar 26, 2025 • 0 new comments - [core][aDAG] Fix cpu tensor is automatically converted to gpu tensor
#47742 commented on
Mar 26, 2025 • 0 new comments - Getinternalconfig and ioctx
#47756 commented on
Mar 26, 2025 • 0 new comments - [Docs] Update map_reduce.ipynb chunk_size
#47766 commented on
Mar 26, 2025 • 0 new comments - [RayCluster] Introduce how to run ray remote job with ray client (#47…
#47771 commented on
Mar 26, 2025 • 0 new comments - [RLlib] Fix MeanStdFilter: Do not accumulate `num_pushes` for RunningStats when merging connector states.
#47794 commented on
Mar 26, 2025 • 0 new comments - Update vllm_openai_example.py for compatibility with latest vllm
#47835 commented on
Mar 26, 2025 • 0 new comments - Add new serve autoscaling parameter `scaling_function`
#47837 commented on
Mar 26, 2025 • 0 new comments - [observability][export-api] Write TrainRun events
#47888 commented on
Mar 26, 2025 • 0 new comments - Adding image_uri to docstring of runtime_env
#47905 commented on
Mar 26, 2025 • 0 new comments - [Docs] Update Volcano Integration with The New Flag
#47911 commented on
Mar 26, 2025 • 0 new comments - [ADAG] Fix none output.
#47918 commented on
Mar 26, 2025 • 0 new comments - [RLlib; Offline RL] Enable GPU and multi-GPU training for offline algorithms.
#47929 commented on
Mar 26, 2025 • 0 new comments - [Azure][Cluster] Check tags when provided
#47941 commented on
Mar 26, 2025 • 0 new comments - [autoscalerv2] use replicas in workerGroupSpecs as current workers number when initialize scale request to fix scale up target is wrong
#47967 commented on
Mar 26, 2025 • 0 new comments - (WIP) [ADAG] Support dag.experimental_compile(_custom_nccl_group= nccl_group) in aDAG
#47987 commented on
Mar 26, 2025 • 0 new comments - [data] add backpressure reason
#48009 commented on
Mar 26, 2025 • 0 new comments - [Jobs] Making sure `JobManager` retries `JobSupervisor.ping` before declaring job as failed
#47166 commented on
Mar 26, 2025 • 0 new comments - [core][dashboard] Make updates to DataSource.(node_workers|core_worker_stats) on delta.
#47186 commented on
Mar 26, 2025 • 0 new comments - add stop and delete button for jobs that are of submission type
#47189 commented on
Mar 26, 2025 • 0 new comments - [core] Make preloading Jemalloc configurable for worker
#47243 commented on
Mar 26, 2025 • 0 new comments - Add tensorflow support to numpy_to_tensor connector
#47246 commented on
Mar 26, 2025 • 0 new comments - [bazel] move python rules up
#47260 commented on
Mar 26, 2025 • 0 new comments - [core] Decouple create worker vs pop worker request.
#47268 commented on
Mar 26, 2025 • 0 new comments - [data] Fixed pyarrow error when the writer receives empty table
#47270 commented on
Mar 26, 2025 • 0 new comments - [RLlib; docs] New API stack docs: Add `ConnectorV2` documentation
#47278 commented on
Mar 26, 2025 • 0 new comments - idempotent replies by seq_no for sequential actors.
#47314 commented on
Mar 26, 2025 • 0 new comments - [Core][aDAG] Remove busy waiting semaphore acquire in linux
#47322 commented on
Mar 26, 2025 • 0 new comments - [todo] Migrate redis kv get sync
#47348 commented on
Mar 26, 2025 • 0 new comments - Remove unnecessary string literal splits
#47360 commented on
Mar 26, 2025 • 0 new comments - Return multiple best trials
#47381 commented on
Mar 26, 2025 • 0 new comments - [PoC] Dashboard with Heads as Actors.
#47414 commented on
Mar 26, 2025 • 0 new comments - [Core] Refine accelerator resource assessment for better node selection
#47443 commented on
Mar 26, 2025 • 0 new comments - [core][dashboard] make a flamegraph on event loop lag.
#47491 commented on
Mar 26, 2025 • 0 new comments - Improvements and Artificial Intelligence-based Improvements for Ray Cross-Language Functionality Testing
#47499 commented on
Mar 26, 2025 • 0 new comments - Enhancements to Ray Cross-Language Testing Script: Automated Error Detection, Data Input Checking, and System Efficiency Enhancement using Artificial Intelligence
#47558 commented on
Mar 26, 2025 • 0 new comments - [RLlib] add `SingleAgentRLModuleSpec` alias to `RLModuleSpec`
#47560 commented on
Mar 26, 2025 • 0 new comments - [Doc][KubeRay] Document Fields that Will Not Trigger Downtime in RayService
#47561 commented on
Mar 26, 2025 • 0 new comments - uint8_t* data ptr not used.
#47565 commented on
Mar 26, 2025 • 0 new comments - [Do not merge] Run release tests for export API
#47568 commented on
Mar 26, 2025 • 0 new comments - [Core][StreamingGenerator] Fix ray.get streaming object hang after node dead.
#47583 commented on
Mar 26, 2025 • 0 new comments - [RLLib] Fix action masking example
#44565 commented on
Mar 26, 2025 • 0 new comments - [RPC] Added appropriate keep-alive configuration for Ray's internal RPCs
#44612 commented on
Mar 26, 2025 • 0 new comments - [RLlib] Fix invalid call of action_sampler_fn to support Keras 3
#44700 commented on
Mar 26, 2025 • 0 new comments - Update prometheus-grafana.md and add grafana support allowed_origins
#44701 commented on
Mar 26, 2025 • 0 new comments - Deflake test_threaded_actor
#44709 commented on
Mar 26, 2025 • 0 new comments - WIP: Futex
#44724 commented on
Mar 26, 2025 • 0 new comments - [RLlib] Make RLlib learner support custom resources
#44732 commented on
Mar 26, 2025 • 0 new comments - [core] object store data transfer zstd
#44755 commented on
Mar 26, 2025 • 0 new comments - [DO NOT SUBMIT] debug sleep macos
#44759 commented on
Mar 26, 2025 • 0 new comments - [core] Add 5s timeout to the log and err subscriber polls.
#44761 commented on
Mar 26, 2025 • 0 new comments - [Core] Profile Ray start
#44818 commented on
Mar 26, 2025 • 0 new comments - Add Ray train dashboard head module with mock data
#44819 commented on
Mar 27, 2025 • 0 new comments - [WIP] Fixes the streaming generator hang on conn break
#44838 commented on
Mar 26, 2025 • 0 new comments - [DO NOT SUBMIT] Pr 44234
#44839 commented on
Mar 26, 2025 • 0 new comments - upgrade node to v20, latest LTS
#44860 commented on
Mar 26, 2025 • 0 new comments - [serve] update long running release tests
#44915 commented on
Mar 26, 2025 • 0 new comments - [ci][core] Add -flto and -fwhole-program-vtables
#44919 commented on
Mar 26, 2025 • 0 new comments - [core] add ray.util.concurrent.futures.RayExecutor
#44922 commented on
Mar 26, 2025 • 0 new comments - Debug tune repro
#44936 commented on
Mar 26, 2025 • 0 new comments - [ci] Adds prometheus latencies for ray_dashboard_api_requests_duration_seconds_bucket for the Ray Core Tests.
#44944 commented on
Mar 26, 2025 • 0 new comments - [WIP] Experimenting with streaming
#44959 commented on
Mar 26, 2025 • 0 new comments - [proof-of-concept][dashboard] One event loop per module
#44964 commented on
Mar 26, 2025 • 0 new comments - [Data] Support task reassignment in actor_pool_map_operator to improv…
#44968 commented on
Mar 26, 2025 • 0 new comments - [Data] Allow configuration of MAX_IMAGE_PIXELS in ImageDatasource
#45415 commented on
Mar 26, 2025 • 0 new comments - Windows python can not open file default_worker.py path with space
#33047 commented on
Mar 24, 2025 • 0 new comments - [Hack] Hack the pickle to make relpath in SCRIPT_MODE working dir
#43804 commented on
Mar 26, 2025 • 0 new comments - [WIP] In Driver CoreWorkerProcess, shutdown with the Exit method
#43833 commented on
Mar 26, 2025 • 0 new comments - [flakey] Deflakey `darwin://python/ray/tests:test_gcs_fault_tolerance`
#43922 commented on
Mar 26, 2025 • 0 new comments - [Core] Remove external storage upon sigterm for ray start
#43941 commented on
Mar 26, 2025 • 0 new comments - remove flaky marker from test
#44033 commented on
Mar 26, 2025 • 0 new comments - Fix for issue #43411 (BaseException error)
#44038 commented on
Mar 26, 2025 • 0 new comments - [misc] Reformat train/tune BUILD files
#44151 commented on
Mar 26, 2025 • 0 new comments - [misc] Reformat RLLib BUILD files
#44153 commented on
Mar 26, 2025 • 0 new comments - [Jobs] [Dashboard] Changing cluster address resolution in get_address_for_submission_client
#44186 commented on
Mar 26, 2025 • 0 new comments - [gRPC] Adding retry policies for all gRPC clients
#44234 commented on
Mar 26, 2025 • 0 new comments - Ray IPv6 support
#44252 commented on
Mar 26, 2025 • 0 new comments - [CI] Update kind version if it doesn't match pinned version
#44268 commented on
Mar 26, 2025 • 0 new comments - Debug reference_count
#44271 commented on
Mar 26, 2025 • 0 new comments - Deflake test
#44333 commented on
Mar 26, 2025 • 0 new comments - Retry on stream rpc lost
#44358 commented on
Mar 26, 2025 • 0 new comments - test
#44377 commented on
Mar 26, 2025 • 0 new comments - [WIP] Offload execution of sync methods to event-loop's default executor
#44406 commented on
Mar 26, 2025 • 0 new comments - change naming to intel gaudi habana for ray train example
#44412 commented on
Mar 26, 2025 • 0 new comments - [RLlib] Cleanup `examples` folder 02: Add shared value function example script for MultiAgentRLModule.
#44421 commented on
Mar 26, 2025 • 0 new comments - Deflake test with long sleep
#44433 commented on
Mar 26, 2025 • 0 new comments - Update bert.ipynb
#44455 commented on
Mar 26, 2025 • 0 new comments - Remove SimpleImageViewer from EnvRunnerV2
#44466 commented on
Mar 26, 2025 • 0 new comments - [data] add better support for list-typed fields when using `write_bigquery`
#44564 commented on
Mar 26, 2025 • 0 new comments - blind try on ubuntu upgrade ..
#45427 commented on
Mar 26, 2025 • 0 new comments - [RLlib] DreamerV3 on PyTorch.
#45463 commented on
Mar 26, 2025 • 0 new comments - [train] Update Torch default timeout_s to use Torch's default timeout
#45501 commented on
Mar 26, 2025 • 0 new comments - Create a singleton io context and thread, and standalone gcs client on it.
#45524 commented on
Mar 26, 2025 • 0 new comments - [RLlib] fix VTrace in impala_tf_policy to support Keras 3
#45562 commented on
Mar 26, 2025 • 0 new comments - [WIP] [Core] Support of reading Working dir from HUAWEI Object Storage Service (OBS)
#45577 commented on
Mar 26, 2025 • 0 new comments - [core] Eagerly kill idle workers on job finish.
#45633 commented on
Mar 26, 2025 • 0 new comments - [core][1/2] Add SubscribeAllActors to GcsClient.
#45637 commented on
Mar 26, 2025 • 0 new comments - [core][2/2] Kill worker on root detached actor died.
#45638 commented on
Mar 26, 2025 • 0 new comments - [WIP][Jobs] Revisit Job Agent to run Job Supervisors in-process
#45664 commented on
Mar 26, 2025 • 0 new comments - [Core] Add warning when uploading large working dirs
#45818 commented on
Mar 26, 2025 • 0 new comments - [WIP] Benchmark data shuffle
#45847 commented on
Mar 26, 2025 • 0 new comments - Improve code snippet in docs to set up `ray[serve]` gRPC service
#45862 commented on
Mar 26, 2025 • 0 new comments - MADDPG framework should be TensorFlow
#45863 commented on
Mar 26, 2025 • 0 new comments - Enable setting OS disk size in Azure
#45867 commented on
Mar 26, 2025 • 0 new comments - Adds new working dir upload protocol PLASMA, and use it in job submission.
#45880 commented on
Mar 26, 2025 • 0 new comments - [spark] Fix nvidia-smi hanging issue
#45896 commented on
Mar 26, 2025 • 0 new comments - Fix ax_client.create_experiment call
#45902 commented on
Mar 26, 2025 • 0 new comments - Fix malformed `temp_dir` path when connecting Windows workers to cluster with Linux head
#45930 commented on
Mar 26, 2025 • 0 new comments - [URL] Change the absolute path to a relative path to solve the ingres…
#45933 commented on
Mar 26, 2025 • 0 new comments - [Data] Remove gaps between tasks in ray data.
#45935 commented on
Mar 26, 2025 • 0 new comments - [Serve] Group `DeploymentHandle` autoscaling metrics pushes by process
#45957 commented on
Mar 26, 2025 • 0 new comments - enable easy logging of images to tensorboard
#46068 commented on
Mar 26, 2025 • 0 new comments - add more execution and iteration metrics to prometheus
#44971 commented on
Mar 26, 2025 • 0 new comments - [WIP] poc / hack relpath
#45003 commented on
Mar 26, 2025 • 0 new comments - [WIP] add env var to enable debug
#45009 commented on
Mar 26, 2025 • 0 new comments - RuntimeContext support get actor namespace
#45025 commented on
Mar 26, 2025 • 0 new comments - Add roundtrip (ping-pong) microbenchmarks for accelerated DAG channels
#45064 commented on
Mar 26, 2025 • 0 new comments - [wip][train][tune] handle s3fs permissions
#45100 commented on
Mar 26, 2025 • 0 new comments - [Dashboard] Revisited `JobManager` log fetching infra to avoid blocking the event-loop
#45117 commented on
Mar 26, 2025 • 0 new comments - [Jobs] Revisit Ray Job execution and monitoring
#45120 commented on
Mar 26, 2025 • 0 new comments - [RLlib; Tune] Fix default behavior of default tune `CLIReporter` (based on `Algorithm._progress_metrics`).
#45122 commented on
Mar 26, 2025 • 0 new comments - [RLlib] Fix async (multiprocessing) gymnasium vector envs in `SingleAgentEnvRunner`.
#45144 commented on
Mar 26, 2025 • 0 new comments - [RFC] Splitted Dashboard Heads.
#45175 commented on
Mar 26, 2025 • 0 new comments - Add descriptive error message when deployment name not found
#45181 commented on
Mar 26, 2025 • 0 new comments - [dashboard] Removes ray.rpc.ReportEventService and Dashboard head as gRPC server.
#45219 commented on
Mar 26, 2025 • 0 new comments - [Core] Improve logging during accelerator auto-detection
#45240 commented on
Mar 26, 2025 • 0 new comments - [core] Change all object_size to uint64_t and use 0 for unknown. Also adds a method `ray.experimental.get_local_object_locations`
#45247 commented on
Mar 26, 2025 • 0 new comments - grid_search resolution code optimization
#45267 commented on
Mar 26, 2025 • 0 new comments - [POC][core] GcsClient async binding, aka remove PythonGcsClient.
#45289 commented on
Mar 26, 2025 • 0 new comments - [Data] add reset pandas index when merge sorted blocks
#45326 commented on
Mar 26, 2025 • 0 new comments - add links to eks site for neuron examples
#45341 commented on
Mar 26, 2025 • 0 new comments - Modify Spark on Ray to support Pex and other virtualenvs + direct scr…
#45354 commented on
Mar 26, 2025 • 0 new comments - [WIP] Reopen cpp test on mac
#45374 commented on
Mar 26, 2025 • 0 new comments - [RLlib] Enhance callbacks test case for EnvRunners; Add (optional) explicit `enable_multi_agent` setting to AlgorithmConfig.
#45385 commented on
Mar 26, 2025 • 0 new comments - [serve] allow build_serve_application to happen in parallel
#45394 commented on
Mar 26, 2025 • 0 new comments - add ray debugger references to ray docs
#45414 commented on
Mar 26, 2025 • 0 new comments - [dashboard] Actor and node head
#50159 commented on
Mar 26, 2025 • 0 new comments - fix: WandbLogger crashing silently on a FileNotFoundError
#50308 commented on
Mar 26, 2025 • 0 new comments - [Dashboard] Hide GPU and GRAM columns from clusters and actors table if there are 0 rows with GPUs
#50338 commented on
Mar 26, 2025 • 0 new comments - [core][1/N] Set gRPC deadline to ReportOCMetrics RPC
#50370 commented on
Mar 26, 2025 • 0 new comments - [data] add ClickHouse sink
#50377 commented on
Mar 27, 2025 • 0 new comments - [core] Move `overload remote` for actors
#50412 commented on
Mar 26, 2025 • 0 new comments - [Autoscaler][V2] Use running node instances to rate-limit upscaling
#50414 commented on
Mar 26, 2025 • 0 new comments - [tune] Remove loguniform's base
#50415 commented on
Mar 26, 2025 • 0 new comments - [core][cgraph] Support individual submit_timeout
#50424 commented on
Mar 26, 2025 • 0 new comments - [core] add RAY_IGNORE_VERSION_MISMATCH when ray start --address
#50513 commented on
Mar 26, 2025 • 0 new comments - Revert "[core][cgraph] Rework DagRef Destruction (#49818)"
#50529 commented on
Mar 26, 2025 • 0 new comments - [RLlib] Enable splitting and zero padding of Dict observation
#50589 commented on
Mar 26, 2025 • 0 new comments - [Core] Split stats_metric into smaller targets to improve build performance
#50595 commented on
Mar 27, 2025 • 0 new comments - [chore] Delete unused build.sh
#50649 commented on
Mar 26, 2025 • 0 new comments - [doc][core] Fix ray generator code example
#50655 commented on
Mar 26, 2025 • 0 new comments - [WIP / try out] Use UV for Python 3.13 tests
#50669 commented on
Mar 26, 2025 • 0 new comments - [core] Cover cpplint for ray/src/ray/stats
#50678 commented on
Mar 26, 2025 • 0 new comments - Move `pydantic_compat` from `_private` to `_common`
#50683 commented on
Mar 26, 2025 • 0 new comments - [core] [wip attempt] StatusOr union construction sometimes breaks windows build
#50761 commented on
Mar 26, 2025 • 0 new comments - [WIP] Ray Collective Communication Lib Support HCCL Backend
#50790 commented on
Mar 26, 2025 • 0 new comments - [docs] add missing step to install KubeRay in gke-gcs-bucket.md
#50811 commented on
Mar 26, 2025 • 0 new comments - [WIP] Rebasing materialized dataset on iterator back-pressure is active upon materialization
#50880 commented on
Mar 26, 2025 • 0 new comments - [CI] Enable pretty-format-java pre-commit hook
#50957 commented on
Mar 26, 2025 • 0 new comments - Bump @babel/helpers from 7.19.4 to 7.26.10 in /python/ray/dashboard/client
#51268 commented on
Mar 26, 2025 • 0 new comments - [Tune] Fix pbt restore in synch mode
#48616 commented on
Mar 26, 2025 • 0 new comments - [autoscaler] Fix potential dead lock in local provider
#49909 commented on
Mar 26, 2025 • 0 new comments - [RFC][dashboard] Use aiohttp client for inter dependencies.
#49932 commented on
Mar 26, 2025 • 0 new comments - [core] minor optimization for JoinPaths
#49946 commented on
Mar 26, 2025 • 0 new comments - adding distributional critic example
#49949 commented on
Mar 26, 2025 • 0 new comments - [RLlib; Offline] - Add single learner gpu training with preloading in `OfflinePreLearner`.
#49960 commented on
Mar 26, 2025 • 0 new comments - Explicit comm
#49979 commented on
Mar 26, 2025 • 0 new comments - Add Semi-Random Weighting to AutoScaler Node Scheduler
#49983 commented on
Mar 26, 2025 • 0 new comments - [kuberay] fix deserialisation of custom resources in autoscaler config
#49993 commented on
Mar 26, 2025 • 0 new comments - [dashboard] Remove the dashboard grpc server.
#50021 commented on
Mar 26, 2025 • 0 new comments - [core] Thread-safe gcs node manager
#50024 commented on
Mar 26, 2025 • 0 new comments - [Core][Doc] Add support for Cambricon MLU
#50026 commented on
Mar 26, 2025 • 0 new comments - [Train] Add Cambricon MLU support to Ray Train
#50028 commented on
Mar 26, 2025 • 0 new comments - [WIP] Move execution loop to the same thread as the constructor of an actor
#50032 commented on
Mar 26, 2025 • 0 new comments - [RLlib] LearnerConnector pipeline speedup.
#50035 commented on
Mar 26, 2025 • 0 new comments - Add Cloud Logging example for Ray on GKE
#50060 commented on
Mar 26, 2025 • 0 new comments - Update multi-agent-envs.rst
#50075 commented on
Mar 26, 2025 • 0 new comments - [RLlib] Make `config.episodes_to_numpy` False by default.
#50077 commented on
Mar 26, 2025 • 0 new comments - tsan
#50105 commented on
Mar 26, 2025 • 0 new comments - [dashboard] Remove DataSource.nodes listeners in StateHead with get_all_node_info.
#50122 commented on
Mar 26, 2025 • 0 new comments - [core][collective] Avoid creation of `gloo_queue` in race condition
#50132 commented on
Mar 26, 2025 • 0 new comments - [dashboard] Move record_dashboard_metrics from MetricsHead to DashboardHead, remove .metrics property and convert MetricsHead.
#50133 commented on
Mar 26, 2025 • 0 new comments - [dashboard] Use cloudpickle to pickle SubprocessModule classes, and convert ServeHead.
#50153 commented on
Mar 26, 2025 • 0 new comments - [RLlib] Disable callbacks callable check for new api stack
#50157 commented on
Mar 26, 2025 • 0 new comments - [doc][kuberay]: add `kubectl ray get node` example
#51271 commented on
Mar 26, 2025 • 0 new comments - [core] unblocking macos tests by pinning aiohappyeyeballs to version 2.4.8
#51288 commented on
Mar 26, 2025 • 0 new comments - [Dashboard] Fix timezones to deal with daylight savings
#51314 commented on
Mar 26, 2025 • 0 new comments - [CI] Replace `black` with `ruff format`
#51332 commented on
Mar 26, 2025 • 0 new comments - Deflake
#51338 commented on
Mar 26, 2025 • 0 new comments - [RLlib] - Add state syncing to EnvRunner sample call in APPO.
#51343 commented on
Mar 26, 2025 • 0 new comments - [Debugger] Random pick ray debugger port from range
#51344 commented on
Mar 26, 2025 • 0 new comments - [Dashboard] Support reporting AMD GPU usage
#51345 commented on
Mar 27, 2025 • 0 new comments - [RLlib] Throw better error if catalog can not be created for Atari environments, help atari users
#51371 commented on
Mar 26, 2025 • 0 new comments - [core] Create small objects release test
#51382 commented on
Mar 26, 2025 • 0 new comments - [core] Remove the unnecessary key `ActorID` of `concurrency_groups_cache_` in TaskReceiver
#51403 commented on
Mar 26, 2025 • 0 new comments - [data] Implement Spark-like accumulators for Ray Data
#51404 commented on
Mar 26, 2025 • 0 new comments - Upgrading Arrow dependency to latest stable version
#51440 commented on
Mar 26, 2025 • 0 new comments - expose ObjectRef from DeploymentResponse
#51444 commented on
Mar 26, 2025 • 0 new comments - [Data] Add environment variable support for Ray Data execution callbacks.
#51449 commented on
Mar 26, 2025 • 0 new comments - [serve] move serve image_uri tests to serve CI
#51451 commented on
Mar 26, 2025 • 0 new comments - [WIP][core][gpu-objects] CollectiveGroupManager
#51460 commented on
Mar 26, 2025 • 0 new comments - Unify `_private/log.py` and `_private/ray_logging`
#51461 commented on
Mar 26, 2025 • 0 new comments - [core] upgrading macos CI python 3.9 -> 3.9.2 to enable numpy serialization warnings
#51462 commented on
Mar 26, 2025 • 0 new comments - [ray.serve.llm] Support vLLM v1
#51490 commented on
Mar 26, 2025 • 0 new comments - [RLlib|Tune|Train] ValueError: Could not recover from checkpoint as it does not exist anymore
#51515 commented on
Mar 26, 2025 • 0 new comments - Avoid len(), which causes static batch sizes on export.
#51520 commented on
Mar 26, 2025 • 0 new comments - Add perf metrics for 2.44.1
#51535 commented on
Mar 26, 2025 • 0 new comments - [wip] add object detection notebooks
#50965 commented on
Mar 27, 2025 • 0 new comments - [WIP] Try upgrade cython to 3.1
#50972 commented on
Mar 26, 2025 • 0 new comments - fix restore BUG "RuntimeError: Expected scalars to be on CPU, got cud…
#50983 commented on
Mar 26, 2025 • 0 new comments - Fix editorconfig option name
#50993 commented on
Mar 26, 2025 • 0 new comments - Suppress type error
#50994 commented on
Mar 26, 2025 • 0 new comments - Improvements to General Debugging guide
#51004 commented on
Mar 26, 2025 • 0 new comments - [RLlib] Schedule `AggregatorActors` via `PlacementGroupSchedulingStrategy` into Learner bundles.
#51017 commented on
Mar 26, 2025 • 0 new comments - [doc] add jax example
#51040 commented on
Mar 26, 2025 • 0 new comments - [WIP][core][compiled graphs] Supporting allreduce on tuple of tensors
#51047 commented on
Mar 26, 2025 • 0 new comments - [Misc]cupy.cuda.nccl.get_unique_id() generic modification.
#51052 commented on
Mar 26, 2025 • 0 new comments - [Refactor]Rename NCCL-related items to comm_backend
#51061 commented on
Mar 26, 2025 • 0 new comments - [Docs] Update docs to reflect CPU requests/limits change in KubeRay v1.3
#51072 commented on
Mar 26, 2025 • 0 new comments - [data] Make Dataset.name/set_name public
#51076 commented on
Mar 26, 2025 • 0 new comments - [DONOTMERGE] POC for Ray+torch.distributed
#51078 commented on
Mar 26, 2025 • 0 new comments - Fix the grammar of the OOM killer error messages
#51081 commented on
Mar 26, 2025 • 0 new comments - [do not merge] Add Daft to the Ray ecosystem page
#51133 commented on
Mar 26, 2025 • 0 new comments - Reproducing MacOS x86_64 Test Failure w/ Custom Numpy Serializer for ndarrays
#51143 commented on
Mar 26, 2025 • 0 new comments - [core] Implement a universal printer
#51151 commented on
Mar 26, 2025 • 0 new comments - [Do Not Merge] Update the Test Script to Debug test_network_failure_e2e Flaky Test
#51153 commented on
Mar 26, 2025 • 0 new comments - Bump axios from 0.21.4 to 1.8.2 in /python/ray/dashboard/client
#51162 commented on
Mar 26, 2025 • 0 new comments - Bump jinja2 from 3.1.3 to 3.1.6 in /release
#51216 commented on
Mar 26, 2025 • 0 new comments - Bump keras from 2.15.0 to 3.9.0 in /python
#51256 commented on
Mar 26, 2025 • 0 new comments - [Train V2] Fold `v2.LightGBMTrainer` API into the public trainer class as an alternate constructor
#51265 commented on
Mar 26, 2025 • 0 new comments - [RLlib] Add placement strategy to `EnvRunner` creation.
#51267 commented on
Mar 26, 2025 • 0 new comments - [WIP] Adding `transform` utility to `Operator`
#48620 commented on
Mar 26, 2025 • 0 new comments - (WIP) [core][compiled graphs] Unify code paths for NCCL P2P and collectives scheduling
#48649 commented on
Mar 26, 2025 • 0 new comments - [core][compiled graphs] Inter-execution overlap
#48659 commented on
Mar 26, 2025 • 0 new comments - [Core]: Fix ConnectionError on Autoscaler CR lookups in K8s clusters …
#48675 commented on
Mar 26, 2025 • 0 new comments - [Tune] Add OSS Vizier to Ray Tune
#48684 commented on
Mar 26, 2025 • 0 new comments - [core] Fix building Ray against modern Protobuf versions
#48724 commented on
Mar 26, 2025 • 0 new comments - [core] Enable STDERR custom formatting
#48742 commented on
Mar 26, 2025 • 0 new comments - docs: update ray tune section
#48769 commented on
Mar 26, 2025 • 0 new comments - docs: update ray serve section
#48770 commented on
Mar 26, 2025 • 0 new comments - removed limit on log sizes via sockets
#48780 commented on
Mar 26, 2025 • 0 new comments - [Fix][GCS] Implement reconnection for RedisContext
#48781 commented on
Mar 26, 2025 • 0 new comments - [core][autoscaler]Reset the failure count to avoid RayCluster aborting unexpectedly
#48797 commented on
Mar 26, 2025 • 0 new comments - [Data] Cleaned up & streamlined boundary sampling sequence to avoid conversion from Numpy to Python objects
#48825 commented on
Mar 26, 2025 • 0 new comments - [Build][Deps] Add new `ray[azure]` extra package
#48847 commented on
Mar 26, 2025 • 0 new comments - [core] cpp lint of object_manager
#48878 commented on
Mar 26, 2025 • 0 new comments - [Autoscaler][Placement Group] Skip placed bundle when requesting resource
#48924 commented on
Mar 26, 2025 • 0 new comments - [train] Make dataset argument covariant
#48999 commented on
Mar 26, 2025 • 0 new comments - [core] Lint cpp files in common
#49002 commented on
Mar 26, 2025 • 0 new comments - Slo track
#49007 commented on
Mar 26, 2025 • 0 new comments - support to clean worker table with maximum_gcs_dead_worker_cached_count
#49030 commented on
Mar 26, 2025 • 0 new comments - [Jobs] Add metric to track duration of jobs
#49035 commented on
Mar 26, 2025 • 0 new comments - [WIP][compiled graphs] Avoid extra data I/O if CPU data is static
#49042 commented on
Mar 26, 2025 • 0 new comments - [data] fix random_sample return different data in fixed seed
#49443 commented on
Mar 26, 2025 • 0 new comments - :bug: do not modify user-provided runtime_env
#48021 commented on
Mar 26, 2025 • 0 new comments - [RLlib; Offline RL] - Enable gpu inference on data workers.
#48041 commented on
Mar 26, 2025 • 0 new comments - [WIP][core] C++20 upgrade
#48044 commented on
Mar 26, 2025 • 0 new comments - [core] Add metrics for Task RSS HWM.
#48052 commented on
Mar 26, 2025 • 0 new comments - [gRPC] Fixing gRPC Server Call to be instantiated immediately for unbounded handlers
#48057 commented on
Mar 26, 2025 • 0 new comments - [data] preprocessor: use map_batches in MaxAbsScaler, MinMaxScaler, UniformKBinsDiscretizer
#48097 commented on
Mar 26, 2025 • 0 new comments - [Data] Fix a test that checks the "eliminate_build_output_blocks" optimization
#48119 commented on
Mar 26, 2025 • 0 new comments - [Data] Fix a bug in the ReorderRandomizeBlocksRule optimization rule
#48258 commented on
Mar 26, 2025 • 0 new comments - Docs deprecate workflow
#48261 commented on
Mar 26, 2025 • 0 new comments - Add kuberay operator addon to cmd in gke-gcs-bucket.md
#48268 commented on
Mar 26, 2025 • 0 new comments - [doc] Remove unused/unmaintained `doc/source/templates` folder
#48295 commented on
Mar 26, 2025 • 0 new comments - [doc] fix: Typo and missing import in doc
#48311 commented on
Mar 26, 2025 • 0 new comments - Fix invalid type for progress_reporter parameter of RunConfig
#48439 commented on
Mar 26, 2025 • 0 new comments - [Data] Fix block accessors' combine handling of duplicate columns
#48495 commented on
Mar 26, 2025 • 0 new comments - [RLlib] Fix remote env runner request spam
#48499 commented on
Mar 26, 2025 • 0 new comments - [runtime env]: Integrating Omnitrace to Ray worker process
#48525 commented on
Mar 26, 2025 • 0 new comments - [RLlib] Fix Algorithm with tune
#48529 commented on
Mar 26, 2025 • 0 new comments - Add Adaptive Scaling Feature for Distributed Task Scheduling
#48537 commented on
Mar 26, 2025 • 0 new comments - Overlap dynamic
#48545 commented on
Mar 26, 2025 • 0 new comments - [data] add opensearch datasource
#48555 commented on
Mar 26, 2025 • 0 new comments - [Docs][Collective] Fix examples to use init_collective_group and create_collective_group
#48570 commented on
Mar 26, 2025 • 0 new comments - [core] Introduces Postable for InternalKVInterface.
#48584 commented on
Mar 26, 2025 • 0 new comments - Dag bind order execution fix
#48603 commented on
Mar 26, 2025 • 0 new comments - [Core][Compiled Graph] Execute DAG on Actor's Main Thread
#48608 commented on
Mar 26, 2025 • 0 new comments - [Core] Persist the Driver Console Log When Job Execution Not Through Job API
#49452 commented on
Mar 26, 2025 • 0 new comments - [core][cgraph] Use threadpool and one io_context for mutable object provider
#49500 commented on
Mar 26, 2025 • 0 new comments - [train] add test for ScalingConfigV2 import
#49515 commented on
Mar 26, 2025 • 0 new comments - [ci] Remove redundant ML doctests from running in unit test pipelines
#49516 commented on
Mar 26, 2025 • 0 new comments - Overlap check deps
#49520 commented on
Mar 26, 2025 • 0 new comments - [RLlib] Add NPU and HPU support to RLlib
#49535 commented on
Mar 26, 2025 • 0 new comments - [core][cgraph] Use cv instead of busy wait for next version
#49542 commented on
Mar 26, 2025 • 0 new comments - [core] Minor improvements to core worker get
#49567 commented on
Mar 26, 2025 • 0 new comments - [core] Don't get dashboard address after each dashboard connection failure
#49584 commented on
Mar 26, 2025 • 0 new comments - [Core] Streaming generator supports num_returns
#49586 commented on
Mar 26, 2025 • 0 new comments - [RLlib; docs] Docs do-over (new API stack): New `debugging.rst` page.
#49592 commented on
Mar 26, 2025 • 0 new comments - [Dashboard] Support multiple accelerator monitoring and flexible display
#49610 commented on
Mar 26, 2025 • 0 new comments - [core][docs] Lint some top level core docs
#49703 commented on
Mar 26, 2025 • 0 new comments - [Core] Add virtual cluster
#49717 commented on
Mar 26, 2025 • 0 new comments - [RLlib] Fix broken stats accumulation for 'MeanStdFilter' connector.
#49718 commented on
Mar 26, 2025 • 0 new comments - Update dyn-req-batch.md with style edits
#49725 commented on
Mar 26, 2025 • 0 new comments - changes to get ray serve responding on REST API calls when distribute…
#49730 commented on
Mar 27, 2025 • 0 new comments - [DATA]Add custom resources in data autoscaling
#49756 commented on
Mar 26, 2025 • 0 new comments - [RLlib] Flatten dict-typed observations before comparing them.
#49758 commented on
Mar 26, 2025 • 0 new comments - [KubeRay] support suspending worker groups in KubeRay autoscaler
#49768 commented on
Mar 26, 2025 • 0 new comments - [ci] remove pins in runtime_env usage in train examples
#49772 commented on
Mar 26, 2025 • 0 new comments - Pass checkpointable args through in tf_learner
#49861 commented on
Mar 26, 2025 • 0 new comments - New vsphere provider supporting Supervisor (k8s) cluster.
#49881 commented on
Mar 26, 2025 • 0 new comments - [core][compiled-graphs] Very hacky Gloo channel PoC
#49103 commented on
Mar 26, 2025 • 0 new comments - Update azure.md - Missing azure dependency
#49104 commented on
Mar 26, 2025 • 0 new comments - Fix memory issues caused by pyarrow.Dataset.to_batches.
#49124 commented on
Mar 26, 2025 • 0 new comments - [compiled graph][doc] structure
#49134 commented on
Mar 26, 2025 • 0 new comments - [core] change all dynamic_pointer_cast to static_pointer_cast.
#49135 commented on
Mar 26, 2025 • 0 new comments - [core] Gcs asio minor improvements
#49169 commented on
Mar 26, 2025 • 0 new comments - [core][compiled-graphs] Gloo group
#49187 commented on
Mar 26, 2025 • 0 new comments - [data] fix nodeName When the network in KubeRay is set to hostnetwork
#49188 commented on
Mar 26, 2025 • 0 new comments - [Dashboard] stop ray submitted job through ui
#49201 commented on
Mar 26, 2025 • 0 new comments - Fix unpacking zip package treats "../" as the top_level_directory
#49204 commented on
Mar 26, 2025 • 0 new comments - [WIP] Remove VM cluster autoscaler docker implementation
#49238 commented on
Mar 26, 2025 • 0 new comments - [data] feat: Implement `ray.data.Dataset.offset`
#49274 commented on
Mar 26, 2025 • 0 new comments - [Data] Fix numpy to arrow conversion.
#49293 commented on
Mar 27, 2025 • 0 new comments - [wandb] Use wandb Run as a context manager
#49307 commented on
Mar 26, 2025 • 0 new comments - [Core] fail to download s3 py modules
#49332 commented on
Mar 26, 2025 • 0 new comments - [Serve] Improve serve deploy ignore behavior
#49336 commented on
Mar 26, 2025 • 0 new comments - [Fix][Core] Periodically check log message queue cleared before shutdown
#49337 commented on
Mar 26, 2025 • 0 new comments - [wip] Moving around
#49345 commented on
Mar 26, 2025 • 0 new comments - Update tune-search-spaces.rst to correct outdated api use
#49386 commented on
Mar 26, 2025 • 0 new comments - Adding input validation to ScalingConfig resources_per_worker
#49389 commented on
Mar 26, 2025 • 0 new comments - [Draft] [spark] Set "HOST_IP" environmental variable for Ray worker nodes
#49403 commented on
Mar 26, 2025 • 0 new comments - [core][compiled graphs] Support reduce scatter and all gather collective in compiled graph
#49404 commented on
Mar 26, 2025 • 0 new comments - [core][dashboard] Dashboard head modules as Actors.
#49432 commented on
Mar 26, 2025 • 0 new comments - [core][compiled-graphs] CachedChannel's inner channel must be provided
#49434 commented on
Mar 26, 2025 • 0 new comments - [Core] RFC: simplify CI testing
#34315 commented on
Mar 24, 2025 • 0 new comments - [Core] Incorrect detection of cpus
#34846 commented on
Mar 24, 2025 • 0 new comments - [RLlib] `ExternalMultiAgentEnv` yields error on `log_returns` with `multiagent_done_dict`
#35189 commented on
Mar 24, 2025 • 0 new comments - [core] Improve Process management in Raylet
#35252 commented on
Mar 24, 2025 • 0 new comments - [CI] `windows://python/ray/serve:test_metrics` is failing/flaky on master.
#35452 commented on
Mar 24, 2025 • 0 new comments - [RLlib] Windows CLI, cmd.exe, powershell parsing json arguments JSONDecodeError
#35492 commented on
Mar 24, 2025 • 0 new comments - [Tune] Better force UTF8 encoding when calling Open method
#34679 commented on
Mar 24, 2025 • 0 new comments - Ray can't use my resources correctly for parallelizing with OptunaSearch/Pandas/NumPy
#34834 commented on
Mar 24, 2025 • 0 new comments - Build fails on ppc64le architecture
#4309 commented on
Mar 24, 2025 • 0 new comments - [autoscaler] "Cannot perform an interactive login from a non TTY device" when trying to use a private docker registry
#7339 commented on
Mar 24, 2025 • 0 new comments - Invalid memory access in RedisAsioClient/RedisAsyncContext on shutdown
#9074 commented on
Mar 24, 2025 • 0 new comments - ray::IDLE processes persist if I disconnect and kill master process from IDE
#9528 commented on
Mar 24, 2025 • 0 new comments - Unable to connect to ray head running on linux from ray worker node on windows
#10362 commented on
Mar 24, 2025 • 0 new comments - Windows debugging on gdb does not work
#9827 commented on
Mar 24, 2025 • 0 new comments - [RFC] Logging shutdown process to all Ray components.
#13241 commented on
Mar 24, 2025 • 0 new comments - [CI] Upload Windows Status to flakey-tests.ray.io
#12168 commented on
Mar 24, 2025 • 0 new comments - Cannot call remote instance method of a superclass from within a different instance method of the superclass
#10899 commented on
Mar 24, 2025 • 0 new comments - [Bug] [Core] Unable to schedule fractional gpu jobs
#20933 commented on
Mar 24, 2025 • 0 new comments - __del__ magic method can't access class properties
#14285 commented on
Mar 24, 2025 • 0 new comments - [runtime env] raise exception for unsupported runtime_env features on Windows
#21435 commented on
Mar 24, 2025 • 0 new comments - [Bug] [RLlib] Custom metrics are not reported to Tune
#20938 commented on
Mar 24, 2025 • 0 new comments - [RLlib] League based PolicyMap across workers impacting scalability via memory use - Question/[Bug]
#21459 commented on
Mar 24, 2025 • 0 new comments - [Bug] MultiDiscrete very slow
#22507 commented on
Mar 24, 2025 • 0 new comments - [core][gpu-objects] CollectiveGroupManager
#51260 commented on
Mar 25, 2025 • 0 new comments - [core][gpu-objects] Driver tries to get the data from in-actor store
#51272 commented on
Mar 21, 2025 • 0 new comments - [PPOConfig] Utilising new API/models without matching documentation
#40201 commented on
Mar 24, 2025 • 0 new comments - [core] Recent windows test flakiness
#38413 commented on
Mar 24, 2025 • 0 new comments - [tune] Full /tmp/ folder, ray does not clean up
#41202 commented on
Mar 24, 2025 • 0 new comments - [Core] StreamingObjectRefGenerator not working over network
#41556 commented on
Mar 24, 2025 • 0 new comments - Building an executable using Ray and Cx_freeze
#42101 commented on
Mar 24, 2025 • 0 new comments - [dreamerv3] Get error when tuning custom env using dreamerv3
#42107 commented on
Mar 24, 2025 • 0 new comments - SAC Checkpoint Loading Error
#42651 commented on
Mar 24, 2025 • 0 new comments - [RLLib] custom TorchRLModule return action_dist but this results in an error
#42786 commented on
Mar 24, 2025 • 0 new comments - [rllib] How to evaluate rollouts when using frame stacking RLModule?
#42931 commented on
Mar 24, 2025 • 0 new comments - Core: ray.remote raises ValueError when used on torch IterableDataset
#42914 commented on
Mar 24, 2025 • 0 new comments - [RLlib] Calling AlgorithmConfig.build() for different algorithms inside the same execution context causes hard to debug issues.
#43087 commented on
Mar 24, 2025 • 0 new comments - Ray with PyInstaller
#27421 commented on
Mar 24, 2025 • 0 new comments - [Tune][Air] MLFlow Callback is incompatible with PB2
#27783 commented on
Mar 24, 2025 • 0 new comments - [RLlib] Resuming from checkpoint with DQN and epsilon greedy let timesteps start from 0 again
#28289 commented on
Mar 24, 2025 • 0 new comments - [RLlib] The “trajectory_view_api” does not support the DQN algorithm, and the program will run in error
#27609 commented on
Mar 24, 2025 • 0 new comments - [Jobs] Run jobs tests on Windows
#28316 commented on
Mar 24, 2025 • 0 new comments - [CI] A simple way to reproduce osx/linux/windows CI run failure locally
#29068 commented on
Mar 24, 2025 • 0 new comments - [Core] inspect_serializability bug - parent object serializable but bound method not
#29423 commented on
Mar 24, 2025 • 0 new comments - [Core] util.multiprocessing.pool scheduling inefficiencies, blocking behavior in imap and imap_unordered
#29453 commented on
Mar 24, 2025 • 0 new comments - [Core] Reference leakage somewhere after ray.shutdown()
#30089 commented on
Mar 24, 2025 • 0 new comments - [Core] util.multiprocessing.pool: imap and imap_unordered blocking on ray.wait even though processes are complete
#29466 commented on
Mar 24, 2025 • 0 new comments - [RLlib] Undesired memory growing when using convolutional neural network
#29699 commented on
Mar 24, 2025 • 0 new comments - [Core] Access violation on windows 11 when running modin workload
#30493 commented on
Mar 24, 2025 • 0 new comments - [RLlib] Using (gym) discrete and box spaces inside dict observation space throws ValueError: Expected flattened obs shape ...
#31525 commented on
Mar 24, 2025 • 0 new comments - [core][gpu-objects] CollectiveExecutor
#51261 commented on
Mar 25, 2025 • 0 new comments - [core][gpu-objects] Driver should order all collective calls to avoid deadlock
#51264 commented on
Mar 25, 2025 • 0 new comments - [RLlib] Silence external warnings
#24107 commented on
Mar 25, 2025 • 0 new comments - [core][gpu-objects] Support multiple tensors
#51550 commented on
Mar 25, 2025 • 0 new comments - [core][gpu-objects] Object contains multiple tensors and/or mix of CPU data and GPU tensors
#51274 commented on
Mar 25, 2025 • 0 new comments - [core][gpu-objects] Garbage collection for in-actor GPU objects
#51262 commented on
Mar 25, 2025 • 0 new comments - [Core] Deserialization of generic pydantic models
#47840 commented on
Mar 25, 2025 • 0 new comments - [Core] RayCheck failed: placement_group_resource_manager_->ReturnBundle(bundle_spec) Status not OK
#51124 commented on
Mar 25, 2025 • 0 new comments - [Autoscaler][V2] Updating max replicas while Pods are pending causes v2 autoscaler to hang
#50868 commented on
Mar 25, 2025 • 0 new comments - [data] importing ray.data closes logging handlers, breaking custom logging
#48846 commented on
Mar 26, 2025 • 0 new comments - [Core|Dataset] Ray job stuck with idle actors with no tasks
#45822 commented on
Mar 26, 2025 • 0 new comments - [Core] Default concurrency using concurrency groups
#46666 commented on
Mar 26, 2025 • 0 new comments - [Data] Adding streaming capability for `ray.data.Dataset.unique`
#51207 commented on
Mar 26, 2025 • 0 new comments - [Data] Filter operation changes schema of dataset
#51217 commented on
Mar 26, 2025 • 0 new comments - Core: Ray cluster nodes underutilization during autoscaling
#47355 commented on
Mar 26, 2025 • 0 new comments - [RFC] [Serve] Custom Scaling
#41135 commented on
Mar 26, 2025 • 0 new comments - [Feedback] Feedback for ray + uv
#50961 commented on
Mar 26, 2025 • 0 new comments - CI test linux://rllib:learning_tests_cartpole_dqn_gpu is flaky
#46683 commented on
Mar 26, 2025 • 0 new comments - [Serve] Consider custom resources in best-fit node selection for DeploymentScheduler in Ray Serve
#51361 commented on
Mar 26, 2025 • 0 new comments - [Core] Python 3.13 wheel
#49738 commented on
Mar 26, 2025 • 0 new comments - [train v2][tune] Migration Guide
#49454 commented on
Mar 26, 2025 • 0 new comments - CI test linux://rllib:learning_tests_multi_agent_pendulum_sac_multi_gpu is flaky
#47309 commented on
Mar 26, 2025 • 0 new comments - [Data] - read_parquet raises AWS Error NETWORK_CONNECTION during HeadObject operation: curlCode: 43
#35826 commented on
Mar 26, 2025 • 0 new comments - [Data] Progress bars are sometimes half completed even after the task is finished
#36490 commented on
Mar 26, 2025 • 0 new comments - Assertion Error on Seq lens for PPO with Attention only in evaluation.
#22266 commented on
Mar 24, 2025 • 0 new comments - [RLlib] PPO - ray.rllib.agents.ppo "Put Error"
#24307 commented on
Mar 24, 2025 • 0 new comments - [RLlib] Metrics not reported with Client/Server and env=None
#24601 commented on
Mar 24, 2025 • 0 new comments - [RLlib] Dequeue check() returned False
#25783 commented on
Mar 24, 2025 • 0 new comments - Ray component: Core: PoolActor processes hanging
#24784 commented on
Mar 24, 2025 • 0 new comments - Ray on kubernetes with custom image_uri is broken
#51423 commented on
Mar 24, 2025 • 0 new comments - [Autoscaler, data] Ray starts `AutoscalingRequester` even when using `enableInTreeAutoscaling`
#51559 commented on
Mar 24, 2025 • 0 new comments - [Core] build from source code guide is out of date
#43093 commented on
Mar 25, 2025 • 0 new comments - [cgraph] Support ray.wait() for CompiledDAGRef
#51391 commented on
Mar 25, 2025 • 0 new comments - [Workflow] Add Azure as one of the storage backend options?
#34910 commented on
Mar 25, 2025 • 0 new comments - [Train] Intermittent `UnpicklingError` when loading estimator/preprocessor from checkpoint
#33815 commented on
Mar 25, 2025 • 0 new comments - [Core] std::bad_alloc error using ray.init()
#33525 commented on
Mar 25, 2025 • 0 new comments - [RLlib] APPO gets extremely slow when run with >1 GPUs.
#50221 commented on
Mar 25, 2025 • 0 new comments - [RLlib] Tuner.restore() Not Restoring Training
#43266 commented on
Mar 25, 2025 • 0 new comments - [BUG] Ray dashboard client failed to build
#23548 commented on
Mar 25, 2025 • 0 new comments - [Core][Streaming generator] Support num_returns.
#46934 commented on
Mar 25, 2025 • 0 new comments - [llm] Roadmap for Data and Serve LLM APIs
#51313 commented on
Mar 25, 2025 • 0 new comments - [Serve] Detailed Analysis of Errors Related to 'Ray does not allocate any GPUs on the driver node' && 'No CUDA GPUs are available'
#51242 commented on
Mar 25, 2025 • 0 new comments - [Ray dashboard] Actors tab does not list actors under certain conditions
#47447 commented on
Mar 25, 2025 • 0 new comments - [<Ray component: Core|RLlib|etc...>] Support for Gradio version 4 on Ray Serve
#49245 commented on
Mar 25, 2025 • 0 new comments - [RFC] Async request support in Ray Serve
#32292 commented on
Mar 25, 2025 • 0 new comments - [Serve] Observability for proxy
#48184 commented on
Mar 25, 2025 • 0 new comments - [Serve] ingress decorator does not work with fastapi.APIRouter arg
#50372 commented on
Mar 25, 2025 • 0 new comments - [Serve] FastAPI ingress does not work with composable routers
#50373 commented on
Mar 25, 2025 • 0 new comments - [core][gpu-objects] IPC communication for processes on the same GPU
#51270 commented on
Mar 25, 2025 • 0 new comments - [RLlib] Possible bug in TorchSquashedGaussian plus associated feature request
#51544 commented on
Mar 21, 2025 • 0 new comments - [Core] get_user_temp_dir() Doesn't Honor the User Specified Temp Dir
#51218 commented on
Mar 21, 2025 • 0 new comments - [Core] Please provide better message where 'RuntimeError: Failed to unpickle serialized exception'
#49885 commented on
Mar 21, 2025 • 0 new comments - [Ray core] surface the node id/ip and other info (task name, etc.) in the stacktrace for a full object store
#50408 commented on
Mar 22, 2025 • 0 new comments - [Core] Spot preemption related retries do not count towards the max retries
#50640 commented on
Mar 22, 2025 • 0 new comments - [Core] Plugable storage backend besides Redis
#50656 commented on
Mar 22, 2025 • 0 new comments - [nsys plugin] How about add an option `name` to nsys dumped file
#50711 commented on
Mar 22, 2025 • 0 new comments - [Ray Core] Slow scheduling speed with IOError: Broken pipe
#50244 commented on
Mar 22, 2025 • 0 new comments - [Core] ray.util.ActorPool can get stuck in failing state with one bad actor
#50313 commented on
Mar 22, 2025 • 0 new comments - [compiled graph] Driver cannot participate in the NCCL group
#50423 commented on
Mar 22, 2025 • 0 new comments - [Core] Negative available resources
#50739 commented on
Mar 22, 2025 • 0 new comments - [core] Serve microbenchmarks occasionally crash with segfault or invalid memory access
#50802 commented on
Mar 22, 2025 • 0 new comments - [Core] Ray Data job hanging with flooded Cancelling stale RPC with seqno 125 < 127 error
#50814 commented on
Mar 22, 2025 • 0 new comments - [Core] calling remote function in `Future` callback breaks ray
#50980 commented on
Mar 22, 2025 • 0 new comments - [core] question about ray issue: 51051
#51554 commented on
Mar 22, 2025 • 0 new comments - [Ray component: Python|runtime_env] Pip install `whl` file failure when a job reruns in the same cluster
#49059 commented on
Mar 22, 2025 • 0 new comments - [Data]: Categorizer fails with non uniform distributions
#50792 commented on
Mar 22, 2025 • 0 new comments - [core] ray.init does not work with local_mode on run_time envs.
#30273 commented on
Mar 23, 2025 • 0 new comments - [RLlib] Basic PPO script throws obscure error when building RLModule
#51333 commented on
Mar 23, 2025 • 0 new comments - [core] Split giant ray core C++ targets into small ones
#50586 commented on
Mar 24, 2025 • 0 new comments - [Train] Deepspeed + Triton 3.2.0 + Torch 2.6.0 has issues with Ray Train
#50406 commented on
Mar 24, 2025 • 0 new comments - [RLlib] Attribute error when trying to compute action after training Multi Agent PPO with New API Stack
#44475 commented on
Mar 24, 2025 • 0 new comments - [AIR] Sampling support for Ray Train/Ray Data
#31127 commented on
Mar 24, 2025 • 0 new comments - tmp directory path issue between Windows client and Linux Ray cluster head
#45010 commented on
Mar 24, 2025 • 0 new comments - [Core] Remove pg resource notation from user-facing APIs.
#31064 commented on
Mar 20, 2025 • 0 new comments - [Feature] Remote call timeout required.
#18916 commented on
Mar 20, 2025 • 0 new comments - [core][compiled graphs] Support inter-actor transfer on the same GPU with CUDA IPC
#50140 commented on
Mar 20, 2025 • 0 new comments - [tune] ClientObjectRef is not found for client
#46747 commented on
Mar 20, 2025 • 0 new comments - [core] Ray session conflicts with PyArrow+HDFS
#36415 commented on
Mar 21, 2025 • 0 new comments - [Core] Limited interoperability with JAX
#46760 commented on
Mar 21, 2025 • 0 new comments - [Core] Too many threads in ray worker
#36936 commented on
Mar 21, 2025 • 0 new comments - [Runtime Environment] Remove cached python libs, working dir etc
#47488 commented on
Mar 21, 2025 • 0 new comments - [Data] Infinite recursion in ansitowin32.py (under tqdm_ray)
#51337 commented on
Mar 21, 2025 • 0 new comments - [Data] `read_images` benchmark sometimes fails with `ArrowVariableShapedTensorArray` error
#49883 commented on
Mar 21, 2025 • 0 new comments - [Core] ray raises a "Failed to unpickle serialized exception" error when an OpenAI Authentication Error is raised in task
#43428 commented on
Mar 21, 2025 • 0 new comments - [core] Set OPENBLAS_NUM_THREADS to number of cpus automatically
#34724 commented on
Mar 21, 2025 • 0 new comments - [Core] Getting node id for usage in NodeAffinitySchedulingStrategy
#28195 commented on
Mar 21, 2025 • 0 new comments - [usability][Feature] Throw error message if resolved ip address doesn't match the localhost
#19052 commented on
Mar 21, 2025 • 0 new comments - [StateAPI] StateAPI request truncates recent elements
#50378 commented on
Mar 21, 2025 • 0 new comments - [core] ray.remote Decorator's Return Type Cannot Be Determined by Type Checkers
#50410 commented on
Mar 21, 2025 • 0 new comments - [Feature] [Performance] [Docs] Disabling object spilling is not documented
#21998 commented on
Mar 21, 2025 • 0 new comments - [Core] classmethod support for actors
#36986 commented on
Mar 21, 2025 • 0 new comments - [data] Cannot convert dict to PyArrow blocks
#42075 commented on
Mar 21, 2025 • 0 new comments - [Core] `ray.cancel` multiple ObjectRefs
#24559 commented on
Mar 21, 2025 • 0 new comments - [core][cluster launcher] Cluster launcher should use `docker run --gpus` if GPUs are autodetected on the worker node
#43231 commented on
Mar 21, 2025 • 0 new comments - [core][cluster launcher] Local clusters should stop Ray containers on `ray down`
#43232 commented on
Mar 21, 2025 • 0 new comments - [Core] Enable huge pages for object store
#51352 commented on
Mar 21, 2025 • 0 new comments - [Dashboard] Fix listing APIs to avoid truncating at 10k entities
#48251 commented on
Mar 21, 2025 • 0 new comments - [core] Upgrade grpc (Mar.15, 2025)
#51395 commented on
Mar 21, 2025 • 0 new comments - Ray tune on Mac M2/M1 never stop
#45797 commented on
Mar 24, 2025 • 0 new comments - [<Ray component: Serve>] Worker node is killed after starting with reason of missing too many heartbeat checks
#46548 commented on
Mar 24, 2025 • 0 new comments - [tune] Can't find driver_artifacts file
#46607 commented on
Mar 24, 2025 • 0 new comments - [RLlib] Can't create multi-agent external env with new API stack
#46961 commented on
Mar 24, 2025 • 0 new comments - [Tune] Cannot run the QuickStart example code on Windows after installing Ray in conda environment, reporting FileNotFoundError
#46827 commented on
Mar 24, 2025 • 0 new comments - RLlib: dist_class is missed while I try to use Policy.learn_on_batch()
#47011 commented on
Mar 24, 2025 • 0 new comments - [RLlib] Flatten observations example doesn't work
#47127 commented on
Mar 24, 2025 • 0 new comments - Issue after pip install of ray tune
#47266 commented on
Mar 24, 2025 • 0 new comments - Install Ray version 1.5.2
#46776 commented on
Mar 24, 2025 • 0 new comments - ray issue
#47177 commented on
Mar 24, 2025 • 0 new comments - [RLLib] Expected scalars to be on CPU, got cuda:0 instead
#35640 commented on
Mar 24, 2025 • 0 new comments - [RLlib] PPO Memory Leak on Uneven CNN (conv) filters
#35866 commented on
Mar 24, 2025 • 0 new comments - [Rllib][Tune] AttributeError: 'TorchCategorical' object has no attribute 'log_prob' with PB2
#35923 commented on
Mar 24, 2025 • 0 new comments - [Tune] PB2 Checkpoint/Sync Path Compatibility with Windows
#36370 commented on
Mar 24, 2025 • 0 new comments - [Release process] Validating and uploading wheels is a pain and error-prone
#36522 commented on
Mar 24, 2025 • 0 new comments - [<Ray component: cluster>] Urllib3 warning messages cannot be blocked in Ray
#36577 commented on
Mar 24, 2025 • 0 new comments - [JAVA] Ray.init() failed when JNI load
#36637 commented on
Mar 24, 2025 • 0 new comments - UnicodeEncodeError: 'charmap' codec can't encode characters in position 0-68: character maps to <undefined>
#36767 commented on
Mar 24, 2025 • 0 new comments - Unable to build Ray on Power (Error: key "3.9.16" not found in dictionary)
#37889 commented on
Mar 24, 2025 • 0 new comments - [RLlib|Tune] Cannot restore tune checkpoints in algorithm
#39785 commented on
Mar 24, 2025 • 0 new comments - [Core] Ray is slower than serial python
#40184 commented on
Mar 24, 2025 • 0 new comments - [RLlib] Torch Autoregressive Example does not work with gpu
#38645 commented on
Mar 24, 2025 • 0 new comments - [Tune] Font color in dark theme Jupyter Notebooks in VS Code
#40317 commented on
Mar 24, 2025 • 0 new comments - CheckpointConfig does not work on Windows
#37226 commented on
Mar 24, 2025 • 0 new comments - [Serve/Core] Raylet crash encountered in Serve during Actor termination
#51408 commented on
Mar 24, 2025 • 0 new comments - [Ray Data | Core ]
#51416 commented on
Mar 24, 2025 • 0 new comments - Cannot Install ray[rllib] on Python 3.13
#50226 commented on
Mar 24, 2025 • 0 new comments - for exporting r2d2+lstm to onnx, why is empty state being passed in?
#50166 commented on
Mar 24, 2025 • 0 new comments - Upgrade Windows CI docker image to use Windows 11 and more recent toolchains.
#49830 commented on
Mar 24, 2025 • 0 new comments - Upgrade windows CI AMI to use Windows 11
#49829 commented on
Mar 24, 2025 • 0 new comments - [RAY TRAIN] Force use of gloo in Windows
#49778 commented on
Mar 24, 2025 • 0 new comments - [RLlib][Windows] Windows Invalid Directory Name Error in Ray RLlib
#49477 commented on
Mar 24, 2025 • 0 new comments - [Data] Transient Parquet Fragment Serialization Error
#49082 commented on
Mar 24, 2025 • 0 new comments - BUILD: patch zlib for macos and protobuf for windows
#48794 commented on
Mar 24, 2025 • 0 new comments - [Distributed Debugger] Newly added breakpoint not works: Breakpoint in file that does not exist
#48778 commented on
Mar 24, 2025 • 0 new comments - [Ray + YOLOv8] YOLOv8 model.tune
#47859 commented on
Mar 24, 2025 • 0 new comments - [tune] Repeated runs don't get averaged by search algorithm
#47758 commented on
Mar 24, 2025 • 0 new comments - [RLlib] New API Stack: "local_gpu_idx 0 is not a valid GPU id or is not available."
#47364 commented on
Mar 24, 2025 • 0 new comments - [Core] Warning message output as error cannot be filtered/hidden; unexposed environmental variable
#43264 commented on
Mar 24, 2025 • 0 new comments - [BUG] Ray crashes my python process when the connected kernel goes away.
#43280 commented on
Mar 24, 2025 • 0 new comments - [RLlib + Tune] PermissionError: [WinError 5] Access is denied: '../.tmp_generator' -> '..basic-variant-state-..' while training with ``Tuner``
#43702 commented on
Mar 24, 2025 • 0 new comments - [RLlib] step() function called too early, could lead to inconsistencies
#44290 commented on
Mar 24, 2025 • 0 new comments - CI test windows://python/ray/tests:test_actor_retry is flaky
#43845 commented on
Mar 24, 2025 • 0 new comments - [<Ray component: syncer.py.>] Last sync command failed with the following error
#44320 commented on
Mar 24, 2025 • 0 new comments - [RLlib] make_multi_callbacks with new API stack error
#44386 commented on
Mar 24, 2025 • 0 new comments - [RLlib] EagerTFPolicyV2 wrongly calls overridden action_sampler_fn
#44671 commented on
Mar 24, 2025 • 0 new comments - [RLlib] New API Stack: Action masking not working with wrapper, default encoder config issue
#44780 commented on
Mar 24, 2025 • 0 new comments - Ray Weekly Release
#44276 commented on
Mar 24, 2025 • 0 new comments - [RLlib] TypeError converting batch (INFOS) to torch tensor with ConnectorV2
#44478 commented on
Mar 24, 2025 • 0 new comments - [WIP][Feature commit] Initial commit for supporting IPv6 stack in Ray Clus…
#40332 commented on
Mar 27, 2025 • 0 new comments - [core] Tidying up mmap and munmap a bit.
#40334 commented on
Mar 26, 2025 • 0 new comments - [Ray Train] Implement strict even-split of training workers for pretraining
#40442 commented on
Mar 26, 2025 • 0 new comments - Add dolly v2 instruction tuning ray train
#40455 commented on
Mar 26, 2025 • 0 new comments - [Doc] Add note in `ray submit` doc to recommend Ray Job API
#40500 commented on
Mar 26, 2025 • 0 new comments - [dashboard] ignore reinit error when getting dashboard url
#40545 commented on
Mar 26, 2025 • 0 new comments - [tune] link placement group doc
#40590 commented on
Mar 26, 2025 • 0 new comments - [Train] Support rank_zero_only uploading for Lightning RayTrainReportCallback
#40639 commented on
Mar 26, 2025 • 0 new comments - [Core] Add observability support to AcceleratorManager
#40749 commented on
Mar 26, 2025 • 0 new comments - [core] Fix windows conda activate with conda.bat as executable in conda path
#40779 commented on
Mar 26, 2025 • 0 new comments - [Serve] Fix Windows unit tests
#40812 commented on
Mar 27, 2025 • 0 new comments - [Core] [Cluster Launcher] Rename min/max_workers to min/max_worker_nodes
#40835 commented on
Mar 26, 2025 • 0 new comments - [Doc] Clarify that a recent version of nsight is needed
#40846 commented on
Mar 26, 2025 • 0 new comments - [Core] Add logs upon abrupt failure code path
#40849 commented on
Mar 26, 2025 • 0 new comments - [docs] Documentation fixes (logging and profiling)
#40915 commented on
Mar 26, 2025 • 0 new comments - [RFC v3] Ray Client2
#40990 commented on
Mar 26, 2025 • 0 new comments - [WIP] accelerated DAG
#40991 commented on
Mar 26, 2025 • 0 new comments - WIP Do Not Merge
#41025 commented on
Mar 26, 2025 • 0 new comments - Adapt the joblib backend for compatibility with `return_as=generator`
#41028 commented on
Mar 26, 2025 • 0 new comments - TPU pod autoscaling based on the TpuCommandRunner
#41065 commented on
Mar 26, 2025 • 0 new comments - [RLlib] Add SUPER algorithm.
#41079 commented on
Mar 26, 2025 • 0 new comments - [docs][Serve] add text about pip-pack installation
#41088 commented on
Mar 26, 2025 • 0 new comments - [Core] mig auto detection
#41103 commented on
Mar 26, 2025 • 0 new comments - WIP recharts custom charting library
#41140 commented on
Mar 26, 2025 • 0 new comments - Fix docker gpu 2
#42426 commented on
Mar 26, 2025 • 0 new comments - [Core/Data] Name GPU Worker Processes
#40529 commented on
Mar 27, 2025 • 0 new comments - [docs] try enabling nitpicky
#39448 commented on
Mar 26, 2025 • 0 new comments - Fix for appo_torch_policy.py when used with attention_net
#39520 commented on
Mar 26, 2025 • 0 new comments - [CherryPick][Serve] Ignore cancel request when receiving websocket.accept message (#39413)
#39625 commented on
Mar 26, 2025 • 0 new comments - [RLlib] Issue 38560: New API (Learner) stack does not properly count steps trained/sampled.
#39628 commented on
Mar 26, 2025 • 0 new comments - [DEBUG] metric
#39659 commented on
Mar 26, 2025 • 0 new comments - Upgrade default AWS DLAMI
#39721 commented on
Mar 26, 2025 • 0 new comments - [RLLib-contrib] Implementation of RED-Q (Ensemble SAC) Algorithm in PyTorch
#39747 commented on
Mar 26, 2025 • 0 new comments - [core] Use futures to synchronize `parallel_memcopy`
#39755 commented on
Mar 26, 2025 • 0 new comments - fix: add cython async detection
#39762 commented on
Mar 26, 2025 • 0 new comments - [core][RFC] http based pure external client
#39771 commented on
Mar 26, 2025 • 0 new comments - [Data] Consolidate default fault tolerance options
#39797 commented on
Mar 26, 2025 • 0 new comments - [core] Fix a corner case where the RPC never return
#39801 commented on
Mar 26, 2025 • 0 new comments - [Cluster launcher] Make cluster Ray version match client Ray version by default
#39812 commented on
Mar 26, 2025 • 0 new comments - [core] Default actor object's callable method to ActorMethod.remote
#39826 commented on
Mar 26, 2025 • 0 new comments - [Logging] Fix Deduplication URL
#39830 commented on
Mar 26, 2025 • 0 new comments - [data] Fix map_batches on datasets with nested lists
#39869 commented on
Mar 26, 2025 • 0 new comments - [CI] [Doc] Add reminder to install setup hooks for linter
#39888 commented on
Mar 26, 2025 • 0 new comments - [Cluster launcher] disable verbose logs by default
#39930 commented on
Mar 26, 2025 • 0 new comments - static DAGs
#39956 commented on
Mar 26, 2025 • 0 new comments - [data] Make exceptions consistent when falling back to pandas
#39969 commented on
Mar 26, 2025 • 0 new comments - [core][RFC v2] HTTP based Ray Client
#40085 commented on
Mar 26, 2025 • 0 new comments - RFC: Add Julia Language support
#40098 commented on
Mar 26, 2025 • 0 new comments - [train] update some imports from ray.air to ray.train
#40171 commented on
Mar 26, 2025 • 0 new comments - [RLlib] Adding action space scaling to Gaussian noise in exploration
#40281 commented on
Mar 26, 2025 • 0 new comments - [WIP][Streaming] Cleaning up streaming sequence
#42443 commented on
Mar 26, 2025 • 0 new comments - WIP [data] A streaming compatible implementation of repartition-by-column
#42477 commented on
Mar 26, 2025 • 0 new comments - [tune] remove tensorboardX upper bound
#42581 commented on
Mar 26, 2025 • 0 new comments - allow victoria metrics response message
#42620 commented on
Mar 26, 2025 • 0 new comments - [WIP][Serve] Revisited cancellation handling in Proxy to make sure response generator is properly cancelled
#42665 commented on
Mar 26, 2025 • 0 new comments - [DO NOT REVIEW, LONG TERM PR FOR CI] Pinterest main branch 2.9.1
#42672 commented on
Mar 26, 2025 • 0 new comments - [WIP] Experimental batching support in Streaming Generators
#42825 commented on
Mar 26, 2025 • 0 new comments - Fix None exception in evaluate.
#42858 commented on
Mar 26, 2025 • 0 new comments - [Data] Block compression for ArrowBlock
#42859 commented on
Mar 27, 2025 • 0 new comments - reduce lock mutex scope
#43067 commented on
Mar 26, 2025 • 0 new comments - [Core/Accelerated DAG] Support Gloo-based backend using Ray collective group.
#43096 commented on
Mar 26, 2025 • 0 new comments - [RLlib] ConnectorV2 API: Add heuristic action logits mixin example script.
#43107 commented on
Mar 26, 2025 • 0 new comments - [Core] ray.remote raises ValueError when used on torch IterableDataset
#43117 commented on
Mar 26, 2025 • 0 new comments - [docs] Add antipattern for nested ray.get
#43184 commented on
Mar 26, 2025 • 0 new comments - [docs][clusters] Improve instructions for GPU autodetection and manual cluster launching
#43219 commented on
Mar 26, 2025 • 0 new comments - Release performance regression 2.9.2/2.9.3
#43235 commented on
Mar 26, 2025 • 0 new comments - [Build] Add build for RH
#43335 commented on
Mar 26, 2025 • 0 new comments - [WIP] Use 3.9 in macos.
#43351 commented on
Mar 26, 2025 • 0 new comments - [WIP] prepend pickled function co_filename with "<ray remote>"
#43359 commented on
Mar 26, 2025 • 0 new comments - verify windows wheels.
#43442 commented on
Mar 26, 2025 • 0 new comments - [Doc] Add RAY_REDIS_CA_CERT description for GCS fault tolerance
#43478 commented on
Mar 26, 2025 • 0 new comments - [Core][CLI] fix ray status long decimal numbers
#43480 commented on
Mar 26, 2025 • 0 new comments - [core] Fix max_calls option when used on a worker that is part of a workflow
#43700 commented on
Mar 26, 2025 • 0 new comments - [WIP] Core worker shutdown then disconnect
#43759 commented on
Mar 26, 2025 • 0 new comments - [Core] gpu memory scheduling prototype
#41147 commented on
Mar 26, 2025 • 0 new comments - [RFC] Ray Client2 with original Ray APIs
#41323 commented on
Mar 26, 2025 • 0 new comments - [core] Refactors the delayed task resubmission.
#41351 commented on
Mar 26, 2025 • 0 new comments - Relax check_version_info to check for bytecode compatibility
#41373 commented on
Mar 26, 2025 • 0 new comments - Update metrics.py
#41385 commented on
Mar 27, 2025 • 0 new comments - [Core] Return the correct task ID when get_runtime_context is used in a background thread
#41397 commented on
Mar 26, 2025 • 0 new comments - [serve] Adjust the doc of the Serve Java API
#41398 commented on
Mar 26, 2025 • 0 new comments - [core] Vendor aiohttp and aiosignal for Ray.
#41426 commented on
Mar 26, 2025 • 0 new comments - [train] update XGBoost model format to UBJ
#41442 commented on
Mar 26, 2025 • 0 new comments - Feat/metric validation
#41478 commented on
Mar 26, 2025 • 0 new comments - [Cluster Launcher] Update head node commands to refer to which node they can be run from
#41490 commented on
Mar 26, 2025 • 0 new comments - docs: add user guide on KubeRay webhooks
#41527 commented on
Mar 26, 2025 • 0 new comments - [Core] Track resource per instance [1/n]
#41582 commented on
Mar 26, 2025 • 0 new comments - [doc] Add documentation guide for MPI on Ray.
#41626 commented on
Mar 26, 2025 • 0 new comments - [RFC v4] Ray Client2.
#41803 commented on
Mar 26, 2025 • 0 new comments - [Dashboard] Should specify the time range in job detail page for load the cluster status and scale metrics
#41828 commented on
Mar 26, 2025 • 0 new comments - add disk throughput test
#41882 commented on
Mar 26, 2025 • 0 new comments - [RLlib] Significant performance improvement with curriculum learning when using a high number of rollout workers
#41910 commented on
Mar 26, 2025 • 0 new comments - [WIP] Multinode dag
#42059 commented on
Mar 26, 2025 • 0 new comments - Fix a bug where start_time could be None leading to a crash in TuneTerminalReporter
#42078 commented on
Mar 26, 2025 • 0 new comments - [draft] 100tb shuffle experimental
#42086 commented on
Mar 26, 2025 • 0 new comments - [WIP] Multinode DAG minus FT changes
#42173 commented on
Mar 26, 2025 • 0 new comments - [Tune][Air] Fix MLflowLoggerCallback to enable its use with PBT (#27783)
#42182 commented on
Mar 26, 2025 • 0 new comments - [rllib_contrib] Lagrangian PPO
#42365 commented on
Mar 26, 2025 • 0 new comments - [WIP][Tracing] Fixing tracing context injection
#42384 commented on
Mar 26, 2025 • 0 new comments - how to solve this problem
#50721 commented on
Mar 27, 2025 • 0 new comments - [distributed debugger] exception in regular remote worker function leading to access violation when debugger connects
#51010 commented on
Mar 27, 2025 • 0 new comments - raylet exited immediately because dashboard agent failed
#49162 commented on
Mar 27, 2025 • 0 new comments - [Train] Crash at end of training
#51527 commented on
Mar 27, 2025 • 0 new comments - CI test windows://python/ray/serve/tests:test_metrics is flaky
#45843 commented on
Mar 27, 2025 • 0 new comments - RLlib: beta1 as a Tensor is not supported for capturable=False and foreach=True
#51560 commented on
Mar 27, 2025 • 0 new comments - Release test sort.regular failed
#50417 commented on
Mar 27, 2025 • 0 new comments - Release test sort.chaos failed
#49765 commented on
Mar 27, 2025 • 0 new comments - Release test random_shuffle.chaos failed
#49395 commented on
Mar 27, 2025 • 0 new comments - Release test random_shuffle.regular failed
#49383 commented on
Mar 27, 2025 • 0 new comments - [train] add model (pipeline) parallelism example
#22894 commented on
Mar 26, 2025 • 0 new comments - [Air] add Jax trainer
#25385 commented on
Mar 26, 2025 • 0 new comments - [RLlib] Fix Issue #25316: unconfigurable `dist_dim` for custom multi-action distributions
#25490 commented on
Mar 26, 2025 • 0 new comments - [Core] Allow task retry for `ray.cancel`
#26254 commented on
Mar 26, 2025 • 0 new comments - [RLLib][Air] MLFlow parsing of RLLib evaluation and custom metrics
#26711 commented on
Mar 26, 2025 • 0 new comments - [State API] Add input & output size to the task API
#31898 commented on
Mar 26, 2025 • 0 new comments - [core][bugfix] catch BaseException
#32105 commented on
Mar 26, 2025 • 0 new comments - [core] Add support for one log per worker pool worker
#33167 commented on
Mar 26, 2025 • 0 new comments - [Core] Allow user to specify DASHBOARD_AGENT_LISTEN_PORT
#34886 commented on
Mar 26, 2025 • 0 new comments - Refactor run_release_test.sh
#35065 commented on
Mar 26, 2025 • 0 new comments - [Serve] Increase the GCS timeout
#35330 commented on
Mar 26, 2025 • 0 new comments - UX: rework stdout error when ray fails to start
#35378 commented on
Mar 26, 2025 • 0 new comments - [Dashboard] Provide a job dashboard URL link instead of the dashboard link when ray.init is called.
#35427 commented on
Mar 26, 2025 • 0 new comments - [1/2] [UI] Event Observability, add a new event table
#38638 commented on
Mar 27, 2025 • 0 new comments - [Data] Progress Bars are incomplete (notebooks + terminal)
#36181 commented on
Mar 26, 2025 • 0 new comments - [Data] Ray 2.6 created a breaking change in the index of a Modin DataFrame
#37771 commented on
Mar 26, 2025 • 0 new comments - [data] ray_tqdm does not work with numba
#45538 commented on
Mar 26, 2025 • 0 new comments - [Data] Row outputted returns 0
#48484 commented on
Mar 26, 2025 • 0 new comments - [Data] A proper way to handle random seed in random_sample() and other sampling for reproducibility
#50638 commented on
Mar 26, 2025 • 0 new comments - [Ray Data] Introduce "Key concepts" to Ray Data doc
#50018 commented on
Mar 26, 2025 • 0 new comments - [Data] Refactor `ParquetDatasink._write_partition_files` to use `pyarrow.parquet.write_to_dataset`
#50502 commented on
Mar 26, 2025 • 0 new comments - [Data]Extend Ray Data with read/write hive
#51094 commented on
Mar 26, 2025 • 0 new comments - [Data] support passing `pyarrow.dataset.Expression`s to `Dataset.filter`'s `expr`
#50799 commented on
Mar 26, 2025 • 0 new comments - [data] RefBundle doesn't always eagerly free data
#37910 commented on
Mar 26, 2025 • 0 new comments - [Data] Add support for all Spark RDD transformations and actions
#10983 commented on
Mar 26, 2025 • 0 new comments - [Data, Train] ray::SplitCoordinator is very slow at every epoch + takes up too much memory
#49190 commented on
Mar 26, 2025 • 0 new comments - [Ray Data] Support S3 Tables
#49083 commented on
Mar 26, 2025 • 0 new comments - CI test linux://rllib:examples/connectors/mean_std_filtering_ppo is flaky
#47435 commented on
Mar 26, 2025 • 0 new comments - [Ray debugger] Unable to use debugger on slurm cluster
#51157 commented on
Mar 27, 2025 • 0 new comments - [Core] Runtime env working_dir validation
#51380 commented on
Mar 27, 2025 • 0 new comments - [<Ray component: java>] expose ObjectRef in DeploymentResponse class
#51445 commented on
Mar 27, 2025 • 0 new comments - [Ray Serve]: "RuntimeError: No CUDA GPUs are available" when running vllm with ray
#51193 commented on
Mar 27, 2025 • 0 new comments - [Serve] On kuberay, vLLM-0.7.2 reports "No CUDA GPUs are available" while vllm-0.6.6.post1 works fine when deploy rayservice
#51154 commented on
Mar 27, 2025 • 0 new comments - [Ray Serve] Expose public interface for user to customize the router
#50465 commented on
Mar 27, 2025 • 0 new comments - [Serve] Proxy actor not started on worker node when using kuberay
#50349 commented on
Mar 27, 2025 • 0 new comments - [Serve] make various default values of `AutoscalingConfig.max_replicas` consistent and >1
#50222 commented on
Mar 27, 2025 • 0 new comments - [Serve] exceptions raised by request timeout are inconsistent
#50992 commented on
Mar 27, 2025 • 0 new comments - CI test darwin://python/ray/tests:test_threaded_actor is flaky
#44663 commented on
Mar 27, 2025 • 0 new comments - CI test windows://python/ray/tests:test_reference_counting_2 is flaky
#45964 commented on
Mar 27, 2025 • 0 new comments - [2/2] [API] Event Observability, add a new event table
#38708 commented on
Mar 26, 2025 • 0 new comments - [ci][release][core] rewrite RuntimeEnvAgentClient with reusable TCP connection, also test_many_runtime_envs.py with env vars
#38772 commented on
Mar 26, 2025 • 0 new comments - [TEST] DEBUG
#38798 commented on
Mar 26, 2025 • 0 new comments - [Data] Add `read_delta` API to read Delta format files
#38813 commented on
Mar 26, 2025 • 0 new comments - [Core][Label scheduling 8/n]Add length and illegal letters validation to the node labels
#38824 commented on
Mar 26, 2025 • 0 new comments - Another debug
#38842 commented on
Mar 26, 2025 • 0 new comments - [Serve] Add more log in the router init step
#38933 commented on
Mar 27, 2025 • 0 new comments - updates to setup-dev.py to work around the types.py import issues
#38948 commented on
Mar 27, 2025 • 0 new comments - [WIP][Core] Unflake actor-cancel-test
#38975 commented on
Mar 27, 2025 • 0 new comments - [WIP] Streaming Generator + actor task lineage reconstruction
#38982 commented on
Mar 27, 2025 • 0 new comments - [Test] Fix torch dist nccl test
#38986 commented on
Mar 26, 2025 • 0 new comments - [experiment] rewrite PythonGcsClient with GcsClient
#39010 commented on
Mar 26, 2025 • 0 new comments - [spark] Improve Ray node memory config calculation logic
#39149 commented on
Mar 27, 2025 • 0 new comments - [RLlib] Support terminated and truncated in ExternalMultiAgentEnv
#39175 commented on
Mar 27, 2025 • 0 new comments - [RLlib] DreamerV3: Fix restore from checkpoint functionality
#39209 commented on
Mar 26, 2025 • 0 new comments - chore: update stale link and comment in tracing_helper.py
#39239 commented on
Mar 27, 2025 • 0 new comments - Yuming test
#39242 commented on
Mar 26, 2025 • 0 new comments - [Core] Fix get_next_unordered and get_next
#39250 commented on
Mar 26, 2025 • 0 new comments - [Core][Observability] Add the scheduling_strategy field to the ActorInfo for the "get actor info" API
#39256 commented on
Mar 26, 2025 • 0 new comments - Test p
#39297 commented on
Mar 26, 2025 • 0 new comments - fix typos in router.py
#39301 commented on
Mar 26, 2025 • 0 new comments - [templates/04_finetuning_llms_with_deepspeed] pin transformers to 4.31.0
#39372 commented on
Mar 26, 2025 • 0 new comments - [Serve][Debug] websocket test
#39389 commented on
Mar 26, 2025 • 0 new comments - Update pettingzoo_env.py
#39431 commented on
Mar 26, 2025 • 0 new comments - [WIP][Core]Add batch remote api for batch submit actor task
#35597 commented on
Mar 26, 2025 • 0 new comments - [Tune] Add optimizer kwargs for `SkOptSearch`
#36041 commented on
Mar 26, 2025 • 0 new comments - [autoscaler v2][6/n] introduce instance manager
#36066 commented on
Mar 26, 2025 • 0 new comments - add setting s3 endpoint-url via env var
#36114 commented on
Mar 27, 2025 • 0 new comments - Revert "Revert "[Core] Support Arrow zerocopy serialization in object…
#36153 commented on
Mar 27, 2025 • 0 new comments - [RLlib] Update `check_env` in env.py
#36463 commented on
Mar 27, 2025 • 0 new comments - [build_base] coroutine cpp
#36513 commented on
Mar 26, 2025 • 0 new comments - [ci] remove is_automated_build in setup.py
#36547 commented on
Mar 26, 2025 • 0 new comments - [RLlib] fix PPOConfig warning
#36595 commented on
Mar 27, 2025 • 0 new comments - [RLlib] fix custom policy examples
#36600 commented on
Mar 27, 2025 • 0 new comments - [RLlib] Fix A3C use_critic in `rllib_contrib`
#36613 commented on
Mar 27, 2025 • 0 new comments - WIP python protobuf removal
#36856 commented on
Mar 26, 2025 • 0 new comments - Add Bazel Steward for dependency management
#36863 commented on
Mar 26, 2025 • 0 new comments - [Runtime Env] working dir refactor
#36953 commented on
Mar 26, 2025 • 0 new comments - revert fix: pin libffi=3.3 for base-deps #33294
#37088 commented on
Mar 26, 2025 • 0 new comments - [core] Remove grpc ClientCallTag
#37140 commented on
Mar 26, 2025 • 0 new comments - [dag] Show both lib dependency installation instructions on import failure.
#37236 commented on
Mar 27, 2025 • 0 new comments - obj scale down
#37687 commented on
Mar 26, 2025 • 0 new comments - [Core] Network benchmark ip
#37810 commented on
Mar 27, 2025 • 0 new comments - [Data] Add support for ORC format
#37891 commented on
Mar 27, 2025 • 0 new comments - Enable mixed docker + non-docker clusters
#37968 commented on
Mar 26, 2025 • 0 new comments - [autoscaler] Use `bash` instead of `/bin/bash`
#38105 commented on
Mar 26, 2025 • 0 new comments - [tune] Fix error when move file with different disk types
#38403 commented on
Mar 27, 2025 • 0 new comments - Add Apple silicon GPU(mps) support to ray
#38464 commented on
Mar 26, 2025 • 0 new comments - fix
#38623 commented on
Mar 26, 2025 • 0 new comments