Automatically infer the PyTorch index via `--torch-backend=auto`
#12070
Conversation
@charliermarsh : Adapted from a few different sources - namely conda. I hope that illustrates my point better - why you need a plugin interface and you don't want to be the person responsible to maintain that 👍

```python
# Copyright (C) 2012 Anaconda, Inc
# SPDX-License-Identifier: BSD-3-Clause
"""Detect CUDA version."""

import ctypes
import functools
import itertools
import multiprocessing
import os
import platform
from contextlib import suppress
from dataclasses import dataclass
from typing import Optional


@dataclass()
class CudaVersion:
    version: str
    architectures: list[str]


def cuda_version() -> Optional[CudaVersion]:
    # Do not inherit file descriptors and handles from the parent process.
    # The `fork` start method should be considered unsafe as it can lead to
    # crashes of the subprocess. The `spawn` start method is preferred.
    context = multiprocessing.get_context("spawn")
    queue = context.SimpleQueue()

    # Spawn a subprocess to detect the CUDA version
    detector = context.Process(
        target=_cuda_detector_target,
        args=(queue,),
        name="CUDA driver version detector",
        daemon=True,
    )
    try:
        detector.start()
        detector.join(timeout=60.0)
    finally:
        # Always cleanup the subprocess
        detector.kill()  # requires Python 3.7+

    if queue.empty():
        return None

    result = queue.get()
    if result:
        driver_version, architectures = result.split(";")
        result = CudaVersion(driver_version, architectures.split(","))
    return result


@functools.lru_cache(maxsize=None)
def cached_cuda_version():
    return cuda_version()


def _cuda_detector_target(queue):
    """
    Attempt to detect the version of CUDA present in the operating system in
    a subprocess.

    On Windows and Linux, the CUDA library is installed by the NVIDIA driver
    package, and is typically found in the standard library path, rather than
    with the CUDA SDK (which is optional for running CUDA apps).

    On macOS, the CUDA library is only installed with the CUDA SDK, and might
    not be in the library path.

    Returns: version string with CUDA version first, then a set of unique SM's
             for the GPUs present in the system (e.g., '12.4;8.6,9.0') or None
             if CUDA is not found. The result is put in the queue rather than
             a return value.
    """
    # Platform-specific libcuda location
    system = platform.system()
    if system == "Darwin":
        lib_filenames = [
            "libcuda.1.dylib",  # check library path first
            "libcuda.dylib",
            "/usr/local/cuda/lib/libcuda.1.dylib",
            "/usr/local/cuda/lib/libcuda.dylib",
        ]
    elif system == "Linux":
        lib_filenames = [
            "libcuda.so",  # check library path first
            "/usr/lib64/nvidia/libcuda.so",  # RHEL/Centos/Fedora
            "/usr/lib/x86_64-linux-gnu/libcuda.so",  # Ubuntu
            "/usr/lib/wsl/lib/libcuda.so",  # WSL
        ]
        # Also add libraries with version suffix `.1`
        lib_filenames = list(
            itertools.chain.from_iterable((f"{lib}.1", lib) for lib in lib_filenames)
        )
    elif system == "Windows":
        bits = platform.architecture()[0].replace("bit", "")  # e.g. "64" or "32"
        lib_filenames = [f"nvcuda{bits}.dll", "nvcuda.dll"]
    else:
        queue.put(None)  # CUDA not available for other operating systems
        return

    # Open library
    if system == "Windows":
        dll = ctypes.windll
    else:
        dll = ctypes.cdll
    for lib_filename in lib_filenames:
        with suppress(Exception):
            libcuda = dll.LoadLibrary(lib_filename)
            break
    else:
        queue.put(None)
        return

    # Empty `CUDA_VISIBLE_DEVICES` can cause `cuInit()` returns `CUDA_ERROR_NO_DEVICE`
    # Invalid `CUDA_VISIBLE_DEVICES` can cause `cuInit()` returns `CUDA_ERROR_INVALID_DEVICE`
    # Unset this environment variable to avoid these errors
    os.environ.pop("CUDA_VISIBLE_DEVICES", None)

    # Get CUDA version
    try:
        cuInit = libcuda.cuInit
        flags = ctypes.c_uint(0)
        ret = cuInit(flags)
        if ret != 0:
            queue.put(None)
            return

        cuDriverGetVersion = libcuda.cuDriverGetVersion
        version_int = ctypes.c_int(0)
        ret = cuDriverGetVersion(ctypes.byref(version_int))
        if ret != 0:
            queue.put(None)
            return

        # Convert version integer to version string
        value = version_int.value
        version_value = f"{value // 1000}.{(value % 1000) // 10}"

        count = ctypes.c_int(0)
        libcuda.cuDeviceGetCount(ctypes.pointer(count))

        architectures = set()
        for device in range(count.value):
            major = ctypes.c_int(0)
            minor = ctypes.c_int(0)
            libcuda.cuDeviceComputeCapability(
                ctypes.pointer(major), ctypes.pointer(minor), device
            )
            architectures.add(f"{major.value}.{minor.value}")
        queue.put(f"{version_value};{','.join(architectures)}")
    except Exception:
        queue.put(None)
        return


if __name__ == "__main__":
    print(cuda_version())
```
crates/uv-torch/src/lib.rs (Outdated)

```rust
| "torchserve"
| "torchtext"
| "torchvision"
| "pytorch-triton"
```
Can we add this list to some documentation? Reading the high-level overview I didn't realize we were hardcoding a package list.
Can we generate this by querying the PyTorch indices to see what they have? (Maybe a manually-run script that queries them and updates this list, or an automatically-run integration test that makes sure this list is in sync with what's currently on their indices?)
Along those lines it would be helpful to have this list somewhere declarative. It might also be helpful to allow user-controlled overrides of this list if the set of packages changes.
Unfortunately I don't know that we can... We don't want *all* packages on these indexes, because they include things like `jinja2`. And in some cases, they include incomplete packages like `markupsafe` (where they only have a few wheels).
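A minimal sketch of the "manually-run script" idea above (not part of this PR): fetch a PyTorch index's PEP 503 simple page and diff the advertised project names against the hardcoded list. Since, as the reply above notes, the indexes also host packages like `jinja2` that shouldn't be routed there, the sketch only reports drift for manual review rather than syncing automatically. The index URL, the HTML parsing, and the excerpted hardcoded list are assumptions for illustration.

```python
# Sketch of a manually-run sync check (assumption: the PyTorch indexes serve a
# PEP 503 "simple" index whose root page contains one <a> anchor per project).
import re
import urllib.request

INDEX_URL = "https://download.pytorch.org/whl/cpu/"  # assumed backend index URL

# Excerpt of the hardcoded list for illustration; the real list is longer.
HARDCODED = {"torch", "torchserve", "torchtext", "torchvision", "pytorch-triton"}


def list_index_projects(url: str) -> set[str]:
    """Return the project names advertised on a simple-index root page."""
    with urllib.request.urlopen(url, timeout=30) as response:
        html = response.read().decode("utf-8", errors="replace")
    # PEP 503 pages link each project as <a href="...">name</a>.
    return {m.group(1).strip().lower() for m in re.finditer(r"<a[^>]*>([^<]+)</a>", html)}


if __name__ == "__main__":
    available = list_index_projects(INDEX_URL)
    print("on the index but not hardcoded:", sorted(available - HARDCODED))
    print("hardcoded but not on the index:", sorted(HARDCODED - available))
```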
I think this is a great idea.

Would it be worth naming this feature something like `uv-specialized-index` instead of `uv-torch`, with an eye to extending it to other libraries in the future (jaxlib and tensorflow, for instance, have current/popular versions on PyPI, but I think also have their own indexes)?
I had a similar thought; I think this is one of many cases. Also considering when such indexes are mirrored or vendored internally. I was thinking about what the right naming would be. I know some avenues refer to this as a …
Nevermind, didn't notice you were referring to CUDA.
💯 In my experience nvidia-smi can also take a long time depending on GPU load, and there are multiple locations depending on how it's installed (e.g. dkms) and the environment (Windows, macOS). For example, on WSL 2 it's even weirder due to the shared drivers with the host. So nvidia-smi might be the most sure-fire, low-risk way (assuming no issues with the install).
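To illustrate the timeout concern (this is not uv's implementation, which is in Rust): a sketch of querying the driver version via `nvidia-smi` with a hard timeout, so a slow or wedged `nvidia-smi` degrades to "no GPU detected" instead of hanging. The query flags are standard `nvidia-smi` options; the timeout value and fallback behavior are assumptions.

```python
# Minimal sketch (assumptions: `nvidia-smi` is on PATH and supports
# `--query-gpu=driver_version`; the 5-second timeout is an arbitrary guard).
import shutil
import subprocess
from typing import Optional


def detect_driver_version(timeout: float = 5.0) -> Optional[str]:
    """Return the NVIDIA driver version reported by nvidia-smi, or None."""
    if shutil.which("nvidia-smi") is None:
        return None
    try:
        result = subprocess.run(
            ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
            capture_output=True,
            text=True,
            timeout=timeout,  # guard against nvidia-smi stalling under GPU load
        )
    except subprocess.TimeoutExpired:
        return None
    if result.returncode != 0:
        return None
    # One line per GPU; take the first.
    first_line = result.stdout.splitlines()[0].strip() if result.stdout else ""
    return first_line or None
```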
Definitely agree with moving this out of the interpreter query (and possibly reading it from outside …)

I'm a little wary of trying to brand this as something more general than …
Deferring to @geofft for the new detect logic.
```rust
Ok(None) => {
    debug!("Failed to parse CUDA driver version from `/proc/driver/nvidia/version`");
}
```
Should this case return an error instead of falling through to `nvidia-smi`?
I'm not confident enough in the format of this one... It seems like it varies across machines.
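For context, a sketch of a tolerant `/proc/driver/nvidia/version` parse that returns `None` on any unfamiliar layout, which is what makes falling through to `nvidia-smi` reasonable. The example line format in the docstring is an assumption; as noted above, the file's wording varies across driver builds.

```python
# Illustration only (not uv's implementation): best-effort parse of the driver
# version from /proc/driver/nvidia/version, returning None when the layout is
# unfamiliar so the caller can fall back to nvidia-smi.
import re
from pathlib import Path
from typing import Optional

PROC_PATH = Path("/proc/driver/nvidia/version")


def driver_version_from_proc() -> Optional[str]:
    """Parse the NVRM line, which often looks like (assumed example):
    'NVRM version: NVIDIA UNIX x86_64 Kernel Module  550.54.14  ...'."""
    try:
        text = PROC_PATH.read_text()
    except OSError:
        return None
    for line in text.splitlines():
        if not line.startswith("NVRM version:"):
            continue
        # Grab the first dotted version-looking token on the line.
        match = re.search(r"\b(\d+\.\d+(?:\.\d+)?)\b", line)
        if match:
            return match.group(1)
    return None  # unfamiliar layout: let the caller fall back to nvidia-smi
```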
```rust
if output.status.success() {
    let driver_version = Version::from_str(&String::from_utf8(output.stdout)?)?;
    debug!("Detected CUDA driver version from `nvidia-smi`: {driver_version}");
    return Ok(Some(Self::Cuda { driver_version }));
}
```
`else { debug!("nvidia-smi returned error {output.status}: {output.stderr}") }` might be nice.
Summary

This is a prototype that I'm considering shipping under `--preview`, based on `light-the-torch`. `light-the-torch` patches pip to pull PyTorch packages from the PyTorch indexes automatically. And, in particular, `light-the-torch` will query the installed CUDA drivers to determine which indexes are compatible with your system.

This PR implements equivalent behavior under `--torch-backend auto`, though you can also set `--torch-backend cpu`, etc., for convenience. When enabled, the registry client will fetch from the appropriate PyTorch index when it sees a package from the PyTorch ecosystem (and ignore any other configured indexes, unless the package is explicitly pinned to a different index).

Right now, this is only implemented in the `uv pip` CLI, since it doesn't quite fit into the lockfile APIs given that it relies on feature detection on the currently-running machine.

Test Plan
On macOS, you can test this with (e.g.):
On a GPU-enabled EC2 machine: