chore: `Dockerfile-cuda` - Retain major CC when pruning static cuBLAS lib #635
base:main
polarathene commented Jun 13, 2025 (edited)
NOTE: There is no known need to do this for TEI; however, Nvidia encourages retaining the major CC and any minors in-between when using […]. Feel free to close the PR if you prefer to avoid this until there's a relevant bug report.

My understanding is that it should only be an issue when using a kernel from cuBLAS that would defer to […].

For example, in the current base image used to build:

    $ cuobjdump --list-elf /usr/local/cuda/lib64/libcublas_static.a | grep -oE '\.sm_70.*\.' | wc -l
    184
    $ cuobjdump --list-elf /usr/local/cuda/lib64/libcublas_static.a | grep -oE '\.sm_75.*\.' | wc -l
    8

    # Individual cubins:
    $ cuobjdump --list-elf /usr/local/cuda/lib64/libcublas_static.a | grep -E '\.sm_75.*\.'
    ELF file 5: libcublas_static.5.sm_75.cubin
    ELF file 13: libcublas_static.13.sm_75.cubin
    ELF file 21: libcublas_static.21.sm_75.cubin
    ELF file 29: libcublas_static.29.sm_75.cubin
    ELF file 37: libcublas_static.37.sm_75.cubin
    ELF file 45: libcublas_static.45.sm_75.cubin
    ELF file 53: libcublas_static.53.sm_75.cubin
    ELF file 61: libcublas_static.61.sm_75.cubin

I'm not entirely sure why the minor CC versions in-between (when present) might need to be retained.

The concern does not apply to the other two supported real archs handled via text-embeddings-inference/Dockerfile-cuda, lines 57 to 60 in 53eae1b.
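The per-architecture counts above were gathered with one `grep` per arch. As a rough sketch, the same tally could be produced in a single pass by piping the listing through a small filter. The function name `count_cubins_by_arch` is hypothetical, and the sketch assumes the listing names each cubin with a `sm_NN.cubin` suffix, as in the output shown above.

```shell
#!/bin/sh
# Hypothetical helper (not part of this PR): read `cuobjdump --list-elf`
# style lines on stdin and print "<arch> <count>" for each SM arch found.
# Assumes each cubin name ends in "sm_NN.cubin", as in the listing above.
count_cubins_by_arch() {
    grep -oE 'sm_[0-9]+[a-z]?\.cubin' \
        | sed 's/\.cubin$//' \
        | sort \
        | uniq -c \
        | awk '{ print $2, $1 }'
}

# Intended usage inside the CUDA build image (left commented so the
# snippet does not require cuobjdump to be installed):
# cuobjdump --list-elf /usr/local/cuda/lib64/libcublas_static.a | count_cubins_by_arch
```

Running it before and after pruning would give a quick view of which archs survive in the static library.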
What does this PR do?
Pruning cuBLAS for CC 7.5 now also retains `sm_70` in addition to the `sm_75` target. See #610 (comment) for more information.
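To illustrate the rule described here, the following is a minimal sketch, not the actual `Dockerfile-cuda` change. The helper `prune_flags` is hypothetical: given a compute capability as digits (for example `75`), it prints `nvprune`-style `--generate-code` flags for the major CC plus the target. It does not handle any minor versions in-between, and the output file name in the usage comment is an assumption.

```shell
#!/bin/sh
# Hypothetical helper: print --generate-code flags that keep the major CC
# (e.g. sm_70) alongside the requested target (e.g. sm_75).
prune_flags() {
    cc="$1"
    major=$(( cc / 10 * 10 ))   # 75 -> 70, 80 -> 80
    flags=""
    if [ "$major" -ne "$cc" ]; then
        # Target is a minor revision, so also retain its major CC.
        flags="--generate-code code=sm_${major} "
    fi
    printf '%s\n' "${flags}--generate-code code=sm_${cc}"
}

# Intended usage (commented out; needs the CUDA toolkit, and the output
# path here is an assumption):
# nvprune $(prune_flags 75) /usr/local/cuda/lib64/libcublas_static.a -o libcublas_static_pruned.a
```

For a target that is already a major CC, such as `80`, the helper emits a single flag, which matches the note that the concern does not apply to the other supported archs.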