Use CDI for GPU injection for AMD devices for --gpus #52048

Open
shiv-tyagi wants to merge 1 commit into moby:master from shiv-tyagi:vendor-detection

shiv-tyagi commented Feb 16, 2026

Closes #49824

This PR enhances the --gpus option for AMD GPUs by using CDI (Container Device Interface) specs for device injection when they are available. It falls back to the existing vendor-runtime-based injection if AMD CDI specs are not detected on the machine.

Related PR: containerd/containerd#12839 (similar implementation for containerd/ctr)

- What I did
Added support for CDI-based GPU device injection through the --gpus option for AMD devices.

- How I did it
Created a composite device driver, similar to NVIDIA's, that checks during registration whether AMD's CDI specs are present on the system and registers itself with the appropriate updaters to handle the device request.
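
For readers unfamiliar with the approach, the choice between CDI injection and the vendor runtime can be sketched roughly as follows. This is a minimal illustration and not the code in this PR: deviceLister, staticLister, hasGPUDevices and chooseInjection are hypothetical names, and the device kind "amd.com/gpu" is an assumption about how the AMD CDI specs name their devices.

```go
package main

import (
	"fmt"
	"strings"
)

// deviceLister is a hypothetical stand-in for the daemon's CDI cache. The only
// capability assumed here is listing fully qualified CDI device names such as
// "vendor.com/class=name".
type deviceLister interface {
	ListDevices() []string
}

// staticLister is a fake lister used for illustration.
type staticLister []string

func (s staticLister) ListDevices() []string { return s }

// hasGPUDevices reports whether the lister exposes at least one device of the
// given kind. The kind "amd.com/gpu" used below is an assumption.
func hasGPUDevices(l deviceLister, kind string) bool {
	if l == nil {
		return false
	}
	for _, name := range l.ListDevices() {
		if strings.HasPrefix(name, kind+"=") {
			return true
		}
	}
	return false
}

// chooseInjection returns "cdi" when AMD CDI devices are visible and
// "runtime" (the vendor runtime fallback) otherwise.
func chooseInjection(l deviceLister) string {
	if hasGPUDevices(l, "amd.com/gpu") {
		return "cdi"
	}
	return "runtime"
}

func main() {
	fmt.Println(chooseInjection(staticLister{"amd.com/gpu=all", "amd.com/gpu=0"})) // prints "cdi"
	fmt.Println(chooseInjection(staticLister{}))                                   // prints "runtime"
}
```

The sketch only models the decision itself; how the real driver registers its updaters is not shown here.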

- How to verify it

  1. Built the binaries using make binary.
  2. Started the newly built dockerd instance via ./bundles/binary/dockerd.
  3. Used amd-ctk cdi generate to install the CDI specs on the host.
  4. Test Injection: Ran docker run --rm --gpus all rocm/rocm-terminal rocm-smi.
    • Verified that CDI-based GPU injection works as expected.
  5. Test (Fallback Path):
    • Deleted the CDI specs and restarted the dockerd process.
    • Retried using the runtime flag: docker run --rm --runtime="amd" --gpus all rocm/rocm-terminal rocm-smi.
    • Verified that the vendor runtime still works as a fallback when CDI specs are absent.
    • Checked that AMD_VISIBLE_DEVICES is set inside the container when the CDI specs are absent, confirming that the fallback path is in use.
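
The last check in step 5 could be automated along these lines. This is only a sketch: lookupEnv is a hypothetical helper, and the sample output (including the value "all") is made up for illustration rather than captured from a real container.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// lookupEnv scans `env`-style output (one KEY=VALUE pair per line) and
// returns the value of key, plus whether the key was present at all.
func lookupEnv(envOutput, key string) (string, bool) {
	sc := bufio.NewScanner(strings.NewReader(envOutput))
	for sc.Scan() {
		k, v, ok := strings.Cut(sc.Text(), "=")
		if ok && k == key {
			return v, true
		}
	}
	return "", false
}

func main() {
	// Made-up sample of what `env` might print inside the container on the
	// fallback path; the real values depend on the vendor runtime.
	out := "PATH=/usr/bin\nAMD_VISIBLE_DEVICES=all\n"
	v, ok := lookupEnv(out, "AMD_VISIBLE_DEVICES")
	fmt.Println(v, ok) // prints "all true"
}
```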

I have also added unit tests for vendor discovery function.

- Human readable description for the release notes

The `--gpus` option now supports CDI-based injection for AMD GPUs.

- A picture of a cute animal (not mandatory but encouraged)

github-actions bot added the area/daemon (Core Engine) label Feb 16, 2026

// Try to detect AMD GPU vendor via CDI cache if cdiCache is available
if cdiCache != nil {
	vendor, err := discoverGPUVendorFromCDI(cdiCache)
Contributor


One thing about this approach ... this only checks whether the cache includes AMD CDI devices at the point where the daemon is reloaded. In contrast to the other projects where we have added this functionality, the cache here is started with AutoRefresh enabled, meaning that the CDI spec directories are watched for changes to ensure that specs for new devices are detected.

With that in mind, the drivers that one wants to register would have to determine the vendor from the cache for every --gpus request and not only once at startup.

Author


Yes, makes sense.

I have updated the logic to always discover the vendor from the CDI registry on the fly, since the registry is auto-refreshed. I have also verified that this works as expected by deleting the CDI files while the daemon is running and confirming that vendor discovery fails in that case.

Thanks for the suggestion.
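
The per-request lookup discussed in this thread can be sketched as below. It is an illustration under stated assumptions, not the PR's implementation: refreshingCache simulates an auto-refreshing CDI cache, gpuDriver and handleRequest are hypothetical names, and "amd.com/gpu" is an assumed device kind.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// refreshingCache is a hypothetical stand-in for a CDI cache whose contents
// can change while the daemon is running (as with auto-refresh enabled).
type refreshingCache struct {
	mu      sync.RWMutex
	devices []string
}

// set simulates the cache picking up added or removed spec files.
func (c *refreshingCache) set(devices ...string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.devices = append([]string(nil), devices...)
}

// ListDevices returns a copy of the currently known device names.
func (c *refreshingCache) ListDevices() []string {
	c.mu.RLock()
	defer c.mu.RUnlock()
	out := make([]string, len(c.devices))
	copy(out, c.devices)
	return out
}

// gpuDriver keeps a reference to the cache instead of a vendor decision
// computed once at startup.
type gpuDriver struct {
	cache *refreshingCache
}

// handleRequest consults the cache on every call, so spec files added or
// deleted after startup change the outcome of the next request.
func (d *gpuDriver) handleRequest() string {
	for _, name := range d.cache.ListDevices() {
		if strings.HasPrefix(name, "amd.com/gpu=") { // assumed device kind
			return "cdi"
		}
	}
	return "runtime"
}

func main() {
	cache := &refreshingCache{}
	d := &gpuDriver{cache: cache}
	fmt.Println(d.handleRequest()) // prints "runtime": no specs yet
	cache.set("amd.com/gpu=all")
	fmt.Println(d.handleRequest()) // prints "cdi": specs appeared after startup
	cache.set()
	fmt.Println(d.handleRequest()) // prints "runtime": specs were deleted
}
```

A driver that cached the result of the first lookup would keep answering "runtime" in the second call, which is the behavior the review comment warns about.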

vvoland added the kind/enhancement and area/cdi labels Feb 16, 2026
vvoland added this to the 29.3.0 milestone Feb 16, 2026
Signed-off-by: Shiv Tyagi <Shiv.Tyagi@amd.com>

Reviewers

elezar left review comments


Assignees

No one assigned

Labels

area/cdi, area/daemon (Core Engine), area/testing, kind/enhancement

Projects

None yet

Milestone

29.3.0

Development

Successfully merging this pull request may close these issues.

re-implement --gpus flag using CDI (was "AMD GPU support")

3 participants

@shiv-tyagi, @elezar, @vvoland
