[WIP] IRON host runtime abstraction #2737


Draft

hunhoffe wants to merge 21 commits into main from iron_runtime

Conversation

@hunhoffe (Collaborator) commented Nov 25, 2025 (edited)

This PR begins the process of upstreaming the work others (mostly @pvasireddy-amd and @andrej, I think) did in amd/IRON to create a class that manages XRT kernels with a cache of pre-loaded kernels (https://github.com/amd/IRON/blob/devel/applications/llama_3.2_1b/src/aie_device_manager.py).

The primary goals of this work are to:

  1. Deduplicate logic between the JIT runtime code (e.g., the NPUKernel class in IRON), the XRT helper code (e.g., the AIE_Application class in the aie.utils.xrt module), and the amd/IRON runtime code (e.g., the AIEDeviceManager class in https://github.com/amd/IRON/blob/devel/applications/llama_3.2_1b/src/aie_device_manager.py). This will help maintainability and ensure that efficiency improvements (e.g., in pre-loading, buffer handling, etc.) are consistently applied.
  2. Continue abstracting the specifics of XRT away from the conceptual use of a runtime. This is a follow-on to a previous PR or two on iron.Tensors.

What this PR does not do:

  • I do not expect this PR to introduce a fully complete runtime-management solution; it is an incremental step toward consolidating different code bases with different functionalities. I anticipate further fine-tuning after this PR.

Note: this PR is not yet ready for review.
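To make the deduplication goal concrete, here is a minimal sketch of the kind of kernel-caching runtime helper the PR description alludes to, in the spirit of amd/IRON's AIEDeviceManager. All names here (KernelCache, load_fn) are illustrative assumptions for discussion, not the API this PR introduces.

```python
# Hypothetical sketch of a host-runtime kernel cache. The injected
# load_fn stands in for the real XRT xclbin/kernel loading path.
class KernelCache:
    """Cache loaded kernels so repeated lookups skip reloading."""

    def __init__(self, load_fn):
        self._load_fn = load_fn  # assumption: wraps the actual XRT load
        self._cache = {}
        self.loads = 0  # number of actual (non-cached) loads performed

    def get(self, xclbin_path):
        """Return the kernel for xclbin_path, loading it at most once."""
        if xclbin_path not in self._cache:
            self._cache[xclbin_path] = self._load_fn(xclbin_path)
            self.loads += 1
        return self._cache[xclbin_path]
```

The point of consolidating on one such class is that pre-loading and buffer-handling improvements land in a single place instead of three.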

@hunhoffe changed the title from [WIP] IRON library runtime object to [WIP] IRON host runtime abstraction on Nov 25, 2025
@fifield (Collaborator)

This is great. I'm wondering about the layering. Could non-IRON mlir-aie users like mlir-air use this without importing all of iron?

Comment on lines +40 to +59
if any(
    keyword in device_type_str
    for keyword in [
        "NPU Strix",
        "NPU Strix Halo",
        "NPU Krackan",
        "RyzenAI-npu4",
        "RyzenAI-npu6",
    ]
):
    self._device_type = NPU2
elif any(
    keyword in device_type_str
    for keyword in [
        "NPU",
        "NPU Phoenix",
        "RyzenAI-npu1",
    ]
):
    self._device_type = NPU1
Collaborator

Maybe have this as a stand-alone function?
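For illustration, the string matching from the diff could be lifted into a stand-alone function along these lines. The keyword lists are copied from the PR; NPU1/NPU2 are placeholders here for the actual device types, and the function name is a hypothetical suggestion.

```python
# Sketch of the suggested stand-alone helper. Note the branch order
# matters: the generic "NPU" keyword in the NPU1 list would also match
# the NPU2 device-name strings, so NPU2 must be checked first.
NPU1, NPU2 = "NPU1", "NPU2"  # placeholders for the real device types

_NPU2_KEYWORDS = ("NPU Strix", "NPU Strix Halo", "NPU Krackan",
                  "RyzenAI-npu4", "RyzenAI-npu6")
_NPU1_KEYWORDS = ("NPU", "NPU Phoenix", "RyzenAI-npu1")

def detect_device_type(device_type_str: str):
    """Map an XRT device-name string to a device type, or None."""
    if any(k in device_type_str for k in _NPU2_KEYWORDS):
        return NPU2
    if any(k in device_type_str for k in _NPU1_KEYWORDS):
        return NPU1
    return None
```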

Collaborator (Author)

I believe this function will go away before this PR is finished because this is information we can fetch from the compiler about the device/target model.

I think the enum AIEArch is the key, but I haven't fully finished the implementation of this yet. Thoughts, @fifield?

Collaborator

> I believe this function will go away before this PR is finished because this is information we can fetch from the compiler about the device/target model.
>
> I think the enum AIEArch is the key, but I haven't fully finished the implementation of this yet. Thoughts, @fifield?

You could certainly use AIEArch or AIEDevice enums here. I don't have much else to add other than that this basic logic is already in iron device.py and in aie_lit_utils. Hopefully the string names of things are stable by now, but the fewer places it's parsed the better.

Collaborator

How does the compiler know the device, esp. if there is no device present?

Collaborator (Author)

At the level of an IRON program, you can instantiate a device object in Python based on the aie C++ target model code and use it for generating the MLIR. My current understanding is that the device type actually needs to be present for the MLIR generation, not just the compilation process; aiecc just parses the device name from the mlir-aie device operation. So, for the compiler, the device type is embedded in the input file and not something parsed from the system.

I haven't fully untangled how I want to set defaults in the JIT infrastructure yet, as this is still a work in progress and I am not yet introducing "Compilable" or "Runnable" abstractions (which will eventually fit beneath the JIT frontend).

I'm open to suggestions!

Collaborator

OK, that makes sense.

device_type_str = self._device.get_info(pyxrt.xrt_info_device.name)

# Fetch the device type by matching strings for NPU2 or NPU1
# TODO: how to use only a portion of the device rather than whole array?
Collaborator

You could return a tuple of device and max columns to let users know what the limit is.

@hunhoffe (Collaborator, Author) commented Nov 25, 2025 (edited)

In my mind, you can ask the device object returned by runtime.device() for things like columns and rows, e.g.,

def cols(self) -> int:
So maybe this doesn't need to be a TODO. Thoughts?
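To illustrate the accessor idea above, here is a hypothetical shape for the device object returned by runtime.device(): callers query geometry from it instead of receiving a (device, max_columns) tuple. The class name, fields, and row count are assumptions for discussion; the 4-column figure for npu1 matches the target-model output quoted later in this thread.

```python
# Hypothetical device-info object; not the merged API.
from dataclasses import dataclass

@dataclass(frozen=True)
class DeviceInfo:
    name: str
    _cols: int
    _rows: int  # illustrative; real values come from the target model

    def cols(self) -> int:
        """Number of columns available on this device."""
        return self._cols

    def rows(self) -> int:
        """Number of rows (shim + mem + compute) on this device."""
        return self._rows

# Example instance with an assumed row count:
npu1 = DeviceInfo("npu1", _cols=4, _rows=6)
```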

Collaborator

Do I need to instantiate XRT to ask that?

Collaborator (Author)

The device target model is defined in the aiecc compiler code, I believe. I do not think it is dependent on XRT, although I have never tried it -- if you have a different experience, let me know!

Collaborator

I'm not sure. Right now, I have hardcoded the number of columns based on the device type, but I'd prefer it if IRON had that knowledge.

@fifield (Collaborator) commented Nov 26, 2025 (edited)

The target model bindings only depend on the dialect bindings, not xrt.

>>> from aie.dialects.aie import get_target_model, AIEDevice
>>> get_target_model(AIEDevice.npu1).columns()
4
>>> get_target_model(AIEDevice.npu1_1col).columns()
1

@hunhoffe (Collaborator, Author)

@fifield good question. I think it's tied to IRON unless we moved some of this logic into pyxrt. However, installing the mlir-aie wheels is maybe not too bad anymore?


@hunhoffe (Collaborator, Author)

If I don't want to accidentally destroy the tracing utils with my consolidation, I think I need to do this PR first: #2743


Reviewers

@fifield and @ypapadop-amd left review comments.

Reviews from code owners @stephenneuendorffer, @jgmelber, @jackl-xilinx, @AndraBisca, @andrej, @pvasireddy-amd, and @denolf will be requested when the pull request is marked ready for review.


4 participants: @hunhoffe, @fifield, @ypapadop-amd
