[WIP] IRON host runtime abstraction #2737
Conversation
fifield commented Nov 25, 2025
This is great. I'm wondering about the layering. Could non-IRON mlir-aie users like mlir-air use this without importing all of iron?
```python
if any(
    keyword in device_type_str
    for keyword in [
        "NPU Strix",
        "NPU Strix Halo",
        "NPU Krackan",
        "RyzenAI-npu4",
        "RyzenAI-npu6",
    ]
):
    self._device_type = NPU2
elif any(
    keyword in device_type_str
    for keyword in [
        "NPU",
        "NPU Phoenix",
        "RyzenAI-npu1",
    ]
):
    self._device_type = NPU1
```
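Pulled out as a stand-alone function, the detection above might look like the following sketch. The `NPU1`/`NPU2` sentinels are placeholders for whatever device-type objects the runtime actually uses, and the keyword lists are copied from the snippet, so treat this as illustrative rather than the PR's final API:

```python
# Placeholder sentinels; the real code would use the runtime's device classes.
NPU1 = "NPU1"
NPU2 = "NPU2"

# Substrings identifying each generation. NPU2 is checked first because the
# bare "NPU" keyword in the NPU1 list would otherwise match everything.
_NPU2_KEYWORDS = ["NPU Strix", "NPU Strix Halo", "NPU Krackan", "RyzenAI-npu4", "RyzenAI-npu6"]
_NPU1_KEYWORDS = ["NPU Phoenix", "RyzenAI-npu1", "NPU"]


def detect_device_type(device_type_str: str) -> str:
    """Map an XRT device-name string to a device-type sentinel."""
    if any(keyword in device_type_str for keyword in _NPU2_KEYWORDS):
        return NPU2
    if any(keyword in device_type_str for keyword in _NPU1_KEYWORDS):
        return NPU1
    raise ValueError(f"Unrecognized device type string: {device_type_str!r}")
```

A stand-alone function like this could also be shared with the other places that currently parse these strings.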
Maybe have this as a stand-alone function?
I believe this function will go away before this PR is finished, because this is information we can fetch from the compiler about the device/target model.
I think the enum AIEArch is the key, but I haven't fully finished the implementation yet. Thoughts, @fifield?
You could certainly use the AIEArch or AIEDevice enums here. I don't have much else to add other than that this basic logic already exists in iron's device.py and in aie_lit_utils. Hopefully the string names of things are stable by now, but the fewer places they're parsed, the better.
How does the compiler know the device, esp. if there is no device present?
At the level of an iron program, you can instantiate a device object in Python based on the aie C++ target model code and use it for generating the MLIR. My current understanding is that the device type actually needs to be present for the MLIR generation, not just the compilation process; aiecc just parses the device name from the mlir-aie device operation. So, for the compiler, the device type is embedded in the input file and not something parsed from the system.
I haven't fully untangled how I want to set defaults in the JIT infrastructure, as this is still a work in progress and I am not yet introducing "Compilable" or "Runnable" abstractions (which will eventually fit beneath the JIT frontend).
I'm open to suggestions!
OK, that makes sense.
```python
device_type_str = self._device.get_info(pyxrt.xrt_info_device.name)
# Fetch the device type by matching strings for NPU2 or NPU1
# TODO: how to use only a portion of the device rather than whole array?
```
You could return a tuple of device and max columns to let users know what the limit is.
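For illustration, that tuple-returning idea might look like the sketch below. The matching rules and the column counts in `_MAX_COLS` are assumptions made for the example, not values taken from the runtime:

```python
# Hypothetical mapping from device type to maximum column count; the
# numbers are illustrative assumptions, not authoritative values.
_MAX_COLS = {"NPU1": 4, "NPU2": 8}


def detect_device(device_type_str: str) -> tuple[str, int]:
    """Return (device_type, max_columns) for an XRT device-name string."""
    if "npu4" in device_type_str or "Strix" in device_type_str:
        device_type = "NPU2"
    else:
        device_type = "NPU1"
    return device_type, _MAX_COLS[device_type]
```

Under these assumptions, `detect_device("RyzenAI-npu4")` returns `("NPU2", 8)`, so callers learn the column limit at the same time as the device type.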
In my mind, you can ask the device object returned by runtime.device() for things like columns and rows, e.g., mlir-aie/python/iron/device/device.py, line 94 in a81f682:

```python
def cols(self) -> int:
```
So maybe this doesn't need to be a TODO. Thoughts?
Do I need to instantiate XRT to ask that?
The device target model is defined in the aiecc compiler code, I believe. I do not think it is dependent on XRT, although I have never tried it -- if you have a different experience, let me know!
I'm not sure. Right now, I have hardcoded the number of columns based on the device type but I'd prefer if IRON has that knowledge.
The target model bindings only depend on the dialect bindings, not xrt.
```python
>>> from aie.dialects.aie import get_target_model, AIEDevice
>>> get_target_model(AIEDevice.npu1).columns()
4
>>> get_target_model(AIEDevice.npu1_1col).columns()
1
```
hunhoffe commented Nov 25, 2025
@fifield good question. I think it's tied to IRON unless we moved some of this logic to pyxrt. However, installing the mlir-aie wheels is maybe not too bad anymore?
hunhoffe commented Nov 26, 2025
If I don't want to accidentally destroy the tracing utils with my consolidation, I think I need to do this PR first: #2743
This PR begins the process of taking the work others (mostly @pvasireddy-amd and @andrej, I think) did to create a class that manages XRT kernels with a cache of pre-loaded kernels in amd/IRON (https://github.com/amd/IRON/blob/devel/applications/llama_3.2_1b/src/aie_device_manager.py).
The primary goals of this work are to:
- Consolidate logic between the mlir-aie runtime code (e.g., the AIE_Application class in the aie.utils.xrt module) and the amd/IRON runtime code (e.g., the AIEDeviceManager class in https://github.com/amd/IRON/blob/devel/applications/llama_3.2_1b/src/aie_device_manager.py). This will help maintainability and ensure improvements made for efficiency (e.g., in pre-loading, buffer handling, etc.) are consistently used.
- … iron.Tensors.

What this PR does not do:
Note: this PR is not yet ready for review.