"passes": {"conversion": {"device":"cpu","type":"OnnxConversion","target_opset":17,"use_dynamo_exporter":false    },"to_fixed_shape": {"type":"DynamicToFixedShape","dim_param": ["batch_size","sequence_length"],"dim_value": [1,77]    },"quantization": {"type":"QuarkQuantization","data_config":"calib_data","config_template":"XINT8","enable_npu_transformer":true,"extra_options": {"OpTypesToExcludeOutputQuantization": ["MatMul","Gemm"],"ActivationSymmetric":true      },"debug_mode":true,"log_severity_level":0,"ignore_warnings":false    }  }

Please refer to https://quark.docs.amd.com/latest/onnx/user_guide_config_description.html for the complete list of config_template options. All other quantization options are listed in https://quark.docs.amd.com/latest/onnx/appendix_full_quant_config_features.html.
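
For readers who prefer to generate the workflow fragment programmatically, the passes section from the description can be assembled as a plain Python dictionary and serialized with the standard json module. This is only a sketch: the pass settings are copied from the description, while the helper name fixed_shape_mapping and its length check are illustrative assumptions, not part of Olive or Quark.

```python
import json

# Pass settings copied from the PR description; nothing here calls Olive itself.
passes = {
    "conversion": {
        "device": "cpu",
        "type": "OnnxConversion",
        "target_opset": 17,
        "use_dynamo_exporter": False,
    },
    "to_fixed_shape": {
        "type": "DynamicToFixedShape",
        "dim_param": ["batch_size", "sequence_length"],
        "dim_value": [1, 77],
    },
    "quantization": {
        "type": "QuarkQuantization",
        "data_config": "calib_data",
        "config_template": "XINT8",
        "enable_npu_transformer": True,
        "extra_options": {
            "OpTypesToExcludeOutputQuantization": ["MatMul", "Gemm"],
            "ActivationSymmetric": True,
        },
        "debug_mode": True,
        "log_severity_level": 0,
        "ignore_warnings": False,
    },
}


def fixed_shape_mapping(pass_config):
    """Hypothetical helper: pair each dynamic dimension name with its fixed value."""
    names = pass_config["dim_param"]
    values = pass_config["dim_value"]
    if len(names) != len(values):
        raise ValueError("dim_param and dim_value must have the same length")
    return dict(zip(names, values))


# Sanity-check the shape settings, then serialize the fragment as JSON text.
shape_map = fixed_shape_mapping(passes["to_fixed_shape"])
workflow_fragment = json.dumps({"passes": passes}, indent=2)
print(workflow_fragment)
```

The printed text is only the passes portion of a workflow file; the model, data config and evaluator sections still have to be supplied separately.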

Examples

Two ResNet examples are added to examples/vai. They convert the models using Quark and then evaluate them on VitisAIExecutionProvider (run on NPU, RyzenAI 1.3.1, onnxruntime-vitisai 1.19).

Checklist before requesting a review

Add unit tests for this change.
Make sure all tests can pass.
Update documents if necessary.
Lint and apply fixes to your code by running lintrunner -a
Is this a user-facing change? If yes, give a description of this change to be included in the release notes.
Is this PR including examples changes? If yes, please remember to update example documentation in a follow-up PR.

(Optional) Issue link

Yi Ren added 2 commits

March 27, 2025 11:16

replace deprecated enable_dpu with enable_npu_cnn and enable_npu_transformer for VAI

56b873d

use quark in VitisAIQuantization; expose params

469e225

github-advanced-security bot found potential problems

Mar 27, 2025

olive/passes/onnx/vitis_ai_quantization.py Fixed

configure VitisAIExecutionProvider

1646eb7

vortex-captain marked this pull request as ready for review

March 27, 2025 05:56

github-advanced-security bot found potential problems

Mar 27, 2025

Contributor

github-advanced-security bot left a comment

lintrunner found more than 20 potential problems in the proposed changes. Check the Files changed tab for more details.

fix linter issues

32f22ae

jambayk reviewed

Mar 27, 2025

olive/passes/onnx/vitis_ai_quantization.py Resolved

jambayk reviewed

Mar 27, 2025

olive/common/ort_inference.py Resolved

jambayk reviewed

Mar 27, 2025

olive/common/ort_inference.py Outdated, Resolved

jambayk reviewed

Mar 27, 2025

olive/common/ort_inference.py Outdated, Resolved

use pathlib.Path; fix linter issues

e5cb11d

github-advanced-security bot found potential problems

Mar 28, 2025

olive/common/ort_inference.py Fixed

Yi Ren added 3 commits

March 28, 2025 10:41

remove old vitis_ai code

c569355

use Olive cache for VitisAI EP

c4d1543

fix linter

9c36257

github-advanced-security bot found potential problems

Mar 28, 2025

olive/common/ort_inference.py Fixed

fix linter

9844278

vortex-captain requested a review from jambayk

March 28, 2025 03:55

Contributor

ChaoLi-AMD commented Apr 1, 2025

Describe your changes
Example usage in Olive workflow json:
"passes": {
  "conversion": {
    "device": "cpu",
    "type": "OnnxConversion",
    "target_opset": 17,
    "use_dynamo_exporter": false
  },
  "quantization": {
    "type": "VitisAIQuantization",
    "data_config": "calib_data",
    "config_template": "INT8_TRANSFORMER_ACCURATE",
    "extra_options": {
      "OpTypesToExcludeOutputQuantization": ["MatMul", "Gemm"],
      "ActivationSymmetric": true
    },
    "debug_mode": true,
    "log_severity_level": 0,
    "ignore_warnings": false
  }
}
Checklist before requesting a review
Add unit tests for this change.
Make sure all tests can pass.
Update documents if necessary.
Lint and apply fixes to your code by running lintrunner -a
Is this a user-facing change? If yes, give a description of this change to be included in the release notes.
Is this PR including examples changes? If yes, please remember to update example documentation in a follow-up PR.
(Optional) Issue link

Please improve the example to refer to the Quark documentation: https://quark.docs.amd.com/latest/supported_accelerators/ryzenai/index.html

ChaoLi-AMD reviewed

Apr 1, 2025

olive/passes/onnx/vitis_ai_quantization.py Resolved

Contributor

xiaoyu-work commented Apr 1, 2025

/azp run

azure-pipelines bot commented Apr 1, 2025

Azure Pipelines successfully started running 1 pipeline(s).

xiaoyu-work reviewed

Apr 7, 2025

olive/passes/onnx/vitis_ai_quantization.py Outdated, Resolved

xiaoyu-work reviewed

Apr 7, 2025

olive/cache.py Outdated

		@@ -40,6 +40,7 @@ class CacheSubDirs:
		evaluations: Path
		resources: Path
		mlflow: Path
		vitis_ai: Path

Contributor

Can you explain more about how you will use this folder? The cache folder is designed to be pass-agnostic, so I want to double-check the use case here.

Author

The folder will be created at the beginning of the evaluation step, when a VitisAIExecutionProvider inference session is created (the EP uses it as a model cache). Is evaluation considered an Olive pass?

Contributor

xiaoyu-work Apr 17, 2025

No. If the VitisAI EP needs to cache a model for evaluation, can we create a temporary folder for it? It would be deleted afterwards; I assume this model cache is not needed once the workflow finishes. We could create a temporary folder in cache.evaluations, named temp_model_cache or something similar.
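
The temporary-folder idea suggested above could look roughly like the sketch below. It is only an illustration built on Python's standard tempfile module: the function name run_with_temp_model_cache, the evaluations_dir argument and the run_evaluation callback are hypothetical and are not part of Olive's cache API.

```python
import tempfile
from pathlib import Path


def run_with_temp_model_cache(evaluations_dir, run_evaluation):
    """Run an evaluation callback with a throwaway model-cache directory.

    The directory is created under evaluations_dir and removed when the
    callback returns or raises. Both arguments are illustrative assumptions.
    """
    evaluations_dir = Path(evaluations_dir)
    evaluations_dir.mkdir(parents=True, exist_ok=True)
    with tempfile.TemporaryDirectory(
        prefix="temp_model_cache_", dir=str(evaluations_dir)
    ) as tmp:
        # The callback would point the execution provider at this directory.
        return run_evaluation(Path(tmp))
```

A caller would pass the evaluations directory of the workflow cache together with a function that creates the inference session using the given path as its model cache. Because the context manager deletes the directory on exit, nothing is left behind after evaluation.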

Contributor

shaahji commented Apr 9, 2025

Update the entry in olive_config.json to point to the correct location of the pass implementation in the module. Many of the tests are failing because of the wrong entry.

Yi Ren added 3 commits

April 17, 2025 13:14

Merge branch 'main' into reny/add_quark

36c5ea9

use optional as input types

b7fc958

vai_q_onnx -> quark

c426989

Author

vortex-captain commented Apr 17, 2025

Added links of Quark documentation on quantization configurations

vortex-captain requested review from shaahji and ChaoLi-AMD and removed request for ChaoLi-AMD

April 17, 2025 07:31

ChaoLi-AMD reviewed

Apr 17, 2025

olive/passes/onnx/vitis_ai_quantization.py Outdated, Resolved

Contributor

ChaoLi-AMD commented Apr 17, 2025

For a Ryzen AI example, please use XINT8 as the example instead of INT8_TRANSFORMER_ACCURATE. Just checking: is this example currently runnable on Olive?

jambayk reviewed

Apr 17, 2025

olive/passes/onnx/vitis_ai/__init__.py Outdated, Resolved

Contributor

jambayk commented Apr 17, 2025

@vortex-captain please create a copy of your branch directly in this repo and open a new PR to be able to run the CI without the login issue.

xiaoyu-work mentioned this pull request

Apr 17, 2025

Copy of #1715 #1763

Closed

6 tasks

remove examples/resnet/resnet_vitis_ai_ptq_cpu.json

40ad7d5

Author

vortex-captain commented Apr 18, 2025

Updated example in description. And yes, such an example (BERT text model) is runnable on Olive, but in evaluation, the output model cannot run on NPU (all nodes assigned to CPU), unlike the ResNet examples. Any insights?

Yi Ren added 3 commits

April 22, 2025 17:55

vitisai -> Quark

498b59a

rename jsons

7c62a0c

Merge branch 'main' into reny/add_quark

6645663

github-advanced-security bot found potential problems

Apr 22, 2025

olive/passes/onnx/quark_quantization.py Fixed

github-advanced-security bot found potential problems

Apr 22, 2025

examples/vai/image.py Fixed

examples/vai/ms_resnet_50_vitis_ai_ptq_npu.json Fixed

olive/passes/onnx/quark_quantization.py Fixed

ChaoLi-AMD reviewed

Apr 22, 2025

docs/source/features/quantization.md Outdated, Resolved

ChaoLi-AMD reviewed

Apr 22, 2025

docs/source/reference/options.md Outdated, Resolved

gengxinwu reviewed

Apr 23, 2025

docs/source/features/quantization.md Resolved

examples/resnet/resnet_vitis_ai_ptq_cpu.json Outdated, Resolved

examples/vai/image.py Outdated, Resolved

olive/passes/onnx/quark_quantization.py Resolved

chinazhangchao and others added 2 commits

April 27, 2025 10:20

Merge branch 'microsoft:main' into reny/add_quark

40541c8

fix lint

d1ebcca

github-advanced-security bot found potential problems

Apr 27, 2025

examples/vai/image.py Fixed

olive/passes/onnx/quark_quantization.py Fixed

fix comments

42c8620

github-advanced-security bot found potential problems

Apr 27, 2025

examples/resnet/image.py Fixed

examples/resnet/resnet_50_vitis_ai_ptq_npu.json Fixed

fix comments

9763389

github-advanced-security bot found potential problems

Apr 27, 2025

examples/resnet/resnet_vitis_ai_ptq_npu.json Fixed

chinazhangchao and others added 4 commits

April 27, 2025 17:49

fix cache folder comments

042c553

remove model cache

a80f9e1

fix lint

0a4b9ab

remove ununsed test

1b5d4d9

VishalX suggested changes

May 7, 2025

olive/common/ort_inference.py

Comment on lines +157 to +163

		elif provider == "VitisAIExecutionProvider":
		    import os

		    apu_type = get_vai_apu_type()
		    set_vai_environment_variable(apu_type)
		    install_dir = Path(os.environ["RYZEN_AI_INSTALLATION_PATH"])
		    provider_options[idx]["config_file"] = str(install_dir / "voe-4.0-win_amd64" / "vaip_config.json")