Llama 3.1 8B example recipe for QNN, VitisAI and OpenVINO #1927
Conversation
Pull Request Overview
This pull request updates the Llama 3.1 8B example recipe by adding and modifying configuration files for QNN, VitisAI, and OpenVINO workflows and updating the README with corresponding usage instructions.
- Added a new JSON configuration for QNN-based optimization.
- Introduced a VitisAI configuration update with a new metadata pass.
- Provided an OpenVINO configuration file and updated usage instructions in the README.
Reviewed Changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| examples/llama3_1/qnn_config.json | New config for QNN system using various quantization passes. |
| examples/llama3_1/qdq_config_vitis_ai.json | New config for VitisAI integration with additional metadata pass. |
| examples/llama3_1/openvino/Llama-3.1-8B-Instruct_context_ov_dynamic_sym_bkp_int8_sym.json | New config for OpenVINO optimization targeting NPU devices. |
| examples/llama3_1/README.md | Updated instructions and commands for various optimization modes. |
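For orientation, an Olive workflow config of this kind is a JSON file that declares the input model and a dictionary of named passes to run. The sketch below is only an illustrative skeleton under assumed field values (the `HfModel` type, the pass keys, and the placeholder pass types are not taken from the files in this PR):

```json
{
  "input_model": {
    "type": "HfModel",
    "model_path": "meta-llama/Llama-3.1-8B-Instruct"
  },
  "passes": {
    "q": { "type": "<quantization pass>" },
    "cs": { "type": "<graph/session pass>" }
  },
  "output_dir": "models/llama3-qdq"
}
```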
Comments suppressed due to low confidence (1)
examples/llama3_1/qdq_config_vitis_ai.json:53
- [nitpick] The key 'addmetadata' uses a different naming convention compared to the abbreviated key names used in other passes (e.g., 'q', 'g', 'cs'). Consider renaming it to maintain naming consistency.
"addmetadata": {
- [Quantize, Finetune and Optimize for CPU/CUDA](../getting_started/olive-awq-ft-llama.ipynb)
- [QDQ Model with 4-bit Weights & 16-bit Activations](../phi3_5/README.md):
  - Run the workflow with `olive run --config qdq_config.json -m meta-llama/Llama-3.1-8B-Instruct -o models/llama3-qdq`.
- [AMD NPU: Optimization and Quantization with for VitisAI](../phi3_5/README.md):
The description contains an extra 'for' after 'with'. Consider removing it to improve clarity.
Suggested change: replace
`- [AMD NPU: Optimization and Quantization with for VitisAI](../phi3_5/README.md):`
with
`- [AMD NPU: Optimization and Quantization with VitisAI](../phi3_5/README.md):`
Thanks for adding the example. Since the config is the same as the phi3_5 QDQ config and the README also points to it, this and the qnn_config.json are not needed. We are trying to reuse the same configs to minimize duplicate configs and make updates easier.
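Under that reuse approach, the README entry would simply invoke the existing phi3_5 config rather than a copy, e.g. something like `olive run --config ../phi3_5/qdq_config.json -m meta-llama/Llama-3.1-8B-Instruct -o models/llama3-qdq`; the relative path here is only an assumption about the repo layout, not a command taken from this PR.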
Sure, I will remove it
Update Llama 3.1 8B example recipe for QNN, VitisAI and OpenVINO