Commit 6abded4
[5455919] Fix Q/DQ/Cast placement in 'FP32 required' custom ops (#554)
## What does this PR do?

**Type of change:** Bug fix

**Overview:** Fix incorrect quantization of custom ops when some input tensors are required to be in INT8 and some in FP32.

| Before fix | After fix |
|----------------|-------------|
| <img width="841" height="623" alt="snap_custom_op_quant_incorrect" src="https://github.com/user-attachments/assets/88e4d460-fbae-4bcb-86c8-139d23ce04c8" /> | <img width="786" height="286" alt="snap_custom_op_quant_correct" src="https://github.com/user-attachments/assets/475079c2-a565-4f0d-b167-6d801ab83dfc" /> |

## Usage

```sh
$ python -m modelopt.onnx.quantization --onnx_path=$MODEL_PATH.onnx \
    --trt_plugins $PLUGIN_PATH.so \
    --trt_plugins_precision $CUSTOM_OP_NAME:$PRECISION
```

## Testing

### 1. BEVFormer model

- Follow step 1 in [README](https://github.com/NVIDIA/DL4AGX/tree/master/AV-Solutions/bevformer-int8-eq#1-export-model-to-onnx-and-compile-plugins).
- In the quantization step, do:

```sh
$ python -m modelopt.onnx.quantization --onnx_path=/mnt/models/bevformer_tiny_epoch_24_cp2_op13.onnx \
    --trt_plugins=$PLUGIN_PATH \
    --trt_plugins_precision MultiScaleDeformableAttnTRT:[int8,int32,fp32,int8,int8]:[int8] \
    --high_precision_dtype fp16
```

> See table in "Overview" for expected graph structure.

### 2. 5455919 model

Validated model in bug 5455919.

## Before your PR is "*Ready for review*"

<!-- If you haven't finished some of the above items you can still open `Draft` PR. -->

- **Make sure you read and follow [Contributor guidelines](https://github.com/NVIDIA/TensorRT-Model-Optimizer/blob/main/CONTRIBUTING.md)** and your commits are signed.
- **Is this change backward compatible?**: Yes
- **Did you write any new necessary tests?**: No
- **Did you add or update any necessary documentation?**: No
- **Did you update [Changelog](https://github.com/NVIDIA/TensorRT-Model-Optimizer/blob/main/CHANGELOG.rst)?**: Yes

## Additional Information

- /pull/363: Feature expansion.
- /pull/524: The graph cleanup is actually needed after Q/DQ trimming around custom ops. Moved the cleanup lines to inside that function.

---------

Signed-off-by: gcunhase <4861122+gcunhase@users.noreply.github.com>

1 parent e20d218, commit 6abded4
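The `--trt_plugins_precision` value used in the testing command packs an op type, a list of per-input precisions, and a list of per-output precisions into one colon-separated string. The sketch below shows one way such a string could be split apart so that the INT8 and FP32 inputs can be told apart. It is an illustration only: `PluginPrecision`, `parse_plugin_precision`, and `input_indices` are hypothetical names, not functions from `modelopt`, and the assumption that a single `$PRECISION` applies to both inputs and outputs is mine.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PluginPrecision:
    """Hypothetical container for one --trt_plugins_precision entry."""

    op_type: str
    inputs: List[str]   # one precision per input, in input order
    outputs: List[str]  # one precision per output, in output order


def _parse_list(text: str) -> List[str]:
    """Turn '[int8,fp32]' or 'fp16' into a list of lowercase precision names."""
    text = text.strip()
    if text.startswith("[") and text.endswith("]"):
        text = text[1:-1]
    return [item.strip().lower() for item in text.split(",") if item.strip()]


def parse_plugin_precision(spec: str) -> PluginPrecision:
    """Parse 'OP:PRECISION' or 'OP:[in0,in1,...]:[out0,...]' (illustrative only)."""
    parts = spec.split(":")
    if len(parts) == 2:
        # Assumption: a single precision is taken to cover inputs and outputs alike.
        op_type, precision = parts
        values = _parse_list(precision)
        return PluginPrecision(op_type, values, list(values))
    if len(parts) == 3:
        op_type, inputs, outputs = parts
        return PluginPrecision(op_type, _parse_list(inputs), _parse_list(outputs))
    raise ValueError(f"Unrecognized precision spec: {spec!r}")


def input_indices(spec: PluginPrecision, precision: str) -> List[int]:
    """Return the positions of the inputs that were requested in `precision`."""
    return [i for i, p in enumerate(spec.inputs) if p == precision]


if __name__ == "__main__":
    parsed = parse_plugin_precision(
        "MultiScaleDeformableAttnTRT:[int8,int32,fp32,int8,int8]:[int8]"
    )
    print(parsed.op_type)
    print("int8 inputs:", input_indices(parsed, "int8"))
    print("fp32 inputs:", input_indices(parsed, "fp32"))
```

For the BEVFormer spec, this separates inputs 0, 3, and 4 (INT8) from input 2 (FP32), which is the mixed INT8/FP32 case this PR targets.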
7 files changed (+77, -41 lines) under `modelopt/onnx`:
- autocast
- quantization