
Note

Go to the end to download the full example code.

Optimization with onnxruntime

onnxruntime optimizes the onnx graph by default before running the inference. It modifies, fuses, or adds new operators. Some of them are standard onnx operators, some of them are implemented in onnxruntime (see Supported Operators). This example looks into the differences between the two models.
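The level of optimization applied at load time is chosen through SessionOptions. A minimal sketch, assuming the model file data/small.onnx used below is reachable from the working directory:

from onnxruntime import GraphOptimizationLevel, InferenceSession, SessionOptions

# The optimization level is picked at session creation time.
# ORT_DISABLE_ALL, ORT_ENABLE_BASIC, ORT_ENABLE_EXTENDED, ORT_ENABLE_ALL
# progressively enable more graph rewrites; ORT_ENABLE_ALL is the default.
so = SessionOptions()
so.graph_optimization_level = GraphOptimizationLevel.ORT_ENABLE_ALL
sess = InferenceSession("data/small.onnx", so, providers=["CPUExecutionProvider"])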

Optimize a model with onnxruntime

import os
from pprint import pprint
import numpy
from pandas import DataFrame
import matplotlib.pyplot as plt
from onnx import load
from onnx_array_api.ext_test_case import example_path
from onnx_array_api.plotting.text_plot import onnx_simple_text_plot
from onnx_array_api.validation.diff import text_diff, html_diff
from onnxruntime import GraphOptimizationLevel, InferenceSession, SessionOptions
from onnx_array_api.ext_test_case import measure_time
from onnx_array_api.ort.ort_optimizers import ort_optimized_model

filename = example_path("data/small.onnx")
optimized = filename + ".optimized.onnx"

if not os.path.exists(optimized):
    ort_optimized_model(filename, output=optimized)
print(optimized)
data/small.onnx.optimized.onnx
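ort_optimized_model is a helper from onnx_array_api. It presumably relies on onnxruntime's ability to serialize the graph it optimizes; the sketch below shows that mechanism with the public SessionOptions API only (dump_optimized is a hypothetical name, not the helper's actual implementation):

from onnxruntime import GraphOptimizationLevel, InferenceSession, SessionOptions


def dump_optimized(model_path: str, output_path: str) -> None:
    # Ask onnxruntime to serialize the graph it would actually execute.
    so = SessionOptions()
    so.graph_optimization_level = GraphOptimizationLevel.ORT_ENABLE_ALL
    so.optimized_model_filepath = output_path
    # Creating the session runs the optimization and writes the file.
    InferenceSession(model_path, so, providers=["CPUExecutionProvider"])


dump_optimized(filename, filename + ".optimized.onnx")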

Output comparison

so = SessionOptions()
so.graph_optimization_level = GraphOptimizationLevel.ORT_ENABLE_ALL

img = numpy.random.random((1, 3, 112, 112)).astype(numpy.float32)

sess = InferenceSession(filename, so, providers=["CPUExecutionProvider"])
sess_opt = InferenceSession(optimized, so, providers=["CPUExecutionProvider"])
input_name = sess.get_inputs()[0].name
out = sess.run(None, {input_name: img})[0]
out_opt = sess_opt.run(None, {input_name: img})[0]
if out.shape != out_opt.shape:
    print(f"ERROR shapes are different {out.shape} != {out_opt.shape}")
diff = numpy.abs(out - out_opt).max()
print(f"Differences: {diff}")
Differences: 0.0
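The maximum absolute difference happens to be exactly zero here, but fused kernels can legitimately change floating-point rounding, so a tolerance-based check is the safer pattern. A sketch with numpy.testing; the tolerances 1e-4 and 1e-5 are arbitrary choices, not values taken from this example:

from numpy.testing import assert_allclose

# Raises an AssertionError with a mismatch summary if the outputs differ
# by more than the chosen relative/absolute tolerances.
assert_allclose(out, out_opt, rtol=1e-4, atol=1e-5)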

Difference

Unoptimized model.

with open(filename, "rb") as f:
    model = load(f)
print("first model to text...")
text1 = onnx_simple_text_plot(model, indent=False)
print(text1)
first model to text...
opset: domain='' version=11
input: name='input' type=dtype('float32') shape=['None', 3, 112, 112]
init: name='i0' type=float32 shape=(64,)
init: name='i1' type=float32 shape=(64,)
init: name='i2' type=float32 shape=(64,)
init: name='i3' type=float32 shape=(64,)
init: name='i4' type=float32 shape=(1, 2, 7, 7)
init: name='i5' type=float32 shape=(64, 3, 3, 3)
init: name='i6' type=float32 shape=(64,)
init: name='i7' type=float32 shape=(64, 64, 3, 3)
init: name='i8' type=float32 shape=(64,)
init: name='i9' type=float32 shape=(64, 64, 3, 3)
init: name='i10' type=float32 shape=(64,)
init: name='i11' type=float32 shape=(64, 64, 1, 1)
init: name='i12' type=float32 shape=(64,)
init: name='i13' type=float32 shape=(64, 1, 1)
init: name='i14' type=float32 shape=(64, 1, 1)
Conv(input, i5, i6, dilations=[1,1], group=1, kernel_shape=[3,3], pads=[1,1,1,1], strides=[1,1]) -> r0
PRelu(r0, i13) -> r1
ReduceMean(r1, axes=[1], keepdims=1) -> r2
ReduceMax(r1, axes=[1], keepdims=1) -> r3
Concat(r2, r3, axis=1) -> r4
Conv(r4, i4, dilations=[1,1], group=1, kernel_shape=[7,7], pads=[3,3,3,3], strides=[1,1]) -> r5
Sigmoid(r5) -> r6
Mul(r6, r1) -> r7
BatchNormalization(r7, i0, i1, i2, i3, epsilon=0.00, momentum=0.90) -> r8
Conv(r8, i7, i8, dilations=[1,1], group=1, kernel_shape=[3,3], pads=[1,1,1,1], strides=[1,1]) -> r9
PRelu(r9, i14) -> r10
Conv(r10, i9, i10, dilations=[1,1], group=1, kernel_shape=[3,3], pads=[1,1,1,1], strides=[2,2]) -> r11
Conv(r7, i11, i12, dilations=[1,1], group=1, kernel_shape=[1,1], pads=[0,0,0,0], strides=[2,2]) -> r12
Add(r11, r12) -> onnx::BatchNormalization_1830
output: name='onnx::BatchNormalization_1830' type=dtype('float32') shape=['None', 64, 56, 56]

Optimized model.

with open(optimized, "rb") as f:
    model = load(f)
print("second model to text...")
text2 = onnx_simple_text_plot(model, indent=False)
print(text2)
second model to text...
opset: domain='' version=11
opset: domain='ai.onnx.ml' version=5
opset: domain='ai.onnx.training' version=1
opset: domain='ai.onnx.preview.training' version=1
opset: domain='com.microsoft' version=1
opset: domain='com.microsoft.experimental' version=1
opset: domain='com.microsoft.nchwc' version=1
opset: domain='org.pytorch.aten' version=1
input: name='input' type=dtype('float32') shape=['None', 3, 112, 112]
init: name='i0' type=float32 shape=(64,)
init: name='i1' type=float32 shape=(64,)
init: name='i2' type=float32 shape=(64,)
init: name='i3' type=float32 shape=(64,)
init: name='reorder_token_10' type=float32 shape=(64, 64, 3, 3)
init: name='reorder_token_6' type=float32 shape=(64, 64, 3, 3)
init: name='i6' type=float32 shape=(64,)
init: name='reorder_token_1' type=float32 shape=(8, 2, 7, 7)
init: name='i8' type=float32 shape=(64,)
init: name='reorder' type=float32 shape=(64, 3, 3, 3)
init: name='i10' type=float32 shape=(64,)
init: name='reorder_token_3' type=float32 shape=(64, 64, 1, 1)
init: name='i12' type=float32 shape=(64,)
init: name='i13' type=float32 shape=(64, 1, 1)
init: name='i14' type=float32 shape=(64, 1, 1)
Conv[com.microsoft.nchwc](input, reorder, i6, auto_pad=b'NOTSET', dilations=[1,1], group=1, strides=[1,1], kernel_shape=[3,3], pads=[1,1,1,1]) -> reorder_token_0
ReorderOutput[com.microsoft.nchwc](reorder_token_0, channels_last=0, channels=64) -> r0
PRelu(r0, i13) -> r1
ReduceMax(r1, keepdims=1, axes=[1]) -> r3
ReduceMean(r1, keepdims=1, axes=[1]) -> r2
Concat(r2, r3, axis=1) -> r4
Conv[com.microsoft.nchwc](r4, reorder_token_1, activation=b'Sigmoid', auto_pad=b'NOTSET', dilations=[1,1], group=1, strides=[1,1], kernel_shape=[7,7], pads=[3,3,3,3]) -> reorder_token_2
ReorderOutput[com.microsoft.nchwc](reorder_token_2, channels_last=0, channels=1) -> r6
Mul(r6, r1) -> r7
BatchNormalization(r7, i0, i1, i2, i3, momentum=0.90, epsilon=0.00) -> r8
ReorderInput[com.microsoft.nchwc](r8, channels_last=0) -> reorder_token_7
Conv[com.microsoft.nchwc](reorder_token_7, reorder_token_6, i8, auto_pad=b'NOTSET', dilations=[1,1], group=1, strides=[1,1], kernel_shape=[3,3], pads=[1,1,1,1]) -> reorder_token_9
ReorderOutput[com.microsoft.nchwc](reorder_token_9, channels_last=0, channels=64) -> r9
PRelu(r9, i14) -> r10
ReorderInput[com.microsoft.nchwc](r10, channels_last=0) -> reorder_token_11
ReorderInput[com.microsoft.nchwc](r7, channels_last=0) -> reorder_token_4
Conv[com.microsoft.nchwc](reorder_token_4, reorder_token_3, i12, auto_pad=b'NOTSET', dilations=[1,1], group=1, strides=[2,2], kernel_shape=[1,1], pads=[0,0,0,0]) -> reorder_token_5
Conv[com.microsoft.nchwc](reorder_token_11, reorder_token_10, i10, reorder_token_5, auto_pad=b'NOTSET', dilations=[1,1], group=1, strides=[2,2], kernel_shape=[3,3], pads=[1,1,1,1]) -> reorder_token_13
ReorderOutput[com.microsoft.nchwc](reorder_token_13, channels_last=0, channels=64) -> onnx::BatchNormalization_1830
output: name='onnx::BatchNormalization_1830' type=dtype('float32') shape=['None', 64, 56, 56]
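Most convolutions were rewritten into the com.microsoft.nchwc domain. A short sketch to quantify that by counting operators per domain (opt_model and counts are illustrative names; load comes from onnx as imported above):

from collections import Counter

# Count operators per (domain, op_type) in the optimized model to see how much
# of the graph was rewritten into onnxruntime-specific operators.
with open(optimized, "rb") as f:
    opt_model = load(f)

counts = Counter((n.domain or "ai.onnx", n.op_type) for n in opt_model.graph.node)
for (domain, op_type), count in sorted(counts.items()):
    print(f"{domain:22} {op_type:15} {count}")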

Differences

print("differences...")print(text_diff(text1,text2))
differences...
  opset: domain='' version=11
+ opset: domain='ai.onnx.ml' version=5
+ opset: domain='ai.onnx.training' version=1
+ opset: domain='ai.onnx.preview.training' version=1
+ opset: domain='com.microsoft' version=1
+ opset: domain='com.microsoft.experimental' version=1
+ opset: domain='com.microsoft.nchwc' version=1
+ opset: domain='org.pytorch.aten' version=1
  input: name='input' type=dtype('float32') shape=['None', 3, 112, 112]
  init: name='i0' type=float32 shape=(64,)
  init: name='i1' type=float32 shape=(64,)
  init: name='i2' type=float32 shape=(64,)
  init: name='i3' type=float32 shape=(64,)
- init: name='i4' type=float32 shape=(1, 2, 7, 7)
?             ^^                      ^  ^  ^  ^
+ init: name='reorder_token_10' type=float32 shape=(64, 64, 3, 3)
?             ^^^^^^^^^^^^^^^^                      ^^  ^^  ^  ^
- init: name='i5' type=float32 shape=(64, 3, 3, 3)
?             ^^                          ^
+ init: name='reorder_token_6' type=float32 shape=(64, 64, 3, 3)
?             ^^^^^^^^^^^^^^^                          ^^
  init: name='i6' type=float32 shape=(64,)
- init: name='i7' type=float32 shape=(64, 64, 3, 3)
?             ^^                      ^^  ^^  ^  ^
+ init: name='reorder_token_1' type=float32 shape=(8, 2, 7, 7)
?             ^^^^^^^^^^^^^^^                      ^  ^  ^  ^
  init: name='i8' type=float32 shape=(64,)
- init: name='i9' type=float32 shape=(64, 64, 3, 3)
?             ^^                          ^^
+ init: name='reorder' type=float32 shape=(64, 3, 3, 3)
?             ^^^^^^^                          ^
  init: name='i10' type=float32 shape=(64,)
- init: name='i11' type=float32 shape=(64, 64, 1, 1)
?             ^^^
+ init: name='reorder_token_3' type=float32 shape=(64, 64, 1, 1)
?             ^^^^^^^^^^^^^^^
  init: name='i12' type=float32 shape=(64,)
  init: name='i13' type=float32 shape=(64, 1, 1)
  init: name='i14' type=float32 shape=(64, 1, 1)
- Conv(input, i5, i6, dilations=[1,1], group=1, kernel_shape=[3,3], pads=[1,1,1,1], strides=[1,1]) -> r0
+ Conv[com.microsoft.nchwc](input, reorder, i6, auto_pad=b'NOTSET', dilations=[1,1], group=1, strides=[1,1], kernel_shape=[3,3], pads=[1,1,1,1]) -> reorder_token_0
+ ReorderOutput[com.microsoft.nchwc](reorder_token_0, channels_last=0, channels=64) -> r0
  PRelu(r0, i13) -> r1
+ ReduceMax(r1, keepdims=1, axes=[1]) -> r3
- ReduceMean(r1, axes=[1], keepdims=1) -> r2
?                ----------
+ ReduceMean(r1, keepdims=1, axes=[1]) -> r2
?                          ++++++++++
- ReduceMax(r1, axes=[1], keepdims=1) -> r3
  Concat(r2, r3, axis=1) -> r4
- Conv(r4, i4, dilations=[1,1], group=1, kernel_shape=[7,7], pads=[3,3,3,3], strides=[1,1]) -> r5
- Sigmoid(r5) -> r6
+ Conv[com.microsoft.nchwc](r4, reorder_token_1, activation=b'Sigmoid', auto_pad=b'NOTSET', dilations=[1,1], group=1, strides=[1,1], kernel_shape=[7,7], pads=[3,3,3,3]) -> reorder_token_2
+ ReorderOutput[com.microsoft.nchwc](reorder_token_2, channels_last=0, channels=1) -> r6
  Mul(r6, r1) -> r7
- BatchNormalization(r7, i0, i1, i2, i3, epsilon=0.00, momentum=0.90) -> r8
?                                        --------------
+ BatchNormalization(r7, i0, i1, i2, i3, momentum=0.90, epsilon=0.00) -> r8
?                                                      ++++++++++++++
- Conv(r8, i7, i8, dilations=[1,1], group=1, kernel_shape=[3,3], pads=[1,1,1,1], strides=[1,1]) -> r9
+ ReorderInput[com.microsoft.nchwc](r8, channels_last=0) -> reorder_token_7
+ Conv[com.microsoft.nchwc](reorder_token_7, reorder_token_6, i8, auto_pad=b'NOTSET', dilations=[1,1], group=1, strides=[1,1], kernel_shape=[3,3], pads=[1,1,1,1]) -> reorder_token_9
+ ReorderOutput[com.microsoft.nchwc](reorder_token_9, channels_last=0, channels=64) -> r9
  PRelu(r9, i14) -> r10
- Conv(r10, i9, i10, dilations=[1,1], group=1, kernel_shape=[3,3], pads=[1,1,1,1], strides=[2,2]) -> r11
- Conv(r7, i11, i12, dilations=[1,1], group=1, kernel_shape=[1,1], pads=[0,0,0,0], strides=[2,2]) -> r12
- Add(r11, r12) -> onnx::BatchNormalization_1830
+ ReorderInput[com.microsoft.nchwc](r10, channels_last=0) -> reorder_token_11
+ ReorderInput[com.microsoft.nchwc](r7, channels_last=0) -> reorder_token_4
+ Conv[com.microsoft.nchwc](reorder_token_4, reorder_token_3, i12, auto_pad=b'NOTSET', dilations=[1,1], group=1, strides=[2,2], kernel_shape=[1,1], pads=[0,0,0,0]) -> reorder_token_5
+ Conv[com.microsoft.nchwc](reorder_token_11, reorder_token_10, i10, reorder_token_5, auto_pad=b'NOTSET', dilations=[1,1], group=1, strides=[2,2], kernel_shape=[3,3], pads=[1,1,1,1]) -> reorder_token_13
+ ReorderOutput[com.microsoft.nchwc](reorder_token_13, channels_last=0, channels=64) -> onnx::BatchNormalization_1830
  output: name='onnx::BatchNormalization_1830' type=dtype('float32') shape=['None', 64, 56, 56]

HTML version.

print("html differences...")output=html_diff(text1,text2)withopen("diff_html.html","w",encoding="utf-8")asf:f.write(output)print("done.")
html differences...
done.

Benchmark

img = numpy.random.random((1, 3, 112, 112)).astype(numpy.float32)

t1 = measure_time(lambda: sess.run(None, {input_name: img}), repeat=25, number=25)
t1["name"] = "original"
print("Original model")
pprint(t1)

t2 = measure_time(lambda: sess_opt.run(None, {input_name: img}), repeat=25, number=25)
t2["name"] = "optimized"
print("Optimized")
pprint(t2)
Original model
{'average': np.float64(0.0056790061488049106),
 'context_size': 64,
 'deviation': np.float64(0.0009349826756959479),
 'max_exec': np.float64(0.008692911559919594),
 'min_exec': np.float64(0.004763298119942192),
 'name': 'original',
 'number': 25,
 'repeat': 25,
 'ttime': np.float64(0.14197515372012276)}
Optimized
{'average': np.float64(0.005954185577592579),
 'context_size': 64,
 'deviation': np.float64(0.0014574969062307263),
 'max_exec': np.float64(0.0113777117599966),
 'min_exec': np.float64(0.0048157119199458975),
 'name': 'optimized',
 'number': 25,
 'repeat': 25,
 'ttime': np.float64(0.14885463943981447)}
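Both sessions above were created with ORT_ENABLE_ALL, so onnxruntime re-optimizes the original graph when it is loaded as well, which may explain why the two timings are so close. A possible baseline, sketched below, disables the runtime optimization to time the stored graph as-is (so_none, sess_none and t0 are illustrative names, not part of the original example):

# Hypothetical baseline: run the original file without any runtime rewriting.
so_none = SessionOptions()
so_none.graph_optimization_level = GraphOptimizationLevel.ORT_DISABLE_ALL
sess_none = InferenceSession(filename, so_none, providers=["CPUExecutionProvider"])

t0 = measure_time(lambda: sess_none.run(None, {input_name: img}), repeat=25, number=25)
t0["name"] = "no runtime optimization"
pprint(t0)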

Plots

fig, ax = plt.subplots(1, 1, figsize=(12, 4))
df = DataFrame([t1, t2]).set_index("name")
df
name        average  deviation  min_exec  max_exec  repeat  number     ttime  context_size
original   0.005679   0.000935  0.004763  0.008693      25      25  0.141975            64
optimized  0.005954   0.001457  0.004816  0.011378      25      25  0.148855            64


And the graph is:

ax.bar(df.index, df["average"].values, yerr=df["deviation"].values, capsize=6)
ax.set_title("Measure performance of optimized model\nlower is better")
plt.grid()
fig.savefig("plot_optimization.png")
[figure plot_optimization.png: Measure performance of optimized model, lower is better]

Total running time of the script: (0 minutes 7.619 seconds)

Gallery generated by Sphinx-Gallery
