
**Note:** Go to the end to download the full example code.
onnxruntime optimizes the onnx graph by default before running the inference. It modifies, fuses, or adds new operators. Some of them are standard onnx operators, some of them are implemented in onnxruntime (see Supported Operators). This example looks into the differences between the two models.
```python
import os
from pprint import pprint
import numpy
from pandas import DataFrame
import matplotlib.pyplot as plt
from onnx import load
from onnx_array_api.ext_test_case import example_path
from onnx_array_api.plotting.text_plot import onnx_simple_text_plot
from onnx_array_api.validation.diff import text_diff, html_diff
from onnxruntime import GraphOptimizationLevel, InferenceSession, SessionOptions
from onnx_array_api.ext_test_case import measure_time
from onnx_array_api.ort.ort_optimizers import ort_optimized_model

filename = example_path("data/small.onnx")
optimized = filename + ".optimized.onnx"
if not os.path.exists(optimized):
    ort_optimized_model(filename, output=optimized)
print(optimized)
```
```
data/small.onnx.optimized.onnx
```
```python
so = SessionOptions()
so.graph_optimization_level = GraphOptimizationLevel.ORT_ENABLE_ALL

img = numpy.random.random((1, 3, 112, 112)).astype(numpy.float32)

sess = InferenceSession(filename, so, providers=["CPUExecutionProvider"])
sess_opt = InferenceSession(optimized, so, providers=["CPUExecutionProvider"])
input_name = sess.get_inputs()[0].name
out = sess.run(None, {input_name: img})[0]
out_opt = sess_opt.run(None, {input_name: img})[0]

if out.shape != out_opt.shape:
    print(f"ERROR shapes are different {out.shape} != {out_opt.shape}")
diff = numpy.abs(out - out_opt).max()
print(f"Differences: {diff}")
```
```
Differences: 0.0
```
Unoptimized model.
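The snippet that produced the listing below is not part of the extracted text. A minimal sketch, assuming `onnx_simple_text_plot` renders a loaded `ModelProto` as text (the `indent` argument is an assumption), could be:

```python
# Hypothetical reconstruction: load the original model and render it as text.
with open(filename, "rb") as f:
    model = load(f)
print("first model to text...")
text1 = onnx_simple_text_plot(model, indent=False)  # indent=False is an assumption
print(text1)
```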
```
first model to text...
opset: domain='' version=11
input: name='input' type=dtype('float32') shape=['None', 3, 112, 112]
init: name='i0' type=float32 shape=(64,)
init: name='i1' type=float32 shape=(64,)
init: name='i2' type=float32 shape=(64,)
init: name='i3' type=float32 shape=(64,)
init: name='i4' type=float32 shape=(1, 2, 7, 7)
init: name='i5' type=float32 shape=(64, 3, 3, 3)
init: name='i6' type=float32 shape=(64,)
init: name='i7' type=float32 shape=(64, 64, 3, 3)
init: name='i8' type=float32 shape=(64,)
init: name='i9' type=float32 shape=(64, 64, 3, 3)
init: name='i10' type=float32 shape=(64,)
init: name='i11' type=float32 shape=(64, 64, 1, 1)
init: name='i12' type=float32 shape=(64,)
init: name='i13' type=float32 shape=(64, 1, 1)
init: name='i14' type=float32 shape=(64, 1, 1)
Conv(input, i5, i6, dilations=[1,1], group=1, kernel_shape=[3,3], pads=[1,1,1,1], strides=[1,1]) -> r0
PRelu(r0, i13) -> r1
ReduceMean(r1, axes=[1], keepdims=1) -> r2
ReduceMax(r1, axes=[1], keepdims=1) -> r3
Concat(r2, r3, axis=1) -> r4
Conv(r4, i4, dilations=[1,1], group=1, kernel_shape=[7,7], pads=[3,3,3,3], strides=[1,1]) -> r5
Sigmoid(r5) -> r6
Mul(r6, r1) -> r7
BatchNormalization(r7, i0, i1, i2, i3, epsilon=0.00, momentum=0.90) -> r8
Conv(r8, i7, i8, dilations=[1,1], group=1, kernel_shape=[3,3], pads=[1,1,1,1], strides=[1,1]) -> r9
PRelu(r9, i14) -> r10
Conv(r10, i9, i10, dilations=[1,1], group=1, kernel_shape=[3,3], pads=[1,1,1,1], strides=[2,2]) -> r11
Conv(r7, i11, i12, dilations=[1,1], group=1, kernel_shape=[1,1], pads=[0,0,0,0], strides=[2,2]) -> r12
Add(r11, r12) -> onnx::BatchNormalization_1830
output: name='onnx::BatchNormalization_1830' type=dtype('float32') shape=['None', 64, 56, 56]
```

Optimized model.
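Again, the code producing the next listing was lost in extraction; a sketch under the same assumptions would render the optimized file the same way:

```python
# Hypothetical reconstruction: same rendering for the optimized model.
with open(optimized, "rb") as f:
    model_optimized = load(f)
print("second model to text...")
text2 = onnx_simple_text_plot(model_optimized, indent=False)  # indent=False is an assumption
print(text2)
```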
```
second model to text...
opset: domain='' version=11
opset: domain='ai.onnx.ml' version=5
opset: domain='ai.onnx.training' version=1
opset: domain='ai.onnx.preview.training' version=1
opset: domain='com.microsoft' version=1
opset: domain='com.microsoft.experimental' version=1
opset: domain='com.microsoft.nchwc' version=1
opset: domain='org.pytorch.aten' version=1
input: name='input' type=dtype('float32') shape=['None', 3, 112, 112]
init: name='i0' type=float32 shape=(64,)
init: name='i1' type=float32 shape=(64,)
init: name='i2' type=float32 shape=(64,)
init: name='i3' type=float32 shape=(64,)
init: name='reorder_token_10' type=float32 shape=(64, 64, 3, 3)
init: name='reorder_token_6' type=float32 shape=(64, 64, 3, 3)
init: name='i6' type=float32 shape=(64,)
init: name='reorder_token_1' type=float32 shape=(8, 2, 7, 7)
init: name='i8' type=float32 shape=(64,)
init: name='reorder' type=float32 shape=(64, 3, 3, 3)
init: name='i10' type=float32 shape=(64,)
init: name='reorder_token_3' type=float32 shape=(64, 64, 1, 1)
init: name='i12' type=float32 shape=(64,)
init: name='i13' type=float32 shape=(64, 1, 1)
init: name='i14' type=float32 shape=(64, 1, 1)
Conv[com.microsoft.nchwc](input, reorder, i6, auto_pad=b'NOTSET', dilations=[1,1], group=1, strides=[1,1], kernel_shape=[3,3], pads=[1,1,1,1]) -> reorder_token_0
ReorderOutput[com.microsoft.nchwc](reorder_token_0, channels_last=0, channels=64) -> r0
PRelu(r0, i13) -> r1
ReduceMax(r1, keepdims=1, axes=[1]) -> r3
ReduceMean(r1, keepdims=1, axes=[1]) -> r2
Concat(r2, r3, axis=1) -> r4
Conv[com.microsoft.nchwc](r4, reorder_token_1, activation=b'Sigmoid', auto_pad=b'NOTSET', dilations=[1,1], group=1, strides=[1,1], kernel_shape=[7,7], pads=[3,3,3,3]) -> reorder_token_2
ReorderOutput[com.microsoft.nchwc](reorder_token_2, channels_last=0, channels=1) -> r6
Mul(r6, r1) -> r7
BatchNormalization(r7, i0, i1, i2, i3, momentum=0.90, epsilon=0.00) -> r8
ReorderInput[com.microsoft.nchwc](r8, channels_last=0) -> reorder_token_7
Conv[com.microsoft.nchwc](reorder_token_7, reorder_token_6, i8, auto_pad=b'NOTSET', dilations=[1,1], group=1, strides=[1,1], kernel_shape=[3,3], pads=[1,1,1,1]) -> reorder_token_9
ReorderOutput[com.microsoft.nchwc](reorder_token_9, channels_last=0, channels=64) -> r9
PRelu(r9, i14) -> r10
ReorderInput[com.microsoft.nchwc](r10, channels_last=0) -> reorder_token_11
ReorderInput[com.microsoft.nchwc](r7, channels_last=0) -> reorder_token_4
Conv[com.microsoft.nchwc](reorder_token_4, reorder_token_3, i12, auto_pad=b'NOTSET', dilations=[1,1], group=1, strides=[2,2], kernel_shape=[1,1], pads=[0,0,0,0]) -> reorder_token_5
Conv[com.microsoft.nchwc](reorder_token_11, reorder_token_10, i10, reorder_token_5, auto_pad=b'NOTSET', dilations=[1,1], group=1, strides=[2,2], kernel_shape=[3,3], pads=[1,1,1,1]) -> reorder_token_13
ReorderOutput[com.microsoft.nchwc](reorder_token_13, channels_last=0, channels=64) -> onnx::BatchNormalization_1830
output: name='onnx::BatchNormalization_1830' type=dtype('float32') shape=['None', 64, 56, 56]
```

Differences
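The code computing the textual diff is not in the extracted page either. A minimal sketch, assuming `text_diff` from `onnx_array_api.validation.diff` accepts the two text renderings, could be:

```python
# Hypothetical reconstruction: line-by-line diff of the two text dumps.
print("differences...")
print(text_diff(text1, text2))  # assumes text_diff accepts the two strings
```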
```
differences...
  opset: domain='' version=11
+ opset: domain='ai.onnx.ml' version=5
+ opset: domain='ai.onnx.training' version=1
+ opset: domain='ai.onnx.preview.training' version=1
+ opset: domain='com.microsoft' version=1
+ opset: domain='com.microsoft.experimental' version=1
+ opset: domain='com.microsoft.nchwc' version=1
+ opset: domain='org.pytorch.aten' version=1
  input: name='input' type=dtype('float32') shape=['None', 3, 112, 112]
  init: name='i0' type=float32 shape=(64,)
  init: name='i1' type=float32 shape=(64,)
  init: name='i2' type=float32 shape=(64,)
  init: name='i3' type=float32 shape=(64,)
- init: name='i4' type=float32 shape=(1, 2, 7, 7)
+ init: name='reorder_token_10' type=float32 shape=(64, 64, 3, 3)
- init: name='i5' type=float32 shape=(64, 3, 3, 3)
+ init: name='reorder_token_6' type=float32 shape=(64, 64, 3, 3)
  init: name='i6' type=float32 shape=(64,)
- init: name='i7' type=float32 shape=(64, 64, 3, 3)
+ init: name='reorder_token_1' type=float32 shape=(8, 2, 7, 7)
  init: name='i8' type=float32 shape=(64,)
- init: name='i9' type=float32 shape=(64, 64, 3, 3)
+ init: name='reorder' type=float32 shape=(64, 3, 3, 3)
  init: name='i10' type=float32 shape=(64,)
- init: name='i11' type=float32 shape=(64, 64, 1, 1)
+ init: name='reorder_token_3' type=float32 shape=(64, 64, 1, 1)
  init: name='i12' type=float32 shape=(64,)
  init: name='i13' type=float32 shape=(64, 1, 1)
  init: name='i14' type=float32 shape=(64, 1, 1)
- Conv(input, i5, i6, dilations=[1,1], group=1, kernel_shape=[3,3], pads=[1,1,1,1], strides=[1,1]) -> r0
+ Conv[com.microsoft.nchwc](input, reorder, i6, auto_pad=b'NOTSET', dilations=[1,1], group=1, strides=[1,1], kernel_shape=[3,3], pads=[1,1,1,1]) -> reorder_token_0
+ ReorderOutput[com.microsoft.nchwc](reorder_token_0, channels_last=0, channels=64) -> r0
  PRelu(r0, i13) -> r1
+ ReduceMax(r1, keepdims=1, axes=[1]) -> r3
- ReduceMean(r1, axes=[1], keepdims=1) -> r2
+ ReduceMean(r1, keepdims=1, axes=[1]) -> r2
- ReduceMax(r1, axes=[1], keepdims=1) -> r3
  Concat(r2, r3, axis=1) -> r4
- Conv(r4, i4, dilations=[1,1], group=1, kernel_shape=[7,7], pads=[3,3,3,3], strides=[1,1]) -> r5
- Sigmoid(r5) -> r6
+ Conv[com.microsoft.nchwc](r4, reorder_token_1, activation=b'Sigmoid', auto_pad=b'NOTSET', dilations=[1,1], group=1, strides=[1,1], kernel_shape=[7,7], pads=[3,3,3,3]) -> reorder_token_2
+ ReorderOutput[com.microsoft.nchwc](reorder_token_2, channels_last=0, channels=1) -> r6
  Mul(r6, r1) -> r7
- BatchNormalization(r7, i0, i1, i2, i3, epsilon=0.00, momentum=0.90) -> r8
+ BatchNormalization(r7, i0, i1, i2, i3, momentum=0.90, epsilon=0.00) -> r8
- Conv(r8, i7, i8, dilations=[1,1], group=1, kernel_shape=[3,3], pads=[1,1,1,1], strides=[1,1]) -> r9
+ ReorderInput[com.microsoft.nchwc](r8, channels_last=0) -> reorder_token_7
+ Conv[com.microsoft.nchwc](reorder_token_7, reorder_token_6, i8, auto_pad=b'NOTSET', dilations=[1,1], group=1, strides=[1,1], kernel_shape=[3,3], pads=[1,1,1,1]) -> reorder_token_9
+ ReorderOutput[com.microsoft.nchwc](reorder_token_9, channels_last=0, channels=64) -> r9
  PRelu(r9, i14) -> r10
- Conv(r10, i9, i10, dilations=[1,1], group=1, kernel_shape=[3,3], pads=[1,1,1,1], strides=[2,2]) -> r11
- Conv(r7, i11, i12, dilations=[1,1], group=1, kernel_shape=[1,1], pads=[0,0,0,0], strides=[2,2]) -> r12
- Add(r11, r12) -> onnx::BatchNormalization_1830
+ ReorderInput[com.microsoft.nchwc](r10, channels_last=0) -> reorder_token_11
+ ReorderInput[com.microsoft.nchwc](r7, channels_last=0) -> reorder_token_4
+ Conv[com.microsoft.nchwc](reorder_token_4, reorder_token_3, i12, auto_pad=b'NOTSET', dilations=[1,1], group=1, strides=[2,2], kernel_shape=[1,1], pads=[0,0,0,0]) -> reorder_token_5
+ Conv[com.microsoft.nchwc](reorder_token_11, reorder_token_10, i10, reorder_token_5, auto_pad=b'NOTSET', dilations=[1,1], group=1, strides=[2,2], kernel_shape=[3,3], pads=[1,1,1,1]) -> reorder_token_13
+ ReorderOutput[com.microsoft.nchwc](reorder_token_13, channels_last=0, channels=64) -> onnx::BatchNormalization_1830
  output: name='onnx::BatchNormalization_1830' type=dtype('float32') shape=['None', 64, 56, 56]
```

HTML version.
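An HTML rendering of the same comparison can be produced with `html_diff`. The original snippet is missing from the extracted page; the sketch below assumes `html_diff` returns an HTML string and uses a hypothetical output file name:

```python
# Hypothetical reconstruction: write the HTML diff to a file.
print("html differences...")
output = html_diff(text1, text2)  # assumes html_diff returns an HTML string
with open("diff_html.html", "w", encoding="utf-8") as f:  # hypothetical file name
    f.write(output)
print("done.")
```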
```
html differences...
done.
```
```python
img = numpy.random.random((1, 3, 112, 112)).astype(numpy.float32)

t1 = measure_time(lambda: sess.run(None, {input_name: img}), repeat=25, number=25)
t1["name"] = "original"
print("Original model")
pprint(t1)

t2 = measure_time(lambda: sess_opt.run(None, {input_name: img}), repeat=25, number=25)
t2["name"] = "optimized"
print("Optimized")
pprint(t2)
```
```
Original model
{'average': np.float64(0.0056790061488049106),
 'context_size': 64,
 'deviation': np.float64(0.0009349826756959479),
 'max_exec': np.float64(0.008692911559919594),
 'min_exec': np.float64(0.004763298119942192),
 'name': 'original',
 'number': 25,
 'repeat': 25,
 'ttime': np.float64(0.14197515372012276)}
Optimized
{'average': np.float64(0.005954185577592579),
 'context_size': 64,
 'deviation': np.float64(0.0014574969062307263),
 'max_exec': np.float64(0.0113777117599966),
 'min_exec': np.float64(0.0048157119199458975),
 'name': 'optimized',
 'number': 25,
 'repeat': 25,
 'ttime': np.float64(0.14885463943981447)}
```
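The two timing dictionaries can be gathered into a single table. The original snippet is not in the extracted text; a minimal sketch producing a table like the one below could be:

```python
# Sketch: collect both measurements into a DataFrame indexed by name.
df = DataFrame([t1, t2]).set_index("name")
print(df)
```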
| name | average | deviation | min_exec | max_exec | repeat | number | ttime | context_size |
|---|---|---|---|---|---|---|---|---|
| original | 0.005679 | 0.000935 | 0.004763 | 0.008693 | 25 | 25 | 0.141975 | 64 |
| optimized | 0.005954 | 0.001457 | 0.004816 | 0.011378 | 25 | 25 | 0.148855 | 64 |
And the graph is:
```python
# The figure and axes setup was not part of the extracted snippet; a plain subplot is assumed.
fig, ax = plt.subplots(1, 1)
ax.bar(df.index, df["average"].values, yerr=df["deviation"].values, capsize=6)
ax.set_title("Measure performance of optimized model\nlower is better")
plt.grid()
fig.savefig("plot_optimization.png")
```

Total running time of the script: (0 minutes 7.619 seconds)