onnx-array-api 0.3.1 documentation

Difficulty to implement an Array API for ONNX

Implementing the full Array API is not always easy with onnx. Python is not strongly typed and many different types can be used to represent a value. The argument axis can be an integer or a tuple (see min from the Array API for example). On the ONNX side, the axes argument of ReduceMin is a tensor.
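A small sketch of the kind of normalization this mismatch forces: the Array API accepts an integer, a tuple, or None for axis, while an ONNX reduce operator expects its axes as a 1-D int64 tensor. The helper name is hypothetical, not part of onnx-array-api.

```python
import numpy as np


def normalize_axis(axis):
    """Convert an Array API style ``axis`` argument (int, tuple, or
    None) into the 1-D int64 tensor that an ONNX reduce operator such
    as ReduceMin expects. Hypothetical helper for illustration."""
    if axis is None:
        # None means reduce over all axes; ONNX expresses this by
        # omitting the axes input.
        return None
    if isinstance(axis, int):
        axis = (axis,)
    return np.array(axis, dtype=np.int64)
```

Every Array API entry point that takes an axis argument needs a conversion of this kind before an ONNX graph can be built.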

Performance

The Array API must work in eager mode: for every operation, it generates an ONNX graph and executes it with a specific backend. It can be numpy, onnxruntime or any other backend. Generating every graph takes a significant amount of time and must be avoided, so these graphs are cached. But a graph can be reused only when nothing but its inputs - in the ONNX sense - changes; if a parameter changes, a new graph must be generated and cached. Method JitEager.make_key generates a unique key based on the inputs it receives and the signature of the function to call. If the key is the same, a cached ONNX graph can be reused on the second call.
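The distinction the key must capture can be sketched as follows: tensor inputs contribute only their type signature (dtype and rank), while parameters contribute their actual value. This is a simplified illustration in the spirit of JitEager.make_key, not the real implementation.

```python
import numpy as np


def make_key(*args, **kwargs):
    """Simplified sketch of a jit cache key (illustration only).
    Two calls with the same key can share one cached ONNX graph."""
    parts = []
    for a in args:
        if isinstance(a, np.ndarray):
            # Tensors contribute only their type signature: a new
            # value with the same dtype and rank reuses the graph.
            parts.append(("tensor", str(a.dtype), a.ndim))
        else:
            # Parameters contribute their value: a different value
            # means a different graph must be generated.
            parts.append(("param", a))
    for name in sorted(kwargs):
        parts.append((name, kwargs[name]))
    return tuple(parts)


x1 = np.array([1.0, 2.0])
x2 = np.array([5.0, 6.0, 7.0])  # same dtype and rank as x1
```

With this scheme, make_key(x1, axis=0) and make_key(x2, axis=0) coincide, so the second call hits the cache, while changing axis produces a new key and therefore a new graph.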

However, eager mode - one small onnx graph per operation - is not the most efficient one. At the same time, the design must allow merging every needed operation into a bigger graph, since bigger graphs can be more easily optimized by the backend.

Input vs parameter

An input is a tensor or array; a parameter is any other type. Following ONNX semantics, an input is a variable while a parameter is frozen: it cannot be changed, it is a constant. A good design would be to consider any named argument (**kwargs) a parameter and any positional argument (*args) a tensor, but the Array API does not follow that design. The function astype (https://data-apis.org/array-api/2022.12/API_specification/generated/array_api.astype.html) takes two inputs. The operator Cast (https://onnx.ai/onnx/operators/onnx__Cast.html) takes one input and a frozen parameter to. And Python allows astype(x, dtype) as well as astype(x, dtype=dtype) unless the signature enforces one call style over the other. There may be ambiguities from time to time. Besides, from the ONNX point of view, the argument dtype should be named.
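The ambiguity can be shown with a toy wrapper: Python accepts the dtype both positionally and as a keyword, so the framework cannot rely on *args/**kwargs alone to tell inputs from parameters. The wrapper below uses numpy as a stand-in for the backend; in the real framework it would build a Cast node whose frozen attribute to carries the dtype.

```python
import numpy as np


def astype(x, dtype):
    """Array API style signature (illustration only). ``dtype`` may be
    passed positionally or by keyword, yet in ONNX it must become the
    frozen ``to`` attribute of a Cast node, never a runtime input."""
    # numpy stands in for the backend that would execute the Cast node.
    return x.astype(dtype)


x = np.array([1.5, 2.5])
# Both call styles are legal in Python:
a = astype(x, np.int64)
b = astype(x, dtype=np.int64)
```

One way to remove the ambiguity is a keyword-only signature, def astype(x, *, dtype), which forces the parameter to be named, matching the ONNX point of view.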

Tensor type

An EagerTensor must be used to represent any tensor. This class also defines the backend to use: EagerNumpyTensor for numpy, EagerOrtTensor for onnxruntime. Since the Array API is new, existing packages do not fully support the API, if they support it at all (scikit-learn). Some numpy arrays may still be used.
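A minimal sketch of the idea, assuming nothing about the real classes: an eager tensor wraps a backend value and dispatches each operation to that backend. numpy plays the backend role here; an onnxruntime-backed class would instead execute a small ONNX graph per operation.

```python
import numpy as np


class EagerTensorSketch:
    """Illustrative stand-in for an EagerTensor, not the real
    onnx-array-api class. The wrapped value and the dispatch target
    together define the backend."""

    def __init__(self, value):
        self.value = np.asarray(value)

    def __add__(self, other):
        # Unwrap the other operand if needed, run the backend
        # operation, and rewrap the result as an eager tensor.
        other = other.value if isinstance(other, EagerTensorSketch) else other
        return EagerTensorSketch(self.value + other)

    def numpy(self):
        """Materialize the result as a plain numpy array."""
        return self.value
```

Swapping the backend means swapping the class, which is why every tensor entering the API must already be wrapped.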

Inplace

ONNX has no notion of in-place computation. Therefore, something like coefs[:, 1] = 1 is not valid unless some code is written to create another tensor. The current design supports some of these patterns by recording every call to __setitem__. The user sees coefs, but the framework sees that coefs holds a reference to another tensor: that is the one the framework needs to use. However, since __setitem__ is usually used for efficiency, it becomes inefficient with this design and should be avoided. This assumption may hold when the backend relies on CPU but not on GPU. A function such as empty should be avoided as it has to be followed by calls to __setitem__.
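What the framework has to do under the hood for every such assignment can be sketched functionally: instead of mutating, build a whole new tensor. The helper is hypothetical and uses numpy for illustration; the per-call copy is exactly why __setitem__ becomes costly in this design.

```python
import numpy as np


def set_column_to_one(coefs, col):
    """Functional replacement for ``coefs[:, col] = 1`` (hypothetical
    helper). No mutation: a new tensor is created in which the chosen
    column is 1 and every other entry comes from ``coefs``."""
    mask = np.zeros(coefs.shape[1], dtype=bool)
    mask[col] = True
    # mask broadcasts across rows, selecting 1.0 for the target column.
    return np.where(mask, 1.0, coefs)


c = np.zeros((2, 3))
c2 = set_column_to_one(c, 1)  # c itself is left untouched
```

Each assignment allocates and fills a full copy of the tensor, which is cheap enough on CPU for small arrays but defeats the purpose of in-place updates, especially on GPU.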

Eager or compilation

Eager mode is what the Array API implies. Every function is converted into an ONNX graph based on its inputs, without any knowledge of how these inputs were obtained. This graph is then executed before moving on to the next call of a function from the API. Converting a machine learned model into ONNX, however, implies gathering all these operations into a single graph. It means using a mode that records all the function calls in order to compile every tiny onnx graph into a unique graph.
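The recording mode can be sketched as follows: instead of executing each call eagerly, append it to a list of operations and return a symbolic name, so that the whole sequence can later be compiled into one graph. This is an illustrative toy, not the library's actual mechanism.

```python
class Recorder:
    """Sketch of a compilation mode (illustration only): calls are
    recorded, not executed, so the tiny per-operation graphs can be
    merged into a single ONNX graph afterwards."""

    def __init__(self):
        self.ops = []

    def call(self, op_name, *inputs):
        # Record the operation and hand back a symbolic result name
        # that later operations can consume as an input.
        self.ops.append((op_name, inputs))
        return f"out{len(self.ops) - 1}"


rec = Recorder()
y = rec.call("Add", "x", "one")
z = rec.call("Mul", y, "two")
# rec.ops now holds the whole computation, ready to be turned into
# one graph instead of being executed as two separate graphs.
```

The key property is that the second call consumes the symbolic output of the first, so the recorded list already encodes the dataflow of the merged graph.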

Iterators and Reduction

An efficient implementation of functions numpy.any() or numpy.all() returns as soon as the result is known. numpy.all() is false as soon as the first false condition is met; likewise, numpy.any() is true as soon as the first true condition is met. There is no such operator in ONNX (opset <= 20) because it is unlikely to appear in a machine learned model. However, it is heavily used when two results are compared in unit tests. The ONNX implementation is not efficient for that reason, but it only impacts the unit tests.
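The difference can be made concrete with a short-circuiting all: the loop below stops at the first false element, whereas an ONNX reduce-based implementation must always scan the whole tensor. Illustrative sketch only.

```python
import numpy as np


def all_early_exit(values):
    """Short-circuiting ``all`` (illustration only): returns as soon
    as the first false element is met. An ONNX implementation built
    from reduce operators cannot stop early and scans everything."""
    for v in np.asarray(values).ravel():
        if not v:
            return False
    return True
```

For the equality checks done in unit tests, the input is usually small, so the full scan of the ONNX version is an acceptable cost.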

Types

onnx supports more types than numpy does. It is not always easy to deal with bfloat16 or float8 types.
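A quick way to see the gap: ONNX defines element types such as BFLOAT16 and the float8 variants (FLOAT8E4M3FN, FLOAT8E5M2) for which standard numpy exposes no native dtype, so tensors of those types need a custom representation (often a raw uint8/uint16 view).

```python
import numpy as np

# Element types defined by ONNX for which standard numpy has no
# native dtype of the same name.
onnx_only_types = ["BFLOAT16", "FLOAT8E4M3FN", "FLOAT8E5M2"]

for name in onnx_only_types:
    # numpy exposes float16/float32/float64 but none of the above.
    assert not hasattr(np, name.lower())
```

Third-party packages can register such dtypes with numpy, but nothing in numpy itself provides them, which is what makes these types awkward to handle.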
