pyarrow.compute.register_vector_function
- pyarrow.compute.register_vector_function(func, function_name, function_doc, in_types, out_type, func_registry=None)
Register a user-defined vector function.
This API is EXPERIMENTAL.
A vector function is a function that executes vector operations on arrays. Vector functions are often used when the computation doesn’t fit other, more specific types of functions (e.g., scalar and aggregate).
- Parameters:
- func
callable A callable implementing the user-defined function. The first argument is the context argument of type UdfContext. Then, it must take arguments equal to the number of in_types defined. It must return an Array or Scalar matching the out_type. It must return a Scalar if all arguments are scalar, else it must return an Array.
To define a varargs function, pass a callable that takes *args. The last in_type will be the type of all varargs arguments.
- function_name
str Name of the function. There should only be one function registered with this name in the function registry.
- function_doc
dict A dictionary object with keys “summary” (str) and “description” (str).
- in_types
Dict[str, DataType] A dictionary mapping function argument names to their respective DataType. The argument names will be used to generate documentation for the function. The number of arguments specified here determines the function arity.
- out_type
DataType Output type of the function.
- func_registry
FunctionRegistry Optional function registry to use instead of the default global one.
Examples
>>> import pyarrow as pa
>>> import pyarrow.compute as pc
>>>
>>> func_doc = {}
>>> func_doc["summary"] = "flatten list"
>>> func_doc["description"] = "flatten a list array by one level"
>>>
>>> def list_flatten_udf(ctx, x):
...     return pc.list_flatten(x)
>>>
>>> func_name = "list_flatten_udf"
>>> in_types = {"array": pa.list_(pa.int64())}
>>> out_type = pa.int64()
>>> pc.register_vector_function(list_flatten_udf, func_name, func_doc,
...                             in_types, out_type)
>>>
>>> answer = pc.call_function(func_name, [pa.array([[1, 2], [3, 4]])])
>>> answer
<pyarrow.lib.Int64Array object at ...>
[
  1,
  2,
  3,
  4
]

