pyarrow.compute.register_scalar_function
- pyarrow.compute.register_scalar_function(func, function_name, function_doc, in_types, out_type, func_registry=None)
Register a user-defined scalar function.
This API is EXPERIMENTAL.
A scalar function is a function that executes elementwise operations on arrays or scalars, i.e. a scalar function must be computed row-by-row with no state, where each output row is computed only from its corresponding input row. In other words, all argument arrays have the same length, and the output array is of the same length as the arguments. Scalar functions are the only functions allowed in query engine expressions.
- Parameters:
- func
callable() A callable implementing the user-defined function. The first argument is the context argument of type UdfContext. Then, it must take arguments equal to the number of in_types defined. It must return an Array or Scalar matching the out_type. It must return a Scalar if all arguments are scalar, else it must return an Array.
To define a varargs function, pass a callable that takes *args. The last in_type will be the type of all varargs arguments.
- function_name
str Name of the function. There should only be one function registered with this name in the function registry.
- function_doc
dict A dictionary object with keys “summary” (str) and “description” (str).
- in_types
Dict[str, DataType] A dictionary mapping function argument names to their respective DataType. The argument names will be used to generate documentation for the function. The number of arguments specified here determines the function arity.
- out_type
DataType Output type of the function.
- func_registry
FunctionRegistry Optional function registry to use instead of the default global one.
Examples
>>> import pyarrow as pa
>>> import pyarrow.compute as pc
>>>
>>> func_doc = {}
>>> func_doc["summary"] = "simple udf"
>>> func_doc["description"] = "add a constant to a scalar"
>>>
>>> def add_constant(ctx, array):
...     return pc.add(array, 1, memory_pool=ctx.memory_pool)
>>>
>>> func_name = "py_add_func"
>>> in_types = {"array": pa.int64()}
>>> out_type = pa.int64()
>>> pc.register_scalar_function(add_constant, func_name, func_doc,
...                             in_types, out_type)
>>>
>>> func = pc.get_function(func_name)
>>> func.name
'py_add_func'
>>> answer = pc.call_function(func_name, [pa.array([20])])
>>> answer
<pyarrow.lib.Int64Array object at ...>
[
  21
]

