pyarrow.compute.register_scalar_function

pyarrow.compute.register_scalar_function(func, function_name, function_doc, in_types, out_type, func_registry=None)

Register a user-defined scalar function.

This API is EXPERIMENTAL.

A scalar function is a function that executes elementwise operations on arrays or scalars, i.e. a scalar function must be computed row-by-row with no state, where each output row is computed only from its corresponding input row. In other words, all argument arrays have the same length, and the output array is of the same length as the arguments. Scalar functions are the only functions allowed in query engine expressions.

Parameters:
func : callable

A callable implementing the user-defined function. The first argument is the context argument of type UdfContext. Then, it must take arguments equal to the number of in_types defined. It must return an Array or Scalar matching the out_type. It must return a Scalar if all arguments are scalar, else it must return an Array.

To define a varargs function, pass a callable that takes *args. The last in_type will be the type of all varargs arguments.

function_name : str

Name of the function. There should only be one function registered with this name in the function registry.

function_doc : dict

A dictionary object with keys “summary” (str) and “description” (str).

in_types : Dict[str, DataType]

A dictionary mapping function argument names to their respective DataType. The argument names will be used to generate documentation for the function. The number of arguments specified here determines the function arity.

out_type : DataType

Output type of the function.

func_registry : FunctionRegistry

Optional function registry to use instead of the default global one.

Examples

>>> import pyarrow as pa
>>> import pyarrow.compute as pc
>>>
>>> func_doc = {}
>>> func_doc["summary"] = "simple udf"
>>> func_doc["description"] = "add a constant to a scalar"
>>>
>>> def add_constant(ctx, array):
...     return pc.add(array, 1, memory_pool=ctx.memory_pool)
>>>
>>> func_name = "py_add_func"
>>> in_types = {"array": pa.int64()}
>>> out_type = pa.int64()
>>> pc.register_scalar_function(add_constant, func_name, func_doc,
...                             in_types, out_type)
>>>
>>> func = pc.get_function(func_name)
>>> func.name
'py_add_func'
>>> answer = pc.call_function(func_name, [pa.array([20])])
>>> answer
<pyarrow.lib.Int64Array object at ...>
[
  21
]