pyarrow.dataset.Expression
- class pyarrow.dataset.Expression
Bases: _Weakrefable
A logical expression to be evaluated against some input.
To create an expression:
- Use the factory function pyarrow.compute.scalar() to create a scalar (not necessary when combined, see example below).
- Use the factory function pyarrow.compute.field() to reference a field (column in table).
- Compare fields and scalars with <, <=, ==, >=, >.
- Combine expressions using Python operators & (logical and), | (logical or) and ~ (logical not). Note: Python keywords and, or and not cannot be used to combine expressions.
- Create expression predicates using Expression methods such as pyarrow.compute.Expression.isin().
Examples
>>> import pyarrow.compute as pc
>>> (pc.field("a") < pc.scalar(3)) | (pc.field("b") > 7)
<pyarrow.compute.Expression ((a < 3) or (b > 7))>
>>> pc.field('a') != 3
<pyarrow.compute.Expression (a != 3)>
>>> pc.field('a').isin([1, 2, 3])
<pyarrow.compute.Expression is_in(a, {value_set=int64:[
  1,
  2,
  3
], null_matching_behavior=MATCH})>
- __init__(*args, **kwargs)
Methods
| Method | Description |
| --- | --- |
| __init__(*args, **kwargs) | |
| cast(self[, type, safe, options]) | Explicitly set or change the expression's data type. |
| equals(self, Expression other) | |
| from_substrait(message) | Deserialize an expression from Substrait. |
| is_nan(self) | Check whether the expression is NaN. |
| is_null(self, bool nan_is_null=False) | Check whether the expression is null. |
| is_valid(self) | Check whether the expression is not-null (valid). |
| isin(self, values) | Check whether the expression is contained in values. |
| to_substrait(self, Schema schema, ...) | Serialize the expression using Substrait. |
- cast(self, type=None, safe=None, options=None)
Explicitly set or change the expression’s data type.
This creates a new expression equivalent to calling the cast compute function on this expression.
- Parameters:
  - type : default None
  - safe : default None
  - options : default None
- Returns:
  - cast : Expression
- equals(self, Expression other)
- Parameters:
  - other : Expression
- Returns:
- static from_substrait(message)
Deserialize an expression from Substrait.
The serialized message must be an ExtendedExpression message that has only a single expression. The name of the expression and the schema the expression was bound to will be ignored. Use pyarrow.substrait.deserialize_expressions if this information is needed or if the message might contain multiple expressions.
- Parameters:
  - message : bytes or Buffer or a protobuf Message
    The Substrait message to deserialize.
- Returns:
  - Expression
    The deserialized expression.
- is_nan(self)
Check whether the expression is NaN.
This creates a new expression equivalent to calling the is_nan compute function on this expression.
- Returns:
  - is_nan : Expression
- is_null(self, bool nan_is_null=False)
Check whether the expression is null.
This creates a new expression equivalent to calling the is_null compute function on this expression.
- Parameters:
  - nan_is_null : bool, default False
- Returns:
  - is_null : Expression
- is_valid(self)
Check whether the expression is not-null (valid).
This creates a new expression equivalent to calling the is_valid compute function on this expression.
- Returns:
  - is_valid : Expression
- isin(self, values)
Check whether the expression is contained in values.
This creates a new expression equivalent to calling the is_in compute function on this expression.
- Parameters:
  - values
- Returns:
  - isin : Expression
    A new expression that, when evaluated, checks whether this expression’s value is contained in values.
- to_substrait(self, Schema schema, bool allow_arrow_extensions=False)
Serialize the expression using Substrait.
The expression will be serialized as an ExtendedExpression message that has a single expression named “expression”.
- Parameters:
  - schema : Schema
    The input schema the expression will be bound to.
  - allow_arrow_extensions : bool, default False
    If False then only functions that are part of the core Substrait function definitions will be allowed. Set this to True to allow pyarrow-specific functions, but the result may not be accepted by other compute libraries.
- Returns:
  - Buffer
    A buffer containing the serialized Protobuf plan.

