pyarrow.dataset.Expression

class pyarrow.dataset.Expression

Bases: _Weakrefable

A logical expression to be evaluated against some input.

To create an expression:

  • Use the factory function pyarrow.compute.scalar() to create a scalar (not necessary when combined, see example below).

  • Use the factory function pyarrow.compute.field() to reference a field (column in table).

  • Compare fields and scalars with <, <=, ==, !=, >=, >.

  • Combine expressions using Python operators & (logical and), | (logical or) and ~ (logical not). Note: Python keywords and, or and not cannot be used to combine expressions.

  • Create expression predicates using Expression methods such as pyarrow.compute.Expression.isin().

Examples

>>> import pyarrow.compute as pc
>>> (pc.field("a") < pc.scalar(3)) | (pc.field("b") > 7)
<pyarrow.compute.Expression ((a < 3) or (b > 7))>
>>> pc.field('a') != 3
<pyarrow.compute.Expression (a != 3)>
>>> pc.field('a').isin([1, 2, 3])
<pyarrow.compute.Expression is_in(a, {value_set=int64:[
  1,
  2,
  3
], null_matching_behavior=MATCH})>
__init__(*args, **kwargs)

Methods

__init__(*args, **kwargs)

cast(self[, type, safe, options])

Explicitly set or change the expression's data type.

equals(self, Expression other)

from_substrait(message)

Deserialize an expression from Substrait

is_nan(self)

Check whether the expression is NaN.

is_null(self, bool nan_is_null=False)

Check whether the expression is null.

is_valid(self)

Check whether the expression is not-null (valid).

isin(self, values)

Check whether the expression is contained in values.

to_substrait(self, Schema schema, ...)

Serialize the expression using Substrait

cast(self, type=None, safe=None, options=None)

Explicitly set or change the expression’s data type.

This creates a new expression equivalent to calling the cast compute function on this expression.

Parameters:
type : DataType, default None

Type to cast array to.

safe : bool, default True

Whether to check for conversion errors such as overflow.

options : CastOptions, default None

Additional checks passed via CastOptions.

Returns:
cast : Expression
equals(self, Expression other)
Parameters:
other : pyarrow.dataset.Expression
Returns:
bool
static from_substrait(message)

Deserialize an expression from Substrait.

The serialized message must be an ExtendedExpression message that has only a single expression. The name of the expression and the schema the expression was bound to will be ignored. Use pyarrow.substrait.deserialize_expressions if this information is needed or if the message might contain multiple expressions.

Parameters:
message : bytes or Buffer or a protobuf Message

The Substrait message to deserialize

Returns:
Expression

The deserialized expression

is_nan(self)

Check whether the expression is NaN.

This creates a new expression equivalent to calling the is_nan compute function on this expression.

Returns:
is_nan : Expression
is_null(self, bool nan_is_null=False)

Check whether the expression is null.

This creates a new expression equivalent to calling the is_null compute function on this expression.

Parameters:
nan_is_null : bool, default False

Whether floating-point NaNs are considered null.

Returns:
is_null : Expression
is_valid(self)

Check whether the expression is not-null (valid).

This creates a new expression equivalent to calling the is_valid compute function on this expression.

Returns:
is_valid : Expression
isin(self, values)

Check whether the expression is contained in values.

This creates a new expression equivalent to calling the is_in compute function on this expression.

Parameters:
values : Array or iterable

The values to check for.

Returns:
isin : Expression

A new expression that, when evaluated, checks whether this expression’s value is contained in values.

to_substrait(self, Schema schema, bool allow_arrow_extensions=False)

Serialize the expression using Substrait.

The expression will be serialized as an ExtendedExpression message that has a single expression named “expression”.

Parameters:
schema : Schema

The input schema the expression will be bound to

allow_arrow_extensions : bool, default False

If False then only functions that are part of the core Substrait function definitions will be allowed. Set this to True to allow pyarrow-specific functions, but the result may not be accepted by other compute libraries.

Returns:
Buffer

A buffer containing the serialized Protobuf plan.