Grouping

GroupedData.agg(*exprs)

Computes aggregates and returns the result as a DataFrame.
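As a rough illustration of what "compute aggregates per group" means, the sketch below models the behaviour in plain Python, without Spark. The sample rows, the column meanings (`dept`, `salary`) and the helper `group_agg` are hypothetical and exist only for this example; in PySpark the corresponding call would look roughly like `df.groupBy("dept").agg(F.sum("salary"), F.max("salary"), F.avg("salary"))`.

```python
from collections import defaultdict

# Hypothetical sample rows: (dept, salary).
rows = [("eng", 100), ("eng", 140), ("ops", 90)]


def group_agg(rows, aggs):
    """Group rows by their first field and apply each named aggregate
    to the list of second-field values in that group."""
    groups = defaultdict(list)
    for key, value in rows:
        groups[key].append(value)
    return {
        key: {name: fn(values) for name, fn in aggs.items()}
        for key, values in groups.items()
    }


result = group_agg(
    rows,
    {"sum": sum, "max": max, "avg": lambda v: sum(v) / len(v)},
)
print(result)
# {'eng': {'sum': 240, 'max': 140, 'avg': 120.0},
#  'ops': {'sum': 90, 'max': 90, 'avg': 90.0}}
```

Each group yields one output row with one column per aggregate expression, which is the shape `agg` returns as a DataFrame.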

GroupedData.apply(udf)

It is an alias of pyspark.sql.GroupedData.applyInPandas(); however, it takes a pyspark.sql.functions.pandas_udf() whereas pyspark.sql.GroupedData.applyInPandas() takes a Python native function.

GroupedData.applyInArrow(func, schema)

Maps each group of the current DataFrame using an Arrow UDF and returns the result as a DataFrame.

GroupedData.applyInPandas(func, schema)

Maps each group of the current DataFrame using a pandas UDF and returns the result as a DataFrame.
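The split-apply-combine pattern behind this method can be sketched in plain Python, without Spark or pandas. The function receives all rows of one group at once and returns the rows to emit for that group; the per-group outputs are then concatenated. The data and the helpers `apply_per_group` and `subtract_mean` are hypothetical; in PySpark the function would take and return a `pandas.DataFrame`, and the call would look roughly like `df.groupBy("id").applyInPandas(subtract_mean, schema="id long, v double")`.

```python
from collections import defaultdict

# Hypothetical sample rows: (id, v).
rows = [(1, 1.0), (1, 2.0), (2, 3.0), (2, 5.0), (2, 10.0)]


def subtract_mean(group):
    """Per-group function: center the values of one group on its mean."""
    mean = sum(v for _, v in group) / len(group)
    return [(k, v - mean) for k, v in group]


def apply_per_group(rows, func):
    """Split rows by key, call func once per group, concatenate the results."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[0]].append(row)
    out = []
    # Sorted only to make this sketch deterministic; Spark does not
    # promise any particular order of groups in the result.
    for key in sorted(groups):
        out.extend(func(groups[key]))
    return out


centered = apply_per_group(rows, subtract_mean)
print(centered)
# [(1, -0.5), (1, 0.5), (2, -3.0), (2, -1.0), (2, 4.0)]
```

Unlike an aggregate, the per-group function may return any number of rows, which is why the real method requires an explicit output `schema`.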

GroupedData.applyInPandasWithState(func, ...)

Applies the given function to each group of data, while maintaining a user-defined per-group state.

GroupedData.avg(*cols)

Computes average values for each numeric column for each group.

GroupedData.cogroup(other)

Cogroups this group with another group so that we can run cogrouped operations.
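To make "cogrouping" concrete, the plain-Python sketch below pairs up the groups of two datasets that share a key, so that a later function can see both sides together. The data and the helper `cogroup` are hypothetical. The sketch assumes that a key present on only one side is still emitted, with an empty group for the other side.

```python
from collections import defaultdict

# Hypothetical sample rows: (key, value) on each side.
left = [("a", 1), ("a", 2), ("b", 3)]
right = [("a", "x"), ("c", "y")]


def cogroup(left, right):
    """Return {key: (left_values, right_values)} for every key on either side."""
    left_groups, right_groups = defaultdict(list), defaultdict(list)
    for key, value in left:
        left_groups[key].append(value)
    for key, value in right:
        right_groups[key].append(value)
    all_keys = sorted(set(left_groups) | set(right_groups))
    # Assumption: a key missing on one side gets an empty group there.
    return {
        key: (left_groups.get(key, []), right_groups.get(key, []))
        for key in all_keys
    }


pairs = cogroup(left, right)
print(pairs)
# {'a': ([1, 2], ['x']), 'b': ([3], []), 'c': ([], ['y'])}
```

In PySpark the pairing itself is set up with something like `df1.groupBy("id").cogroup(df2.groupBy("id"))`, and the per-pair function is supplied afterwards through the `PandasCogroupedOps` methods.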

GroupedData.count()

Counts the number of records for each group.

GroupedData.max(*cols)

Computes the max value for each numeric column for each group.

GroupedData.mean(*cols)

Computes average values for each numeric column for each group.

GroupedData.min(*cols)

Computes the min value for each numeric column for each group.

GroupedData.pivot(pivot_col[, values])

Pivots a column of the current DataFrame and performs the specified aggregation.
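A pivot turns the distinct values of one column into output columns and aggregates within each (group, pivot value) cell. The plain-Python sketch below models this with a sum; the data and the helper `pivot_sum` are hypothetical. In PySpark the equivalent would look roughly like `df.groupBy("year").pivot("course", ["Java", "dotNET"]).sum("earnings")`.

```python
from collections import defaultdict

# Hypothetical sample rows: (year, course, earnings).
rows = [
    (2012, "Java", 20000),
    (2012, "dotNET", 5000),
    (2012, "dotNET", 10000),
    (2013, "Java", 30000),
]


def pivot_sum(rows, values=None):
    """Sum the third field per (group, pivot value); one output column per value."""
    if values is None:
        # Without an explicit list, the distinct pivot values must be
        # discovered with an extra pass over the data.
        values = sorted({pivot_value for _, pivot_value, _ in rows})
    totals = defaultdict(dict)
    for group, pivot_value, amount in rows:
        cells = totals[group]  # create the group even if no value matches
        if pivot_value in values:
            cells[pivot_value] = cells.get(pivot_value, 0) + amount
    # Cells with no matching rows stay None (null in a DataFrame).
    return {g: {v: totals[g].get(v) for v in values} for g in sorted(totals)}


print(pivot_sum(rows))
# {2012: {'Java': 20000, 'dotNET': 15000}, 2013: {'Java': 30000, 'dotNET': None}}
print(pivot_sum(rows, values=["Java"]))
# {2012: {'Java': 20000}, 2013: {'Java': 30000}}
```

Passing the optional `values` list fixes the output columns up front and spares the extra pass needed to find the distinct pivot values.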

GroupedData.sum(*cols)

Computes the sum for each numeric column for each group.

GroupedData.transformWithStateInPandas(...)

Invokes methods defined in the stateful processor used in arbitrary state API v2.

PandasCogroupedOps.applyInArrow(func, schema)

Applies a function to each cogroup using Arrow and returns the result as a DataFrame.

PandasCogroupedOps.applyInPandas(func, schema)

Applies a function to each cogroup using pandas and returns the result as a DataFrame.

