Grouping

`GroupedData.agg`(*exprs)	Computes aggregates and returns the result as a `DataFrame`.
`GroupedData.apply`(udf)	An alias of `pyspark.sql.GroupedData.applyInPandas()`; however, it takes a `pyspark.sql.functions.pandas_udf()`, whereas `pyspark.sql.GroupedData.applyInPandas()` takes a native Python function.
`GroupedData.applyInArrow`(func, schema)	Maps each group of the current `DataFrame` using an Arrow UDF and returns the result as a `DataFrame`.
`GroupedData.applyInPandas`(func, schema)	Maps each group of the current `DataFrame` using a pandas UDF and returns the result as a `DataFrame`.
`GroupedData.applyInPandasWithState`(func, ...)	Applies the given function to each group of data, while maintaining a user-defined per-group state.
`GroupedData.avg`(*cols)	Computes the average value of each numeric column for each group.
`GroupedData.cogroup`(other)	Cogroups this group with another group so that we can run cogrouped operations.
`GroupedData.count`()	Counts the number of records for each group.
`GroupedData.max`(*cols)	Computes the max value for each numeric column for each group.
`GroupedData.mean`(*cols)	Computes the average value of each numeric column for each group.
`GroupedData.min`(*cols)	Computes the min value for each numeric column for each group.
`GroupedData.pivot`(pivot_col[, values])	Pivots a column of the current `DataFrame` and performs the specified aggregation.
`GroupedData.sum`(*cols)	Computes the sum for each numeric column for each group.
`GroupedData.transformWithStateInPandas`(...)	Invokes methods defined in the stateful processor used in arbitrary state API v2.
`PandasCogroupedOps.applyInArrow`(func, schema)	Applies a function to each cogroup using Arrow and returns the result as a `DataFrame`.
`PandasCogroupedOps.applyInPandas`(func, schema)	Applies a function to each cogroup using pandas and returns the result as a `DataFrame`.
