Movatterモバイル変換


[0]ホーム

URL:


Skip to main content
Ctrl+K

pyspark.RDD.aggregate#

RDD.aggregate(zeroValue,seqOp,combOp)[source]#

Aggregate the elements of each partition, and then the results for allthe partitions, using a given combine functions and a neutral “zerovalue.”

The functionsop(t1,t2) is allowed to modifyt1 and return itas its result value to avoid object allocation; however, it should notmodifyt2.

The first function (seqOp) can return a different result type, U, thanthe type of this RDD. Thus, we need one operation for merging a T intoan U and one operation for merging two U

New in version 1.1.0.

Parameters
zeroValueU

the initial value for the accumulated result of each partition

seqOpfunction

a function used to accumulate results within a partition

combOpfunction

an associative function used to combine results from different partitions

Returns
U

the aggregated result

Examples

>>>seqOp=(lambdax,y:(x[0]+y,x[1]+1))>>>combOp=(lambdax,y:(x[0]+y[0],x[1]+y[1]))>>>sc.parallelize([1,2,3,4]).aggregate((0,0),seqOp,combOp)(10, 4)>>>sc.parallelize([]).aggregate((0,0),seqOp,combOp)(0, 0)

[8]ページ先頭

©2009-2025 Movatter.jp