pyspark.RDD.fold

RDD.fold(zeroValue, op)

Aggregate the elements of each partition, and then the results for all the partitions, using a given associative function and a neutral "zero value".

The function op(t1, t2) is allowed to modify t1 and return it as its result value to avoid object allocation; however, it should not modify t2.
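The contract above can be illustrated with a minimal sketch. It uses `functools.reduce` as a local stand-in for the fold, not Spark itself, and `merge_counts` is a hypothetical `op` written for this illustration: it updates the accumulator `t1` in place and only reads from `t2`.

```python
from functools import reduce

def merge_counts(t1, t2):
    """Hypothetical op: add the counts in t2 into t1 in place and return t1."""
    for key, count in t2.items():
        t1[key] = t1.get(key, 0) + count  # mutate the accumulator only
    return t1                             # t2 is read but never modified

elements = [{"a": 1}, {"b": 2}, {"a": 3}]

# A fresh empty dict plays the role of the zero value.
total = reduce(merge_counts, elements, {})
```

Because only the accumulator is mutated, the input elements are left untouched, and no new dictionary is allocated per step.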

This behaves somewhat differently from fold operations implemented for non-distributed collections in functional languages like Scala. This fold operation may be applied to partitions individually, and the per-partition results are then folded into the final result, rather than applying the fold to each element sequentially in some defined ordering. For functions that are not commutative, the result may differ from that of a fold applied to a non-distributed collection.
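A rough way to see the difference is to model the two-stage fold in plain Python. The helper `partitioned_fold` below is a hypothetical model written for this illustration, not Spark code, and the partition layouts are assumptions. Subtraction is used because it is neither associative nor commutative, so it violates the requirements on `op` and makes the effect easy to observe.

```python
from functools import reduce
from operator import sub

def partitioned_fold(partitions, zero_value, op):
    """Hypothetical pure-Python model of a partition-wise fold (not Spark code)."""
    # Fold each partition independently, starting from zero_value every time.
    partials = [reduce(op, part, zero_value) for part in partitions]
    # Fold the per-partition results, again starting from zero_value.
    return reduce(op, partials, zero_value)

data = [1, 2, 3, 4, 5]

# Element-by-element fold, as on a non-distributed collection.
sequential = reduce(sub, data, 0)

# The same data modeled as two partitions.
two_partitions = partitioned_fold([[1, 2], [3, 4, 5]], 0, sub)
```

In this model the sequential fold yields -15 while the partitioned fold yields 15, so an `op` that does not meet the stated requirements cannot be relied on to match a local fold.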

New in version 0.7.0.

Parameters
zeroValue : T

the initial value for the accumulated result of each partition

op : function

a function used to both accumulate results within a partition and combine results from different partitions
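Since `zeroValue` seeds the accumulator of every partition, a value that is not neutral for `op` is counted more than once. The sketch below models this in plain Python; `fold_model` is a hypothetical helper, not Spark code, and it assumes the final merge of partition results also starts from the zero value.

```python
from functools import reduce
from operator import add

def fold_model(partitions, zero_value, op):
    """Hypothetical model: zero_value seeds each partition and the final merge."""
    partials = [reduce(op, part, zero_value) for part in partitions]
    return reduce(op, partials, zero_value)

# 0 is neutral for addition, so the partition layout does not matter.
neutral = fold_model([[1, 2], [3, 4, 5]], 0, add)

# 1 is not neutral: it is added once per partition plus once in the merge.
skewed_two = fold_model([[1, 2], [3, 4, 5]], 1, add)
skewed_three = fold_model([[1], [2, 3], [4, 5]], 1, add)
```

Under these assumptions the neutral case gives 15, while the non-neutral zero value gives 18 with two partitions and 19 with three, so the result depends on the number of partitions.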

Returns
T

the aggregated result

Examples

>>> from operator import add
>>> sc.parallelize([1, 2, 3, 4, 5]).fold(0, add)
15
