pyspark.RDD.zip

RDD.zip(other)

Zips this RDD with another one, returning key-value pairs with the first element in each RDD, second element in each RDD, etc. Assumes that the two RDDs have the same number of partitions and the same number of elements in each partition (e.g. one was made through a map on the other).

New in version 1.0.0.

Parameters

other : RDD
    another RDD

Returns

RDD
    an RDD containing the zipped key-value pairs

See also

RDD.zipWithIndex()
RDD.zipWithUniqueId()

Examples

>>> rdd1 = sc.parallelize(range(0, 5))
>>> rdd2 = sc.parallelize(range(1000, 1005))
>>> rdd1.zip(rdd2).collect()
[(0, 1000), (1, 1001), (2, 1002), (3, 1003), (4, 1004)]
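
The partitioning assumption stated in the description can be illustrated without a Spark cluster. The following is a minimal pure-Python sketch that models each RDD as a list of partitions and pairs elements partition by partition. It is not Spark's implementation: the function name `zip_partitions`, the sentinel `_MISSING`, and the error messages are hypothetical and chosen only for illustration.

```python
from itertools import zip_longest

_MISSING = object()  # hypothetical sentinel marking an exhausted partition


def zip_partitions(left, right):
    """Pair two lists of partitions element by element, partition by partition.

    Illustrative model only; the error messages are assumptions, not Spark's.
    """
    if len(left) != len(right):
        raise ValueError("both inputs must have the same number of partitions")
    zipped = []
    for part_a, part_b in zip(left, right):
        pairs = []
        # zip_longest pads the shorter partition, which exposes a length mismatch
        for a, b in zip_longest(part_a, part_b, fillvalue=_MISSING):
            if a is _MISSING or b is _MISSING:
                raise ValueError(
                    "matching partitions must have the same number of elements"
                )
            pairs.append((a, b))
        zipped.append(pairs)
    return zipped


# Two "RDDs" with identical partition layout: 2 partitions of sizes 2 and 3.
left = [[0, 1], [2, 3, 4]]
right = [[1000, 1001], [1002, 1003, 1004]]
result = zip_partitions(left, right)
print(result)
```

In this model, inputs whose partition counts or per-partition sizes differ raise `ValueError`. This is why deriving one RDD from the other with a map, which keeps the partition layout unchanged, is the case the description gives as safe.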
