
pyspark.RDD.take

RDD.take(num)

Take the first num elements of the RDD.

It works by first scanning one partition, then using the results from that partition to estimate the number of additional partitions needed to satisfy the limit.

Translated from the Scala implementation in RDD#take().
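
The partition-scanning strategy described above can be illustrated without a Spark cluster. The following is a minimal sketch, not the library's actual code: partitions are modeled as plain Python lists, `take_sketch` is a hypothetical name, and the growth constants (an overestimate factor of 1.5 and a cap of 4x the partitions already scanned) are illustrative assumptions.

```python
from itertools import islice


def take_sketch(partitions, num):
    """Simulate an incremental take over a list of partitions (plain lists).

    Hypothetical helper for illustration only; it is not part of PySpark.
    """
    items = []
    total_parts = len(partitions)
    parts_scanned = 0
    while len(items) < num and parts_scanned < total_parts:
        parts_to_try = 1  # the first pass scans a single partition
        if parts_scanned > 0:
            if not items:
                # Nothing found so far: widen the scan aggressively (assumed factor).
                parts_to_try = parts_scanned * 4
            else:
                # Extrapolate from the hit rate so far, overestimating by 50%.
                estimate = int(1.5 * num * parts_scanned / len(items)) - parts_scanned
                parts_to_try = min(max(estimate, 1), parts_scanned * 4)
        left = num - len(items)
        # Slicing past the end of the list is safe, so no explicit bound is needed.
        for part in partitions[parts_scanned:parts_scanned + parts_to_try]:
            items.extend(islice(part, left))  # at most `left` elements per partition
        parts_scanned += parts_to_try
    return items[:num]
```

For example, `take_sketch([[2, 3], [4, 5, 6]], 2)` stops after the first partition, while a request for 10 elements goes on to scan the second partition and returns all five values.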

New in version 0.7.0.

Parameters
num : int

number of elements to take from the start of the RDD

Returns
list

the first num elements (fewer if the RDD contains fewer than num elements)

Notes

This method should only be used if the resulting list is expected to be small, as all the data is loaded into the driver's memory.

Examples

>>> sc.parallelize([2, 3, 4, 5, 6]).cache().take(2)
[2, 3]
>>> sc.parallelize([2, 3, 4, 5, 6]).take(10)
[2, 3, 4, 5, 6]
>>> sc.parallelize(range(100), 100).filter(lambda x: x > 90).take(3)
[91, 92, 93]
