pyspark.RDD.takeOrdered
- RDD.takeOrdered(num, key=None)
Get the first N elements from an RDD, ordered in ascending order or as specified by the optional key function.
New in version 1.0.0.
- Parameters
- num : int
the number of elements (N) to return
- key : function, optional
a function that computes the comparison key for each element
- Returns
- list
the first N elements in ascending order, or in the order defined by key
Notes
This method should only be used if the resulting array is expected to be small, as all the data is loaded into the driver’s memory.
Examples
>>> sc.parallelize([10, 1, 2, 9, 3, 4, 5, 6, 7]).takeOrdered(6)
[1, 2, 3, 4, 5, 6]
>>> sc.parallelize([10, 1, 2, 9, 3, 4, 5, 6, 7], 2).takeOrdered(6, key=lambda x: -x)
[10, 9, 7, 6, 5, 4]
>>> sc.emptyRDD().takeOrdered(3)
[]
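The ordering semantics can also be pictured without a Spark cluster. The following is a minimal pure-Python sketch, not Spark's actual implementation: it assumes the data is already split into in-memory lists standing in for partitions, and the helper name take_ordered_local is hypothetical. Each partition keeps only its own N smallest elements, and the candidate lists are then merged and trimmed to N.

```python
import heapq
from functools import reduce


def take_ordered_local(partitions, num, key=None):
    """Hypothetical helper: emulate takeOrdered over in-memory 'partitions'.

    This is an illustration of the semantics only, not PySpark code.
    """
    if num < 0:
        raise ValueError("num must be non-negative")
    if num == 0 or not partitions:
        return []
    # Step 1: each partition keeps only its own `num` smallest elements.
    per_partition = [heapq.nsmallest(num, part, key=key) for part in partitions]
    # Step 2: merge candidate lists pairwise, trimming to `num` after each merge.
    return reduce(
        lambda a, b: heapq.nsmallest(num, a + b, key=key),
        per_partition,
    )


# Two lists play the role of two partitions of the example RDD.
data = [[10, 1, 2, 9], [3, 4, 5, 6, 7]]
smallest = take_ordered_local(data, 6)                    # ascending order
largest = take_ordered_local(data, 6, key=lambda x: -x)   # key reverses the order
```

With the data above, smallest matches the first doctest result and largest matches the second. Because only N candidates per partition survive the first step, the merged result stays small, which is consistent with the note that the returned list should fit in the driver's memory.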