
pyspark.RDD.sortByKey

RDD.sortByKey(ascending=True, numPartitions=None, keyfunc=<function RDD.<lambda>>)[source]

Sorts this RDD, which is assumed to consist of (key, value) pairs.

New in version 0.9.1.

Parameters

ascending : bool, optional, default True
    sort the keys in ascending or descending order

numPartitions : int, optional
    the number of partitions in the new RDD

keyfunc : function, optional, default identity mapping
    a function to compute the key

Returns

RDD
    a new RDD

Examples

>>> tmp = [('a', 1), ('b', 2), ('1', 3), ('d', 4), ('2', 5)]
>>> sc.parallelize(tmp).sortByKey().first()
('1', 3)
>>> sc.parallelize(tmp).sortByKey(True, 1).collect()
[('1', 3), ('2', 5), ('a', 1), ('b', 2), ('d', 4)]
>>> sc.parallelize(tmp).sortByKey(True, 2).collect()
[('1', 3), ('2', 5), ('a', 1), ('b', 2), ('d', 4)]
>>> tmp2 = [('Mary', 1), ('had', 2), ('a', 3), ('little', 4), ('lamb', 5)]
>>> tmp2.extend([('whose', 6), ('fleece', 7), ('was', 8), ('white', 9)])
>>> sc.parallelize(tmp2).sortByKey(True, 3, keyfunc=lambda k: k.lower()).collect()
[('a', 3), ('fleece', 7), ('had', 2), ('lamb', 5),...('white', 9), ('whose', 6)]
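A short additional sketch (not part of the upstream reference; it assumes the same sc SparkContext used in the examples above) showing descending order, a case-insensitive keyfunc, and that the result keeps the requested partition count:

>>> pairs = sc.parallelize([('b', 2), ('a', 1), ('c', 3)])
>>> pairs.sortByKey(ascending=False).collect()  # reverse the key order
[('c', 3), ('b', 2), ('a', 1)]
>>> mixed = sc.parallelize([('B', 2), ('a', 1), ('C', 3)])
>>> mixed.sortByKey(keyfunc=lambda k: k.lower()).collect()  # compare lower-cased keys
[('a', 1), ('B', 2), ('C', 3)]
>>> pairs.sortByKey(numPartitions=2).getNumPartitions()
2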
