
pyspark.RDD.saveAsTextFile

RDD.saveAsTextFile(path, compressionCodecClass=None)

Save this RDD as a text file, using string representations of elements.

New in version 0.7.0.

Parameters
path : str

    path to text file

compressionCodecClass : str, optional

    fully qualified classname of the compression codec class, i.e. “org.apache.hadoop.io.compress.GzipCodec” (None by default)

Examples

>>> import os
>>> import tempfile
>>> from fileinput import input
>>> from glob import glob
>>> with tempfile.TemporaryDirectory(prefix="saveAsTextFile1") as d1:
...     path1 = os.path.join(d1, "text_file1")
...
...     # Write a temporary text file
...     sc.parallelize(range(10)).saveAsTextFile(path1)
...
...     # Load text file as an RDD
...     ''.join(sorted(input(glob(path1 + "/part-0000*"))))
'0\n1\n2\n3\n4\n5\n6\n7\n8\n9\n'

Empty lines are tolerated when saving to text files.

>>> with tempfile.TemporaryDirectory(prefix="saveAsTextFile2") as d2:
...     path2 = os.path.join(d2, "text2_file2")
...
...     # Write another temporary text file
...     sc.parallelize(['', 'foo', '', 'bar', '']).saveAsTextFile(path2)
...
...     # Load text file as an RDD
...     ''.join(sorted(input(glob(path2 + "/part-0000*"))))
'\n\n\nbar\nfoo\n'

Using compressionCodecClass

>>> from fileinput import input, hook_compressed
>>> with tempfile.TemporaryDirectory(prefix="saveAsTextFile3") as d3:
...     path3 = os.path.join(d3, "text3")
...     codec = "org.apache.hadoop.io.compress.GzipCodec"
...
...     # Write another temporary text file with specified codec
...     sc.parallelize(['foo', 'bar']).saveAsTextFile(path3, codec)
...
...     # Load text file as an RDD
...     result = sorted(input(glob(path3 + "/part*.gz"), openhook=hook_compressed))
...     ''.join([r.decode('utf-8') if isinstance(r, bytes) else r for r in result])
'bar\nfoo\n'
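As a side note, part files written with GzipCodec are standard gzip streams, so they can also be read back without Spark using Python's gzip module. A minimal sketch, assuming such a part file (here we write one by hand to stand in for saveAsTextFile output):

```python
import gzip
import os
import tempfile

with tempfile.TemporaryDirectory() as d:
    # Stand-in for a part file produced by saveAsTextFile with GzipCodec
    part = os.path.join(d, "part-00000.gz")
    with gzip.open(part, "wt") as f:
        f.write("foo\nbar\n")

    # Read it back as plain gzip, no Spark required
    with gzip.open(part, "rt") as f:
        lines = f.read().splitlines()

print(lines)  # ['foo', 'bar']
```

This is why the fileinput example above works with `openhook=hook_compressed`: the hook simply dispatches on the `.gz` extension and opens the file with gzip.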
