pyarrow.orc.write_table

pyarrow.orc.write_table(table, where, *, file_version='0.12', batch_size=1024, stripe_size=67108864, compression='uncompressed', compression_block_size=65536, compression_strategy='speed', row_index_stride=10000, padding_tolerance=0.0, dictionary_key_size_threshold=0.0, bloom_filter_columns=None, bloom_filter_fpp=0.05)

Write a table into an ORC file.

Parameters:

table (pyarrow.lib.Table): The table to be written into the ORC file.
where (str or pyarrow.io.NativeFile): Writable target. For passing Python file objects or byte buffers, see pyarrow.io.PythonFileInterface, pyarrow.io.BufferOutputStream or pyarrow.io.FixedSizeBufferWriter.
file_version ({'0.11', '0.12'}, default '0.12'): Determines which ORC file version to use. Hive 0.11 / ORC v0 is the older version, while Hive 0.12 / ORC v1 is the newer one.
batch_size (int, default 1024): Number of rows the ORC writer writes at a time.
stripe_size (int, default 64 * 1024 * 1024): Size of each ORC stripe in bytes.
compression (str, default 'uncompressed'): The compression codec. Valid values: {'UNCOMPRESSED', 'SNAPPY', 'ZLIB', 'LZ4', 'ZSTD'}. Note that LZO is currently not supported.
compression_block_size (int, default 64 * 1024): Size of each compression block in bytes.
compression_strategy (str, default 'speed'): The compression strategy, i.e. speed vs. size reduction. Valid values: {'SPEED', 'COMPRESSION'}.
row_index_stride (int, default 10000): The row index stride, i.e. the number of rows per entry in the row index.
padding_tolerance (double, default 0.0): The padding tolerance.
dictionary_key_size_threshold (double, default 0.0): The dictionary key size threshold. Set to 0 to disable dictionary encoding, or to 1 to always enable it.
bloom_filter_columns (None, set-like or list-like, default None): Columns that use the bloom filter.
bloom_filter_fpp (double, default 0.05): Upper limit of the false-positive rate of the bloom filter.
