pyarrow.dataset.CsvFileFormat

class pyarrow.dataset.CsvFileFormat(ParseOptions parse_options=None, default_fragment_scan_options=None, ConvertOptions convert_options=None, ReadOptions read_options=None)

Bases: FileFormat

FileFormat for CSV files.

Parameters:
parse_options : pyarrow.csv.ParseOptions

Options regarding CSV parsing.

default_fragment_scan_options : CsvFragmentScanOptions

Default options for fragment scans.

convert_options : pyarrow.csv.ConvertOptions

Options regarding value conversion.

read_options : pyarrow.csv.ReadOptions

General read options.

__init__(*args, **kwargs)

Methods

__init__(*args, **kwargs)

equals(self, CsvFileFormat other)

inspect(self, file[, filesystem])

Infer the schema of a file.

make_fragment(self, file[, filesystem, ...])

Make a FileFragment from a given file.

make_write_options(self, **kwargs)

Attributes

default_extname
default_fragment_scan_options
equals(self, CsvFileFormat other)
Parameters:
other : pyarrow.dataset.CsvFileFormat
Returns:
bool
inspect(self, file, filesystem=None)

Infer the schema of a file.

Parameters:
file : file-like object, path-like or str

The file or file path to infer a schema from.

filesystem : Filesystem, optional

If filesystem is given, file must be a string and specifies the path of the file to read from the filesystem.

Returns:
schema : Schema

The schema inferred from the file.

make_fragment(self, file, filesystem=None, Expression partition_expression=None, *, file_size=None)

Make a FileFragment from a given file.

Parameters:
file : file-like object, path-like or str

The file or file path to make a fragment from.

filesystem : Filesystem, optional

If filesystem is given, file must be a string and specifies the path of the file to read from the filesystem.

partition_expression : Expression, optional

An expression that is guaranteed true for all rows in the fragment. Allows the fragment to be skipped while scanning with a filter.

file_size : int, optional

The size of the file in bytes. Can improve performance with high-latency filesystems when the file size needs to be known before reading.

Returns:
fragment : Fragment

The file fragment.

make_write_options(self, **kwargs)
Parameters:
**kwargs : dict
Returns:
pyarrow.csv.WriteOptions
parse_options