pyarrow.RecordBatchReader

class pyarrow.RecordBatchReader

Bases: _Weakrefable

Base class for reading a stream of record batches.

Record batch readers function as iterators of record batches that also provide the schema (without the need to get any batches).

Warning

Do not call this class’s constructor directly; use one of the RecordBatchReader.from_* functions instead.

Notes

To import and export using the Arrow C stream interface, use the _import_from_c and _export_to_c methods. However, keep in mind this interface is intended for expert users.

Examples

>>> import pyarrow as pa
>>> schema = pa.schema([('x', pa.int64())])
>>> def iter_record_batches():
...     for i in range(2):
...         yield pa.RecordBatch.from_arrays([pa.array([1, 2, 3])], schema=schema)
>>> reader = pa.RecordBatchReader.from_batches(schema, iter_record_batches())
>>> print(reader.schema)
x: int64
>>> for batch in reader:
...     print(batch)
pyarrow.RecordBatch
x: int64
----
x: [1,2,3]
pyarrow.RecordBatch
x: int64
----
x: [1,2,3]

__init__(*args, **kwargs)

Methods

`__init__`(*args, **kwargs)
`cast`(self, target_schema)	Wrap this reader with one that casts each batch lazily as it is pulled.
`close`(self)	Release any resources associated with the reader.
`from_batches`(Schema schema, batches)	Create RecordBatchReader from an iterable of batches.
`from_stream`(data[, schema])	Create RecordBatchReader from an Arrow-compatible stream object.
`iter_batches_with_custom_metadata`(self)	Iterate over record batches from the stream along with their custom metadata.
`read_all`(self)	Read all record batches as a pyarrow.Table.
`read_next_batch`(self)	Read next RecordBatch from the stream.
`read_next_batch_with_custom_metadata`(self)	Read next RecordBatch from the stream along with its custom metadata.
`read_pandas`(self, **options)	Read contents of stream to a pandas.DataFrame.

Attributes

schema

Shared schema of the record batches in the stream.

cast(self, target_schema)

Wrap this reader with one that casts each batch lazily as it is pulled. Currently only a safe cast to target_schema is implemented.

Parameters:

target_schema (Schema): Schema to cast to; the names and order of fields must match.

Returns:

RecordBatchReader

close(self): Release any resources associated with the reader.

static from_batches(Schema schema, batches)

Create RecordBatchReader from an iterable of batches.

Parameters:

schema (Schema): The shared schema of the record batches.
batches (Iterable[RecordBatch]): The batches that this reader will return.

Returns:

reader (RecordBatchReader)

static from_stream(data, schema=None)

Create RecordBatchReader from an Arrow-compatible stream object.

This accepts objects implementing the Arrow PyCapsule Protocol for streams, i.e. objects that have an __arrow_c_stream__ method.

Parameters:

data (Arrow-compatible stream object): Any object that implements the Arrow PyCapsule Protocol for streams.
schema (Schema, default None): The schema to which the stream should be cast, if supported by the stream object.

Returns:

RecordBatchReader

iter_batches_with_custom_metadata(self)

Iterate over record batches from the stream along with their custom metadata.

Yields:

RecordBatchWithMetadata

read_all(self)

Read all record batches as a pyarrow.Table.

Returns:

Table

read_next_batch(self)
