pyarrow.table

pyarrow.table(data, names=None, schema=None, metadata=None, nthreads=None)

Create a pyarrow.Table from a Python data structure or sequence of arrays.

Parameters:
data : dict, list, pandas.DataFrame, Arrow-compatible table

A mapping of strings to Arrays or Python lists, a list of arrays or chunked arrays, a pandas DataFrame, or any tabular object implementing the Arrow PyCapsule Protocol (has an __arrow_c_array__, __arrow_c_device_array__ or __arrow_c_stream__ method).

names : list, default None

Column names if a list of arrays is passed as data. Mutually exclusive with the ‘schema’ argument.

schema : Schema, default None

The expected schema of the Arrow Table. If not passed, it will be inferred from the data. Mutually exclusive with the ‘names’ argument. If passed, the output will have exactly this schema: when data is a dict or DataFrame, an error is raised for schema columns that are not found in the data, and additional data not specified in the schema is ignored.

metadata : dict or Mapping, default None

Optional metadata for the schema (if schema not passed).

nthreads : int, default None

For pandas.DataFrame inputs: if greater than 1, convert columns to Arrow in parallel using the indicated number of threads. By default, this follows pyarrow.cpu_count() (may use up to system CPU count threads).

Returns:
Table

Examples

>>> import pyarrow as pa
>>> n_legs = pa.array([2, 4, 5, 100])
>>> animals = pa.array(["Flamingo", "Horse", "Brittle stars", "Centipede"])
>>> names = ["n_legs", "animals"]

Construct a Table from a Python dictionary:

>>> pa.table({"n_legs": n_legs, "animals": animals})
pyarrow.Table
n_legs: int64
animals: string
----
n_legs: [[2,4,5,100]]
animals: [["Flamingo","Horse","Brittle stars","Centipede"]]

Construct a Table from arrays:

>>> pa.table([n_legs, animals], names=names)
pyarrow.Table
n_legs: int64
animals: string
----
n_legs: [[2,4,5,100]]
animals: [["Flamingo","Horse","Brittle stars","Centipede"]]

Construct a Table from arrays with metadata:

>>> my_metadata = {"n_legs": "Number of legs per animal"}
>>> pa.table([n_legs, animals], names=names, metadata=my_metadata).schema
n_legs: int64
animals: string
-- schema metadata --
n_legs: 'Number of legs per animal'

Construct a Table from pandas DataFrame:

>>> import pandas as pd
>>> df = pd.DataFrame({'year': [2020, 2022, 2019, 2021],
...                    'n_legs': [2, 4, 5, 100],
...                    'animals': ["Flamingo", "Horse", "Brittle stars", "Centipede"]})
>>> pa.table(df)
pyarrow.Table
year: int64
n_legs: int64
animals: string
----
year: [[2020,2022,2019,2021]]
n_legs: [[2,4,5,100]]
animals: [["Flamingo","Horse","Brittle stars","Centipede"]]

Construct a Table from pandas DataFrame with pyarrow schema:

>>> my_schema = pa.schema([
...     pa.field('n_legs', pa.int64()),
...     pa.field('animals', pa.string())],
...     metadata={"n_legs": "Number of legs per animal"})
>>> pa.table(df, my_schema).schema
n_legs: int64
animals: string
-- schema metadata --
n_legs: 'Number of legs per animal'
pandas: '{"index_columns": [], "column_indexes": [{"name": null, ...

Construct a Table from chunked arrays:

>>> n_legs = pa.chunked_array([[2, 2, 4], [4, 5, 100]])
>>> animals = pa.chunked_array([["Flamingo", "Parrot", "Dog"], ["Horse", "Brittle stars", "Centipede"]])
>>> table = pa.table([n_legs, animals], names=names)
>>> table
pyarrow.Table
n_legs: int64
animals: string
----
n_legs: [[2,2,4],[4,5,100]]
animals: [["Flamingo","Parrot","Dog"],["Horse","Brittle stars","Centipede"]]