pyarrow.table
- pyarrow.table(data, names=None, schema=None, metadata=None, nthreads=None)
Create a pyarrow.Table from a Python data structure or sequence of arrays.
- Parameters:
- data
dict, list, pandas.DataFrame, Arrow-compatible table
A mapping of strings to Arrays or Python lists, a list of arrays or chunked arrays, a pandas DataFrame, or any tabular object implementing the Arrow PyCapsule Protocol (has an __arrow_c_array__, __arrow_c_device_array__ or __arrow_c_stream__ method).
- names
list, default None
Column names if list of arrays passed as data. Mutually exclusive with ‘schema’ argument.
- schema
Schema, default None
The expected schema of the Arrow Table. If not passed, will be inferred from the data. Mutually exclusive with ‘names’ argument. If passed, the output will have exactly this schema (raising an error when columns are not found in the data and ignoring additional data not specified in the schema, when data is a dict or DataFrame).
- metadata
dict or Mapping, default None
Optional metadata for the schema (if schema not passed).
- nthreads
int, default None
For pandas.DataFrame inputs: if greater than 1, convert columns to Arrow in parallel using indicated number of threads. By default, this follows pyarrow.cpu_count() (may use up to system CPU count threads).
- Returns:
Table
Examples
>>> import pyarrow as pa
>>> n_legs = pa.array([2, 4, 5, 100])
>>> animals = pa.array(["Flamingo", "Horse", "Brittle stars", "Centipede"])
>>> names = ["n_legs", "animals"]
Construct a Table from a python dictionary:
>>> pa.table({"n_legs": n_legs, "animals": animals})
pyarrow.Table
n_legs: int64
animals: string
----
n_legs: [[2,4,5,100]]
animals: [["Flamingo","Horse","Brittle stars","Centipede"]]
Construct a Table from arrays:
>>> pa.table([n_legs, animals], names=names)
pyarrow.Table
n_legs: int64
animals: string
----
n_legs: [[2,4,5,100]]
animals: [["Flamingo","Horse","Brittle stars","Centipede"]]
Construct a Table from arrays with metadata:
>>> my_metadata = {"n_legs": "Number of legs per animal"}
>>> pa.table([n_legs, animals], names=names, metadata=my_metadata).schema
n_legs: int64
animals: string
-- schema metadata --
n_legs: 'Number of legs per animal'
Construct a Table from pandas DataFrame:
>>> import pandas as pd
>>> df = pd.DataFrame({'year': [2020, 2022, 2019, 2021],
...                    'n_legs': [2, 4, 5, 100],
...                    'animals': ["Flamingo", "Horse", "Brittle stars", "Centipede"]})
>>> pa.table(df)
pyarrow.Table
year: int64
n_legs: int64
animals: string
----
year: [[2020,2022,2019,2021]]
n_legs: [[2,4,5,100]]
animals: [["Flamingo","Horse","Brittle stars","Centipede"]]
Construct a Table from pandas DataFrame with pyarrow schema:
>>> my_schema = pa.schema([
...     pa.field('n_legs', pa.int64()),
...     pa.field('animals', pa.string())],
...     metadata={"n_legs": "Number of legs per animal"})
>>> pa.table(df, my_schema).schema
n_legs: int64
animals: string
-- schema metadata --
n_legs: 'Number of legs per animal'
pandas: '{"index_columns": [], "column_indexes": [{"name": null, ...
Construct a Table from chunked arrays:
>>> n_legs = pa.chunked_array([[2, 2, 4], [4, 5, 100]])
>>> animals = pa.chunked_array([["Flamingo", "Parrot", "Dog"], ["Horse", "Brittle stars", "Centipede"]])
>>> table = pa.table([n_legs, animals], names=names)
>>> table
pyarrow.Table
n_legs: int64
animals: string
----
n_legs: [[2,2,4],[4,5,100]]
animals: [["Flamingo","Parrot","Dog"],["Horse","Brittle stars","Centipede"]]

