Installing PyArrow#

System Compatibility#

PyArrow is regularly built and tested on Windows, macOS and various Linux distributions. We strongly recommend using a 64-bit system.

Python Compatibility#

PyArrow is currently compatible with Python 3.10, 3.11, 3.12 and 3.13.
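To check whether the running interpreter is one of the versions listed above, a small script can compare `sys.version_info` against that list. This is only a sketch: the `SUPPORTED` tuple mirrors the versions named in this section, and the helper name `is_supported` is hypothetical, not part of PyArrow.

```python
import sys

# Versions listed in this section; update the tuple if that list changes.
SUPPORTED = ((3, 10), (3, 11), (3, 12), (3, 13))


def is_supported(version_info=sys.version_info):
    """Return True if (major, minor) is one of the versions listed above."""
    return tuple(version_info[:2]) in SUPPORTED


if __name__ == "__main__":
    status = "listed" if is_supported() else "not listed"
    print(f"Python {sys.version_info[0]}.{sys.version_info[1]} is {status} as compatible")
```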

Using Conda#

Install the latest version of PyArrow from conda-forge using Conda:

conda install -c conda-forge pyarrow

Note

While the pyarrow conda-forge package is the right choice for most users, both a minimal and maximal variant of the package exist, either of which may be better for your use case. See Differences between conda-forge packages.

Using Pip#

Install the latest version from PyPI (Windows, Linux, and macOS):

pip install pyarrow

If you encounter any issues importing the pip wheels on Windows, you may need to install the latest Visual C++ Redistributable for Visual Studio.

Warning

On Linux, you will need pip >= 19.0 to detect the prebuilt binary packages.
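One way to check the pip requirement from Python is to read the installed pip version through `importlib.metadata` and compare it with 19.0. The sketch below makes a simplifying assumption: only the leading major and minor numbers are compared, and the helper names are hypothetical.

```python
from importlib import metadata

MIN_PIP = (19, 0)  # minimum stated in the warning above


def parse_version(text):
    """Turn '23.1.2' into (23, 1); parsing stops at the first non-digit."""
    parts = []
    for piece in text.split(".")[:2]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    while len(parts) < 2:
        parts.append(0)
    return tuple(parts)


def pip_is_new_enough(version_text):
    """Return True if the given pip version is at least MIN_PIP."""
    return parse_version(version_text) >= MIN_PIP


if __name__ == "__main__":
    try:
        installed = metadata.version("pip")
    except metadata.PackageNotFoundError:
        print("pip is not installed in this environment")
    else:
        verdict = "ok" if pip_is_new_enough(installed) else "too old for prebuilt wheels"
        print(installed, verdict)
```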

Installing nightly packages or from source#

See Python Development.

Dependencies#

Optional dependencies

  • NumPy 1.21.2 or higher.

  • pandas 1.3.4 or higher.

  • cffi.

Additional packages PyArrow is compatible with are fsspec and, for timezones, the pytz, dateutil, or tzdata packages.
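To see which of these packages are present in the current environment, their installed versions can be looked up with `importlib.metadata`. This is a minimal sketch: the dictionary and helper name are illustrative, the minimum versions are the ones stated above, and `python-dateutil` is the distribution name under which dateutil is published on PyPI.

```python
from importlib import metadata

# Optional dependencies and their stated minimum versions (None = no minimum given).
OPTIONAL = {"numpy": "1.21.2", "pandas": "1.3.4", "cffi": None}

# Additional compatible packages mentioned above.
EXTRAS = ("fsspec", "pytz", "python-dateutil", "tzdata")


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, version in installed_versions([*OPTIONAL, *EXTRAS]).items():
        print(f"{name}: {version or 'not installed'}")
```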

tzdata on Windows#

While Arrow uses the OS-provided timezone database on Linux and macOS, it requires a user-provided database on Windows. To download and extract the text version of the IANA timezone database, follow the instructions in the C++ Runtime Dependencies or use the PyArrow utility function pyarrow.util.download_tzdata_on_windows(), which does the same.

By default, the timezone database will be detected at %USERPROFILE%\Downloads\tzdata. If the database has been downloaded to a different location, you will need to set a custom path to the database from Python:

>>> import pyarrow as pa
>>> pa.set_timezone_db_path("custom_path")
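The choice between the default location and a custom one can be wrapped in a small helper. The sketch below is an assumption-laden illustration: `default_tzdata_dir` and `configure_timezone_db` are hypothetical names, the fallback to the home directory when `USERPROFILE` is unset is only there so the code runs off Windows, and `pa.set_timezone_db_path` is called only on Windows because the other platforms use the OS-provided database.

```python
import os
import sys


def default_tzdata_dir():
    """Default location described above: <USERPROFILE>/Downloads/tzdata."""
    profile = os.environ.get("USERPROFILE", os.path.expanduser("~"))
    return os.path.join(profile, "Downloads", "tzdata")


def configure_timezone_db(custom_path=None):
    """Point PyArrow at custom_path when it differs from the default.

    Returns the directory PyArrow is expected to use.
    """
    default = default_tzdata_dir()
    if custom_path is None or os.path.normpath(custom_path) == os.path.normpath(default):
        return default  # nothing to configure, the default is detected automatically
    if sys.platform == "win32":
        import pyarrow as pa  # imported lazily; only needed on Windows

        pa.set_timezone_db_path(custom_path)
    return custom_path
```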

You may encounter problems writing datetime data to an ORC file if you install pyarrow with pip. One possible solution to fix this problem:

  1. Install tzdata with pip install tzdata

  2. Set the environment variable TZDIR=path\to\.venv\Lib\site-packages\tzdata\

You can find where tzdata is installed with the following Python command:

>>> import tzdata
>>> print(tzdata.__file__)
path\to\.venv\Lib\site-packages\tzdata\__init__.py
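The two steps can also be combined in a script that locates the installed tzdata package and sets `TZDIR` for the current process. This is a sketch under assumptions: the helper names are hypothetical, and it assumes that setting the variable in the process environment before writing ORC data is sufficient, which this page does not confirm.

```python
import importlib.util
import os


def tzdata_dir():
    """Return the directory of the installed tzdata package, or None if absent."""
    spec = importlib.util.find_spec("tzdata")
    if spec is None or spec.origin is None:
        return None
    # spec.origin points at tzdata/__init__.py; TZDIR should be its directory.
    return os.path.dirname(spec.origin)


def set_tzdir(environ=os.environ):
    """Set TZDIR in the given mapping when tzdata is installed; return the path used."""
    path = tzdata_dir()
    if path is not None:
        environ["TZDIR"] = path
    return path


if __name__ == "__main__":
    print(set_tzdir() or "tzdata is not installed; run: pip install tzdata")
```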

Differences between conda-forge packages#

On conda-forge, PyArrow is published as three separate packages, each providing varying levels of functionality. This is in contrast to PyPI, where only a single PyArrow package is provided.

The purpose of this split is to minimize the size of the installed package for most users (pyarrow), provide a smaller, minimal package for specialized use cases (pyarrow-core), while still providing a complete package for users who require it (pyarrow-all). What was historically pyarrow on conda-forge is now pyarrow-all, though most users can continue using pyarrow.

The pyarrow-core package includes the following functionality:

The pyarrow package adds the following:

  • Acero (i.e., pyarrow.acero)

  • Tabular Datasets (i.e., pyarrow.dataset)

  • Parquet (i.e., pyarrow.parquet)

  • Substrait (i.e., pyarrow.substrait)

Finally, pyarrow-all adds:

  • Arrow Flight RPC and Flight SQL (i.e., pyarrow.flight)

  • Gandiva (i.e., pyarrow.gandiva)
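To find out which of these optional modules are usable in an existing environment, one approach is simply to try importing each of them. The sketch below groups the module names from the lists above by the package that adds them; the helper names are hypothetical, and a failed import is treated as "not available" without distinguishing the cause.

```python
import importlib

# Module names from the lists above, grouped by the package that adds them.
MODULES_BY_PACKAGE = {
    "pyarrow": ["pyarrow.acero", "pyarrow.dataset", "pyarrow.parquet", "pyarrow.substrait"],
    "pyarrow-all": ["pyarrow.flight", "pyarrow.gandiva"],
}


def importable(module_name):
    """Return True if the module can be imported in this environment."""
    try:
        importlib.import_module(module_name)
    except ImportError:
        return False
    return True


def report():
    """Map each package to {module name: importable?} for its added modules."""
    return {
        package: {name: importable(name) for name in modules}
        for package, modules in MODULES_BY_PACKAGE.items()
    }


if __name__ == "__main__":
    for package, modules in report().items():
        for name, ok in modules.items():
            print(f"{package:12} {name:20} {'available' if ok else 'missing'}")
```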

The following table lists the functionality provided by each package and may be useful when deciding to use one package over another or when Creating A Custom Selection.

Component    Package               pyarrow-core   pyarrow   pyarrow-all
Core         pyarrow-core          yes            yes       yes
Parquet      libparquet            no             yes       yes
Dataset      libarrow-dataset      no             yes       yes
Acero        libarrow-acero        no             yes       yes
Substrait    libarrow-substrait    no             yes       yes
Flight       libarrow-flight       no             no        yes
Flight SQL   libarrow-flight-sql   no             no        yes
Gandiva      libarrow-gandiva      no             no        yes

Creating A Custom Selection#

If you know which components you need and want to control what’s installed, you can create a custom selection of packages to include only the extra features you need. For example, to install pyarrow-core and add support for reading and writing Parquet, install libparquet alongside pyarrow-core:

conda install -c conda-forge pyarrow-core libparquet

Or if you wish to use pyarrow but need support for Flight RPC:

conda install -c conda-forge pyarrow libarrow-flight
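The same kind of selection can be assembled programmatically from the component-to-package mapping in the table. The following is a minimal sketch: `COMPONENT_PACKAGES` copies the package names from this section, while the function name and its `base` parameter are hypothetical. It only builds the command as a list of arguments and does not run it.

```python
# Component -> conda-forge package, copied from the table in this section.
COMPONENT_PACKAGES = {
    "Parquet": "libparquet",
    "Dataset": "libarrow-dataset",
    "Acero": "libarrow-acero",
    "Substrait": "libarrow-substrait",
    "Flight": "libarrow-flight",
    "Flight SQL": "libarrow-flight-sql",
    "Gandiva": "libarrow-gandiva",
}


def conda_install_command(components, base="pyarrow-core"):
    """Build a conda install command for `base` plus the requested components."""
    packages = [base]
    for component in components:
        package = COMPONENT_PACKAGES[component]  # KeyError for unknown components
        if package not in packages:
            packages.append(package)
    return ["conda", "install", "-c", "conda-forge", *packages]


if __name__ == "__main__":
    print(" ".join(conda_install_command(["Parquet"])))
    print(" ".join(conda_install_command(["Flight"], base="pyarrow")))
```

Returning a list rather than a string makes it straightforward to pass the result to `subprocess.run` if the command should actually be executed.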