Using Arrow C++ in your own project#

This section assumes you already have the Arrow C++ libraries on your system, either after installing them using a package manager or after building them yourself.

The recommended way to integrate the Arrow C++ libraries in your own C++ project is to use CMake’s find_package function for locating and integrating dependencies. If you don’t use CMake as a build system, you can use pkg-config to find the installed Arrow C++ libraries.

CMake#

Basic usage#

This minimal CMakeLists.txt file compiles a my_example.cc source file into an executable linked with the Arrow C++ shared library:

cmake_minimum_required(VERSION 3.25)

project(MyExample)

find_package(Arrow REQUIRED)

add_executable(my_example my_example.cc)
target_link_libraries(my_example PRIVATE Arrow::arrow_shared)

Available variables and targets#

The directive find_package(Arrow REQUIRED) asks CMake to find an Arrow C++ installation on your system. When it returns, it will have set a few CMake variables:

  • ${Arrow_FOUND} is true if the Arrow C++ libraries have been found

  • ${ARROW_VERSION} contains the Arrow version string

  • ${ARROW_FULL_SO_VERSION} contains the Arrow DLL version string

In addition, it will have created some targets that you can link against (note these are plain strings, not variables):

  • Arrow::arrow_shared links to the Arrow shared libraries

  • Arrow::arrow_static links to the Arrow static libraries

For backwards compatibility purposes the arrow_shared and arrow_static targets are also available, but we recommend using Arrow::arrow_shared and Arrow::arrow_static respectively.

In most cases, it is recommended to use the Arrow shared libraries.

If Arrow is installed on a custom path instead of a common system one, you will have to add the path where Arrow is installed to CMAKE_PREFIX_PATH.

CMAKE_PREFIX_PATH can be defined as a CMake variable or an environment variable.

Your system might already have a CMAKE_PREFIX_PATH environment variable defined. Use the following to extend it with the path to your Arrow installation. In this case ARROW_ROOT is expected to contain the path to your Arrow installation:

export CMAKE_PREFIX_PATH=${ARROW_ROOT}${CMAKE_PREFIX_PATH:+:${CMAKE_PREFIX_PATH}}
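The ${CMAKE_PREFIX_PATH:+:...} expansion adds the ":" separator only when CMAKE_PREFIX_PATH is already set and non-empty, which avoids a stray colon in the result. As a minimal sketch of that behaviour, the hypothetical helper prepend_prefix below (not part of Arrow or CMake) prints the value that would result for a given installation prefix and existing path. The paths /opt/arrow and /usr/local are placeholders.

```shell
#!/bin/sh
# prepend_prefix is a hypothetical helper used only for illustration.
# $1: Arrow installation prefix (what the text calls ARROW_ROOT)
# $2: current value of CMAKE_PREFIX_PATH, possibly empty
prepend_prefix() {
    arrow_root=$1
    existing=$2
    # ${existing:+:${existing}} expands to ":<existing>" only when
    # existing is set and non-empty; otherwise it expands to nothing.
    printf '%s\n' "${arrow_root}${existing:+:${existing}}"
}

prepend_prefix /opt/arrow ""           # prints /opt/arrow
prepend_prefix /opt/arrow /usr/local   # prints /opt/arrow:/usr/local
```

Putting the Arrow prefix first means CMake searches it before any prefixes that were already listed.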

If you use a CMake variable instead, you can pass it when configuring the project as follows, which also preserves any existing CMAKE_PREFIX_PATH environment variable:

cmake ... -DCMAKE_PREFIX_PATH=${ARROW_ROOT}${CMAKE_PREFIX_PATH:+:${CMAKE_PREFIX_PATH}}

Note

Using COMPONENTS with our find_package implementation is currently not supported.

Other available packages#

Other packages are available and can also be used with the find_package directive. This is the list of available packages:

  • ArrowCUDA

  • ArrowAcero

  • ArrowCompute

  • ArrowDataset

  • ArrowFlight

  • ArrowFlightSql

  • ArrowFlightTesting

  • ArrowSubstrait

  • ArrowTesting

  • Gandiva

  • Parquet

Usage with find_package and target names follows a consistent naming pattern:

  • find_package usage: find_package(PackageName REQUIRED)

  • Shared Target: PackageName::package_name_shared

  • Static Target: PackageName::package_name_static

For example, to use the ArrowCompute package:

  • find_package Usage: find_package(ArrowCompute REQUIRED)

  • Shared Target: ArrowCompute::arrow_compute_shared

  • Static Target: ArrowCompute::arrow_compute_static
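The pattern amounts to converting the CamelCase package name to snake_case and appending _shared or _static. As a rough sketch, the hypothetical helper arrow_target below (an illustration only, not something Arrow ships) derives the target name from a package name using sed and tr:

```shell
#!/bin/sh
# arrow_target is a hypothetical helper, shown only to illustrate the
# naming pattern; Arrow itself does not provide it.
# $1: CMake package name, e.g. ArrowCompute
# $2: linkage, either "shared" or "static" (defaults to shared)
arrow_target() {
    package=$1
    linkage=${2:-shared}
    # Insert "_" at each lower-to-upper boundary, then lowercase everything:
    # ArrowFlightSql -> Arrow_Flight_Sql -> arrow_flight_sql
    snake=$(printf '%s\n' "$package" \
        | sed -E 's/([a-z0-9])([A-Z])/\1_\2/g' \
        | tr '[:upper:]' '[:lower:]')
    printf '%s::%s_%s\n' "$package" "$snake" "$linkage"
}

arrow_target ArrowCompute            # prints ArrowCompute::arrow_compute_shared
arrow_target ArrowFlightSql static   # prints ArrowFlightSql::arrow_flight_sql_static
arrow_target Parquet                 # prints Parquet::parquet_shared
```

The printed name is what would be passed to target_link_libraries after the matching find_package call.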

Note

CMake is case-sensitive. The names and variables listed above have to be spelt exactly that way!

See also

A Docker-based minimal build example.

pkg-config#

Basic usage#

You can get suitable build flags with the following command line:

pkg-config --cflags --libs arrow

If you want to link the Arrow C++ static library, you need to add the --static option:

pkg-config --cflags --libs --static arrow
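Before wiring these flags into a build, it can be useful to confirm that pkg-config can actually see the package. The sketch below wraps that check in a hypothetical helper named print_arrow_flags; it relies only on the standard pkg-config options --exists, --modversion, --cflags and --libs. If Arrow lives under a custom prefix, pkg-config typically needs that prefix's pkgconfig directory listed in PKG_CONFIG_PATH (the exact directory depends on how Arrow was installed, so treat that detail as an assumption).

```shell
#!/bin/sh
# print_arrow_flags is a hypothetical helper, not part of Arrow.
# $1: pkg-config package name (defaults to "arrow")
# Prints the version and build flags, or returns 1 if the package
# (or pkg-config itself) cannot be found.
print_arrow_flags() {
    package=${1:-arrow}
    if ! command -v pkg-config >/dev/null 2>&1; then
        echo "pkg-config is not installed" >&2
        return 1
    fi
    if ! pkg-config --exists "$package"; then
        echo "package '$package' not found; check PKG_CONFIG_PATH" >&2
        return 1
    fi
    echo "version: $(pkg-config --modversion "$package")"
    echo "flags:   $(pkg-config --cflags --libs "$package")"
}

# Example: report the flags for the core library, tolerating failure.
print_arrow_flags arrow || echo "Arrow is not visible to pkg-config"
```

The same helper can be pointed at any other package name that pkg-config knows about.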

This minimal Makefile compiles a my_example.cc source file into an executable linked with the Arrow C++ shared library:

my_example: my_example.cc
	$(CXX) -o $@ $(CXXFLAGS) $< $$(pkg-config --cflags --libs arrow)

Many build systems support pkg-config, for example CMake (although find_package is the recommended route there) and Meson.

Available packages#

Arrow C++ provides a pkg-config package for each module. Here are all available packages:

  • arrow-csv

  • arrow-cuda

  • arrow-dataset

  • arrow-filesystem

  • arrow-flight-testing

  • arrow-flight

  • arrow-json

  • arrow-orc

  • arrow-python-flight

  • arrow-python

  • arrow-tensorflow

  • arrow-testing

  • arrow

  • gandiva

  • parquet

A Note on Linking#

Some Arrow components have dependencies that you may want to use in your own project. Care must be taken to ensure that your project links the same version of these dependencies in the same way (statically or dynamically) as Arrow; otherwise One Definition Rule (ODR) violations may result and your program may crash or silently corrupt data.

In particular, Arrow Flight and its dependencies Protocol Buffers (Protobuf) and gRPC are likely to cause issues. When using Arrow Flight, note the following guidelines:

  • If statically linking Arrow Flight, Protobuf and gRPC must also be statically linked, and the same goes for dynamic linking.

  • Some platforms (e.g. Ubuntu 20.04 at the time of this writing) may ship a version of Protobuf and/or gRPC that is not recent enough for Arrow Flight. In that case, Arrow Flight bundles these dependencies, so care must be taken not to mix the Arrow Flight library with the platform Protobuf/gRPC libraries (as then you will have two versions of Protobuf and/or gRPC linked into your application).

It may be easiest to depend on a version of Arrow built from source, where you can control the source of each dependency and whether it is statically or dynamically linked. See Building Arrow C++ for instructions. Alternatively, use Arrow from a package manager such as Conda or vcpkg, which will manage consistent versions of Arrow and its dependencies.

Runtime Dependencies#

While Arrow uses the OS-provided timezone database on Linux and macOS, it requires a user-provided database on Windows. You must download and extract the text version of the IANA timezone database and add the Windows timezone mapping XML.

By default, the timezone database will be detected at %USERPROFILE%\Downloads\tzdata, but you can set a custom path at runtime in arrow::GlobalOptions:

arrow::GlobalOptions options;
options.timezone_db_path = "path/to/tzdata";
ARROW_RETURN_NOT_OK(arrow::Initialize(options));