Using Arrow C++ in your own project#
This section assumes you already have the Arrow C++ libraries on your system, either after installing them using a package manager or after building them yourself.
The recommended way to integrate the Arrow C++ libraries in your own C++ project is to use CMake’s find_package function for locating and integrating dependencies. If you don’t use CMake as a build system, you can use pkg-config to find the installed Arrow C++ libraries.
CMake#
Basic usage#
This minimal CMakeLists.txt file compiles a my_example.cc source file into an executable linked with the Arrow C++ shared library:
cmake_minimum_required(VERSION 3.25)
project(MyExample)
find_package(Arrow REQUIRED)
add_executable(my_example my_example.cc)
target_link_libraries(my_example PRIVATE Arrow::arrow_shared)
Available variables and targets#
The directive find_package(Arrow REQUIRED) asks CMake to find an Arrow C++ installation on your system. When it returns, it will have set a few CMake variables:
${Arrow_FOUND} is true if the Arrow C++ libraries have been found
${ARROW_VERSION} contains the Arrow version string
${ARROW_FULL_SO_VERSION} contains the Arrow DLL version string
In addition, it will have created some targets that you can link against (note these are plain strings, not variables):
Arrow::arrow_shared links to the Arrow shared libraries
Arrow::arrow_static links to the Arrow static libraries
For backwards compatibility purposes the arrow_shared and arrow_static targets are also available, but we recommend using Arrow::arrow_shared and Arrow::arrow_static respectively.
In most cases, it is recommended to use the Arrow shared libraries.
If Arrow is installed on a custom path instead of a common system one, you will have to add the path where Arrow is installed to CMAKE_PREFIX_PATH.
CMAKE_PREFIX_PATH can be defined as a CMake variable or an environment variable.
Your system might already have a CMAKE_PREFIX_PATH environment variable defined. Use the following to extend it with the path to your Arrow installation. In this case ARROW_ROOT is expected to contain the path to your Arrow installation:
export CMAKE_PREFIX_PATH=${ARROW_ROOT}${CMAKE_PREFIX_PATH:+:${CMAKE_PREFIX_PATH}}
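The ${VAR:+...} form in the export line above is standard POSIX shell parameter expansion: the colon separator and the previous value are appended only when CMAKE_PREFIX_PATH is already set and non-empty. The following is a minimal sketch of that behaviour. The helper name prepend_prefix_path and the paths /opt/arrow and /usr/local are hypothetical and used only for illustration:

```shell
#!/bin/sh
# Hypothetical helper (not part of Arrow or CMake): prints the value that
# CMAKE_PREFIX_PATH would take after prepending an Arrow installation path.
#   $1: Arrow installation path (the role played by ARROW_ROOT)
#   $2: existing CMAKE_PREFIX_PATH value, possibly empty
prepend_prefix_path() {
  arrow_root=$1
  existing=$2
  # ${existing:+:${existing}} expands to ":<existing>" only when
  # existing is set and non-empty; otherwise it expands to nothing.
  printf '%s\n' "${arrow_root}${existing:+:${existing}}"
}

prepend_prefix_path /opt/arrow ""          # prints /opt/arrow
prepend_prefix_path /opt/arrow /usr/local  # prints /opt/arrow:/usr/local
```

Without the conditional part, an empty CMAKE_PREFIX_PATH would leave a trailing colon in the result.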
If you use a CMake variable instead, you can set it when configuring the project. The following also carries over any existing CMAKE_PREFIX_PATH environment variable:
cmake ... -DCMAKE_PREFIX_PATH=${ARROW_ROOT}${CMAKE_PREFIX_PATH:+:${CMAKE_PREFIX_PATH}}
Note
Using COMPONENTS with our find_package implementation is currently not supported.
Other available packages#
Other packages are also available and can be used with the find_package directive. This is the list of available packages:
ArrowCUDA
ArrowAcero
ArrowCompute
ArrowDataset
ArrowFlight
ArrowFlightSql
ArrowFlightTesting
ArrowSubstrait
ArrowTesting
Gandiva
Parquet
Usage with find_package and target names follows a consistent naming pattern:
find_package usage: find_package(PackageName REQUIRED)
Shared target: PackageName::package_name_shared
Static target: PackageName::package_name_static
For example, to use the ArrowCompute package:
find_package usage: find_package(ArrowCompute REQUIRED)
Shared target: ArrowCompute::arrow_compute_shared
Static target: ArrowCompute::arrow_compute_static
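The naming pattern above can be sketched as a small shell helper that derives both target names from a package name. The function arrow_target_names is hypothetical and only mirrors the CamelCase-to-snake_case convention described here. The authoritative names are the ones exported by the installed CMake package files, so treat the output as a starting point rather than a guarantee:

```shell
#!/bin/sh
# Hypothetical helper (not part of Arrow): derives the shared and static
# CMake target names that the naming pattern implies for a package name.
arrow_target_names() {
  pkg=$1
  # Insert "_" at each lowercase-or-digit to uppercase boundary,
  # then lowercase the result: ArrowCompute -> arrow_compute.
  snake=$(printf '%s' "$pkg" \
    | sed -E 's/([a-z0-9])([A-Z])/\1_\2/g' \
    | tr '[:upper:]' '[:lower:]')
  printf '%s::%s_shared\n' "$pkg" "$snake"
  printf '%s::%s_static\n' "$pkg" "$snake"
}

arrow_target_names ArrowCompute
# prints:
#   ArrowCompute::arrow_compute_shared
#   ArrowCompute::arrow_compute_static
```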
Note
CMake is case-sensitive. The names and variables listed above have to be spelt exactly that way!
See also
A Docker-based minimal build example.
pkg-config#
Basic usage#
You can get suitable build flags with the following command line:
pkg-config --cflags --libs arrow
If you want to link the Arrow C++ static library, you need to add the --static option:
pkg-config --cflags --libs --static arrow
This minimal Makefile compiles a my_example.cc source file into an executable linked with the Arrow C++ shared library:
my_example: my_example.cc
	$(CXX) -o $@ $(CXXFLAGS) $< $$(pkg-config --cflags --libs arrow)
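When the pkg-config call is embedded in a Makefile or build script, a missing package tends to surface late, as a compiler or linker error. The following is a small sketch of a wrapper that checks for the package first. The function name arrow_build_flags is hypothetical, while pkg-config --exists and the PKG_CONFIG_PATH environment variable are standard pkg-config features:

```shell
#!/bin/sh
# Hypothetical helper (not part of Arrow): prints compiler and linker flags
# for a pkg-config package, or explains why they could not be determined.
#   $1: package name, defaults to "arrow"
arrow_build_flags() {
  pkg=${1:-arrow}
  if ! command -v pkg-config >/dev/null 2>&1; then
    echo "pkg-config is not installed" >&2
    return 2
  fi
  if ! pkg-config --exists "$pkg"; then
    # For installations in a custom prefix, the directory containing the
    # .pc files usually has to be listed in PKG_CONFIG_PATH.
    echo "package '$pkg' not found; check PKG_CONFIG_PATH" >&2
    return 1
  fi
  pkg-config --cflags --libs "$pkg"
}

# Example: compile my_example.cc only if the flags could be resolved.
# flags=$(arrow_build_flags arrow) && c++ -o my_example my_example.cc $flags
```

The function returns a non-zero status without printing flags when the package cannot be found, so a calling script can stop before invoking the compiler.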
Many build systems support pkg-config. For example:
CMake (But you should use find_package(Arrow) instead.)
Available packages#
Arrow C++ provides a pkg-config package for each module. Here are all available packages:
arrow-csv
arrow-cuda
arrow-dataset
arrow-filesystem
arrow-flight-testing
arrow-flight
arrow-json
arrow-orc
arrow-python-flight
arrow-python
arrow-tensorflow
arrow-testing
arrow
gandiva
parquet
A Note on Linking#
Some Arrow components have dependencies that you may want to use in your own project. Care must be taken to ensure that your project links the same version of these dependencies in the same way (statically or dynamically) as Arrow, else ODR violations may result and your program may crash or silently corrupt data.
In particular, Arrow Flight and its dependencies Protocol Buffers (Protobuf) and gRPC are likely to cause issues. When using Arrow Flight, note the following guidelines:
If statically linking Arrow Flight, Protobuf and gRPC must also be statically linked, and the same goes for dynamic linking.
Some platforms (e.g. Ubuntu 20.04 at the time of this writing) may ship a version of Protobuf and/or gRPC that is not recent enough for Arrow Flight. In that case, Arrow Flight bundles these dependencies, so care must be taken not to mix the Arrow Flight library with the platform Protobuf/gRPC libraries (as then you will have two versions of Protobuf and/or gRPC linked into your application).
It may be easiest to depend on a version of Arrow built from source, where you can control the source of each dependency and whether it is statically or dynamically linked. See Building Arrow C++ for instructions. Alternatively, use Arrow from a package manager such as Conda or vcpkg, which will manage consistent versions of Arrow and its dependencies.
Runtime Dependencies#
While Arrow uses the OS-provided timezone database on Linux and macOS, it requires a user-provided database on Windows. You must download and extract the text version of the IANA timezone database and add the Windows timezone mapping XML. To download, you can use the following batch script:
By default, the timezone database will be detected at %USERPROFILE%\Downloads\tzdata, but you can set a custom path at runtime in arrow::GlobalOptions:
arrow::GlobalOptions options;
options.timezone_db_path = "path/to/tzdata";
ARROW_RETURN_NOT_OK(arrow::Initialize(options));

