Developing on Windows

As on Linux and macOS, we have worked to make builds work “out of the box” with CMake for a reasonably large subset of the project.

System Setup

Microsoft provides the free Visual Studio Community edition. When doing development in the shell, you must initialize the development environment each time you open the shell.

For Visual Studio 2017, execute the following batch script:

"C:\Program Files (x86)\Microsoft Visual Studio\2017\Community\Common7\Tools\VsDevCmd.bat" -arch=amd64

For Visual Studio 2019, the script is:

"C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\Common7\Tools\VsDevCmd.bat" -arch=amd64

One can configure a console emulator like cmder to automatically launch this when starting a new development console.
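The two initialization commands above differ only in the release year. As a rough sketch of that pattern, the Python helper below assembles the command line for a given release. The function name `vsdevcmd_command` and its parameters are hypothetical, and the default install location is assumed to be the one shown above; a customized installation would need a different root.

```python
from pathlib import PureWindowsPath

# Assumption: Visual Studio is installed in the default location shown above.
VS_ROOT = PureWindowsPath(r"C:\Program Files (x86)\Microsoft Visual Studio")


def vsdevcmd_command(year, edition="Community", arch="amd64"):
    """Return the quoted VsDevCmd.bat invocation for one Visual Studio release.

    Hypothetical helper for illustration; it only builds a string and does
    not check that the script exists on disk.
    """
    script = VS_ROOT / year / edition / "Common7" / "Tools" / "VsDevCmd.bat"
    return f'"{script}" -arch={arch}'


if __name__ == "__main__":
    print(vsdevcmd_command("2017"))
    print(vsdevcmd_command("2019"))
```

A console emulator's startup task could be pointed at the string this returns.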

Using conda-forge for build dependencies

Miniconda is a minimal Python distribution including the conda package manager. Some members of the Apache Arrow community participate in the maintenance of conda-forge, a community-maintained cross-platform package repository for conda.

To use conda-forge for your C++ build dependencies on Windows, first download and install a 64-bit distribution from the Miniconda homepage.

To configure conda to use the conda-forge channel by default, launch a command prompt (cmd.exe), run the initialization command shown above (vcvarsall.bat or VsDevCmd.bat), then run the command:

conda config --add channels conda-forge

Now, you can bootstrap a build environment (call from the root directory of the Arrow codebase):

conda create -y -n arrow-dev --file=ci\conda_env_cpp.txt

Then “activate” this conda environment with:

activate arrow-dev

If the environment has been activated, the Arrow build system will automatically see the %CONDA_PREFIX% environment variable and use that for resolving the build dependencies. This is equivalent to setting

-DARROW_DEPENDENCY_SOURCE=SYSTEM ^
-DARROW_PACKAGE_PREFIX=%CONDA_PREFIX%\Library
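To make the equivalence concrete, the sketch below derives those two flags from an environment mapping. It is only an illustration of the documented result, not how Arrow's CMake scripts implement the detection; the function name `conda_dependency_flags` is hypothetical.

```python
from pathlib import PureWindowsPath


def conda_dependency_flags(env):
    """Return the CMake flags that correspond to an activated conda environment.

    Hypothetical helper: `env` is any mapping of environment variables,
    for example os.environ. Returns an empty list when no environment
    is activated (CONDA_PREFIX unset or empty).
    """
    prefix = env.get("CONDA_PREFIX")
    if not prefix:
        return []
    return [
        "-DARROW_DEPENDENCY_SOURCE=SYSTEM",
        f"-DARROW_PACKAGE_PREFIX={PureWindowsPath(prefix) / 'Library'}",
    ]
```

Passing `os.environ` from an activated prompt would give the flags for the current environment.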

To use the Visual Studio IDE with this conda environment activated, launch it by running the command devenv from the same command prompt.

Note that dependencies installed as conda packages are built in release mode and cannot link with debug builds. If you intend to use -DCMAKE_BUILD_TYPE=debug then you must build the packages from source. -DCMAKE_BUILD_TYPE=relwithdebinfo is also available, which produces a build that can both be linked with release libraries and be debugged.

Note

If you run into problems using conda packages for dependencies, a very common cause is mixing packages from the defaults channel with those from conda-forge. You can examine the installed packages in your environment (and their origin) with conda list.

Using vcpkg for build dependencies

vcpkg is an open source package manager from Microsoft. It hosts community-contributed ports of C and C++ packages and their dependencies. Arrow includes a manifest file cpp/vcpkg.json that specifies which vcpkg packages are required to build the C++ library.

To use vcpkg for C++ build dependencies on Windows, first install and integrate vcpkg. Then change working directory in cmd.exe to the root directory of Arrow and run the command:

vcpkg install ^
  --triplet x64-windows ^
  --x-manifest-root cpp ^
  --feature-flags=versions ^
  --clean-after-build

On Windows, vcpkg builds dynamic link libraries by default. Use the triplet x64-windows-static to build static libraries. vcpkg downloads source packages and compiles them locally, so installing dependencies with vcpkg is more time-consuming than with conda.

Then in your cmake command, to use dependencies installed by vcpkg, set:

-DARROW_DEPENDENCY_SOURCE=VCPKG

You can optionally set other variables to override the default CMake configurations for vcpkg, including:

  • -DCMAKE_TOOLCHAIN_FILE: by default, the CMake scripts automatically find the location of the vcpkg CMake toolchain file vcpkg.cmake; use this to instead specify its location

  • -DVCPKG_TARGET_TRIPLET: by default, the CMake scripts attempt to infer the vcpkg triplet; use this to instead specify the triplet

  • -DARROW_DEPENDENCY_USE_SHARED: default is ON; set to OFF for static libraries

  • -DVCPKG_MANIFEST_MODE: default is ON; set to OFF to ignore the vcpkg.json manifest file and only look for vcpkg packages that are already installed under the directory where vcpkg is installed
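The way these overrides combine can be sketched with a small Python helper that emits only the flags that differ from the documented defaults. The function name `vcpkg_cmake_args` and its parameter names are hypothetical; the flag names are the ones listed above.

```python
def vcpkg_cmake_args(toolchain_file=None, triplet=None,
                     use_shared=True, manifest_mode=True):
    """Build the CMake arguments for a vcpkg-based Arrow build.

    Hypothetical helper: options left at their documented defaults are
    omitted so that the CMake scripts keep inferring them.
    """
    args = ["-DARROW_DEPENDENCY_SOURCE=VCPKG"]
    if toolchain_file is not None:
        args.append(f"-DCMAKE_TOOLCHAIN_FILE={toolchain_file}")
    if triplet is not None:
        args.append(f"-DVCPKG_TARGET_TRIPLET={triplet}")
    if not use_shared:
        args.append("-DARROW_DEPENDENCY_USE_SHARED=OFF")
    if not manifest_mode:
        args.append("-DVCPKG_MANIFEST_MODE=OFF")
    return args
```

For a static build one might call `vcpkg_cmake_args(triplet="x64-windows-static", use_shared=False)`, pairing the static triplet mentioned earlier with the matching linkage option.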

Building using Visual Studio (MSVC) Solution Files

Change working directory in cmd.exe to the root directory of Arrow and do an out-of-source build by generating an MSVC solution:

cd cpp
mkdir build
cd build
cmake .. -G "Visual Studio 16 2019" -A x64 ^
      -DARROW_BUILD_TESTS=ON
cmake --build . --config Release

For newer versions of Visual Studio, specify the generator Visual Studio 17 2022 or see cmake --help for available generators.
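As an illustration of how the generator name follows the Visual Studio release, the sketch below maps a release year to the generator strings used in this document and assembles the configure command as an argument list. The names `GENERATORS` and `configure_command` are hypothetical; `cmake --help` remains the authoritative list for a given CMake installation.

```python
# Generator names as they appear in this document.
GENERATORS = {
    "2017": "Visual Studio 15 2017",
    "2019": "Visual Studio 16 2019",
    "2022": "Visual Studio 17 2022",
}


def configure_command(year, build_tests=True):
    """Return the cmake configure command for a 64-bit MSVC solution build.

    Hypothetical helper; raises KeyError for a release not listed above.
    """
    tests = "ON" if build_tests else "OFF"
    return [
        "cmake", "..",
        "-G", GENERATORS[year],
        "-A", "x64",
        f"-DARROW_BUILD_TESTS={tests}",
    ]
```

The returned list is in the form `subprocess.run` expects, so the generator name needs no extra quoting despite its spaces.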

Building with Ninja and sccache

The Ninja build system offers better build parallelization, and the optional sccache compiler cache keeps track of past compilations to avoid running them over and over again (in a way similar to the Unix-specific ccache).

Newer versions of Visual Studio include Ninja. To see if your Visual Studio includes Ninja, run the initialization command shown above (vcvarsall.bat or VsDevCmd.bat), then run ninja --version.

If Ninja is not included in your version of Visual Studio, and you are using conda, activate your conda environment and install Ninja:

activate arrow-dev
conda install -c conda-forge ninja

If you are not using conda, install Ninja from another source.

After installation is complete, change working directory in cmd.exe to the root directory of Arrow and do an out-of-source build by generating Ninja files:

cd cpp
mkdir build
cd build
cmake -G "Ninja" ^
      -DARROW_BUILD_TESTS=ON ^
      -DGTest_SOURCE=BUNDLED ..
cmake --build . --config Release

To use sccache in local storage mode, you need to set the SCCACHE_DIR environment variable before calling cmake:

...
set SCCACHE_DIR=%LOCALAPPDATA%\Mozilla\sccache
cmake -G "Ninja" ^
...
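When the build is driven from a script rather than typed at the prompt, the same variable has to reach the cmake process. A minimal sketch, assuming a hypothetical helper named `sccache_env` and the cache location shown above:

```python
from pathlib import PureWindowsPath


def sccache_env(base_env):
    """Return a copy of `base_env` with SCCACHE_DIR set for local storage mode.

    Hypothetical helper: mirrors `set SCCACHE_DIR=%LOCALAPPDATA%\\Mozilla\\sccache`
    and raises KeyError if LOCALAPPDATA is not defined.
    """
    env = dict(base_env)
    cache_dir = PureWindowsPath(env["LOCALAPPDATA"]) / "Mozilla" / "sccache"
    env["SCCACHE_DIR"] = str(cache_dir)
    return env
```

The result could then be passed as the `env` argument of `subprocess.run` when invoking cmake, leaving the parent process environment untouched.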

Building with NMake

Change working directory in cmd.exe to the root directory of Arrow and do an out-of-source build using nmake:

cd cpp
mkdir build
cd build
cmake -G "NMake Makefiles" ..
nmake

Building on MSYS2

You can build in an MSYS2 terminal, cmd.exe, or a PowerShell terminal.

In an MSYS2 terminal:

cd cpp
mkdir build
cd build
cmake -G "MSYS Makefiles" ..
make

In cmd.exe or a PowerShell terminal, you can use the following batch file:

setlocal

REM For 64bit
set MINGW_PACKAGE_PREFIX=mingw-w64-x86_64
set MINGW_PREFIX=c:\msys64\mingw64
set MSYSTEM=MINGW64

set PATH=%MINGW_PREFIX%\bin;c:\msys64\usr\bin;%PATH%

rmdir /S /Q cpp\build
mkdir cpp\build
pushd cpp\build
cmake -G "MSYS Makefiles" .. || exit /B
make || exit /B
popd

Building on Windows/ARM64 using Ninja and Clang

Ninja and Clang can be used to build the library on the Windows/ARM64 platform.

cd cpp
mkdir build
cd build

set CC=clang-cl
set CXX=clang-cl

cmake -G "Ninja" ..

cmake --build . --config Release

The LLVM toolchain for Windows on ARM64 can be downloaded from the LLVM release page.

Visual Studio (MSVC) cannot yet be used to compile Windows/ARM64 builds because of compatibility issues in dependencies such as xsimd and Boost.

Note: This is only an experimental build for WoA64, as not all features are extensively tested through CI due to lack of infrastructure.

Debug builds

To build a Debug version of Arrow, you should have pre-installed a Debug version of Boost. It’s recommended to configure cmake with the following variables for Debug build:

  • -DARROW_BOOST_USE_SHARED=OFF: enables static linking with Boost debug libs and simplifies run-time loading of third-party libraries

  • -DBOOST_ROOT: sets the root directory of Boost libs. (Optional)

  • -DBOOST_LIBRARYDIR: sets the directory with Boost lib files. (Optional)

The command line to build Arrow in Debug mode will look something like this:

cd cpp
mkdir build
cd build
cmake .. -G "Visual Studio 15 2017" -A x64 ^
      -DARROW_BOOST_USE_SHARED=OFF ^
      -DCMAKE_BUILD_TYPE=Debug ^
      -DBOOST_ROOT=C:/local/boost_1_63_0 ^
      -DBOOST_LIBRARYDIR=C:/local/boost_1_63_0/lib64-msvc-14.0
cmake --build . --config Debug

Windows dependency resolution issues

Because Windows uses .lib files for both static and dynamic linking of dependencies, the static library is sometimes given a different name, such as %PACKAGE%_static.lib, to distinguish it. If you are statically linking some dependencies, we provide the following options:

  • -DBROTLI_MSVC_STATIC_LIB_SUFFIX=%BROTLI_SUFFIX%

  • -DSNAPPY_MSVC_STATIC_LIB_SUFFIX=%SNAPPY_SUFFIX%

  • -DLZ4_MSVC_STATIC_LIB_SUFFIX=%LZ4_SUFFIX%

  • -DZSTD_MSVC_STATIC_LIB_SUFFIX=%ZSTD_SUFFIX%
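The options above share one naming pattern, which the following sketch spells out. It assumes that all four take the usual CMake -D prefix; the helper name `static_suffix_flags` and the suffix values in the usage note are purely illustrative.

```python
# Packages for which a static-library suffix option is listed above.
SUPPORTED = ("brotli", "snappy", "lz4", "zstd")


def static_suffix_flags(suffixes):
    """Turn {package: suffix} into -D<PACKAGE>_MSVC_STATIC_LIB_SUFFIX flags.

    Hypothetical helper: rejects packages that are not in the documented
    list and returns the flags sorted by package name.
    """
    flags = []
    for package, suffix in sorted(suffixes.items()):
        if package.lower() not in SUPPORTED:
            raise ValueError(f"no suffix option documented for {package!r}")
        flags.append(f"-D{package.upper()}_MSVC_STATIC_LIB_SUFFIX={suffix}")
    return flags
```

For example, `static_suffix_flags({"snappy": "_static"})` yields a single flag for a Snappy library that was installed as snappy_static.lib.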

To get the latest build instructions, you can reference ci/appveyor-built.bat, which is used by automated Appveyor builds.

Statically linking to Arrow on Windows

The Arrow headers on Windows static library builds (enabled by the CMake option ARROW_BUILD_STATIC) use the preprocessor macro ARROW_STATIC to suppress dllimport/dllexport marking of symbols. Projects that statically link against Arrow on Windows additionally need this definition. The Unix builds do not use the macro.

In addition, if using -DARROW_FLIGHT=ON, ARROW_FLIGHT_STATIC needs to be defined, and similarly for -DARROW_FLIGHT_SQL=ON.

project(MyExample)

find_package(Arrow REQUIRED)

add_executable(my_example my_example.cc)
target_link_libraries(my_example
                      PRIVATE
                      arrow_static
                      arrow_flight_static
                      arrow_flight_sql_static)

target_compile_definitions(my_example
                           PUBLIC
                           ARROW_STATIC
                           ARROW_FLIGHT_STATIC
                           ARROW_FLIGHT_SQL_STATIC)

Downloading the Timezone Database

To run some of the compute unit tests on Windows, the IANA timezone database and the Windows timezone mapping need to be downloaded first. See Runtime Dependencies for download instructions. To set a non-default path for the timezone database while running the unit tests, set the ARROW_TIMEZONE_DATABASE environment variable.

Replicating Appveyor Builds

For people who are more familiar with Linux development but need to replicate a failing Appveyor build, here are some rough notes from replicating the Static_Crt_Build (make unittest will probably still fail, but many unit tests can be built with their individual make targets).

  1. Microsoft offers trial VMs for Windows with Microsoft Visual Studio. Download and install a version.

  2. Run the VM and install Git, CMake, and Miniconda or Anaconda (these instructions assume Anaconda). Also install the “Build Tools for Visual Studio”. Make sure to select the C++ toolchain in the installer wizard, and reboot after installation.

  3. Download pre-built Boost debug binaries and install them.

    Run this from an Anaconda/Miniconda command prompt (not PowerShell prompt), and make sure to run “vcvarsall.bat x64” first. The location of vcvarsall.bat varies; it may be under a different path than commonly indicated, e.g. “C:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\VC\Auxiliary\Build\vcvarsall.bat” with the 2019 build tools.

cd $EXTRACT_BOOST_DIRECTORY
.\bootstrap.bat
@rem This is for static libraries needed for static_crt_build in appveyor
.\b2 link=static --with-filesystem --with-regex --with-system install
@rem this should put libraries and headers in c:\Boost
  4. Activate Anaconda/Miniconda:

@rem this might differ for miniconda
C:\Users\User\Anaconda3\Scripts\activate
  5. Clone and change directories to the Arrow source code (you might need to install Git).

  6. Set up environment variables:

@rem Change the build type based on which appveyor job you want.
SET JOB=Static_Crt_Build
SET GENERATOR=Ninja
SET APPVEYOR_BUILD_WORKER_IMAGE=Visual Studio 2017
SET USE_CLCACHE=false
SET ARROW_BUILD_GANDIVA=OFF
SET ARROW_LLVM_VERSION=8.0.*
SET PYTHON=3.9
SET ARCH=64
SET PATH=C:\Users\User\Anaconda3;C:\Users\User\Anaconda3\Scripts;C:\Users\User\Anaconda3\Library\bin;%PATH%
SET BOOST_LIBRARYDIR=C:\Boost\lib
SET BOOST_ROOT=C:\Boost
  7. Run the Appveyor scripts:

conda install -c conda-forge --file .\ci\conda_env_cpp.txt
.\ci\appveyor-cpp-setup.bat
@rem this might fail but at this point most unit tests should be buildable by their individual targets
@rem see next line for example.
.\ci\appveyor-cpp-build.bat
@rem you can also just invoke cmake directly with the desired options
cmake --build . --config Release --target arrow-compute-hash-test