Numba extension for compiling Pandas data frames, Intel® Scalable Dataframe Compiler
IntelPython/sdc
Intel® Scalable Dataframe Compiler (Intel® SDC) is an extension of Numba* that enables compilation of Pandas* operations. It automatically vectorizes and parallelizes the code by leveraging modern hardware instructions and by utilizing all available cores.
Intel® SDC documentation can be found here.
Note: For maximum performance and stability, please use numba from the intel/label/beta channel.
Intel® SDC is available on the Anaconda Cloud intel/label/beta channel. The distribution includes Intel® SDC for Python 3.6 and Python 3.7 for Windows and Linux platforms.
Intel® SDC conda package can be installed using the steps below:
> conda create -n sdc-env python=<3.7 or 3.6> -c anaconda -c conda-forge
> conda activate sdc-env
> conda install sdc -c intel/label/beta -c intel -c defaults -c conda-forge --override-channels
Intel® SDC wheel package can be installed using the steps below:
> conda create -n sdc-env python=<3.7 or 3.6> pip -c anaconda -c conda-forge
> conda activate sdc-env
> pip install --index-url https://pypi.anaconda.org/intel/label/beta/simple --extra-index-url https://pypi.anaconda.org/intel/simple --extra-index-url https://pypi.org/simple sdc
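After either installation route, it can be useful to confirm that the packages are importable from the active environment. The following is a minimal sketch using only the Python standard library; the helper name check_packages is hypothetical and not part of Intel® SDC, and the package list (sdc, numba, pandas) is an assumption about what you want to verify.

```python
import importlib.util
from importlib import import_module


def check_packages(names=("sdc", "numba", "pandas")):
    """Return {name: version string, or None if the package cannot be imported}.

    Hypothetical helper for illustration; not part of Intel SDC.
    """
    report = {}
    for name in names:
        # find_spec returns None when a top-level package is not installed.
        if importlib.util.find_spec(name) is None:
            report[name] = None
            continue
        try:
            module = import_module(name)
        except Exception:
            # Installed, but importing it failed (for example, a broken dependency).
            report[name] = None
            continue
        report[name] = getattr(module, "__version__", "unknown")
    return report


if __name__ == "__main__":
    for name, version in check_packages().items():
        print(name, "->", version if version is not None else "not importable")
```

Run the script inside the activated sdc-env environment; any package reported as not importable points to an incomplete installation.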
We use the Anaconda distribution of Python for setting up the Intel® SDC build environment.
If you do not have conda, we recommend using Miniconda3:
wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh -O miniconda.sh
chmod +x miniconda.sh
./miniconda.sh -b
export PATH=$HOME/miniconda3/bin:$PATH
Note: For maximum performance and stability, please use numba from the intel/label/beta channel.
It is possible to build Intel® SDC via conda-build or setuptools. Follow one of the cases below to install Intel® SDC and its dependencies on Linux.
Building with conda-build:
PYVER=<3.6 or 3.7>
NUMPYVER=<1.16 or 1.17>
conda create -n conda-build-env python=$PYVER conda-build
source activate conda-build-env
git clone https://github.com/IntelPython/sdc.git
cd sdc
conda build --python $PYVER --numpy $NUMPYVER --output-folder=<output_folder> -c intel/label/beta -c defaults -c intel -c conda-forge --override-channels conda-recipe
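If the conda-build step is driven from a script (for example in CI), the command line above can be assembled programmatically. This is only a sketch: the function conda_build_command and the constants are hypothetical names, and the supported versions and channel order are copied from the commands in this README rather than queried from conda.

```python
# Versions and channels as listed in this README; adjust if they change.
SUPPORTED_PYTHON = ("3.6", "3.7")
SUPPORTED_NUMPY = ("1.16", "1.17")
CHANNELS = ("intel/label/beta", "defaults", "intel", "conda-forge")


def conda_build_command(pyver, numpyver, output_folder, recipe="conda-recipe"):
    """Return the conda build invocation as an argument list (hypothetical helper)."""
    if pyver not in SUPPORTED_PYTHON:
        raise ValueError("unsupported Python version: %s" % pyver)
    if numpyver not in SUPPORTED_NUMPY:
        raise ValueError("unsupported NumPy version: %s" % numpyver)

    cmd = [
        "conda", "build",
        "--python", pyver,
        "--numpy", numpyver,
        "--output-folder=%s" % output_folder,
    ]
    for channel in CHANNELS:
        cmd += ["-c", channel]
    cmd += ["--override-channels", recipe]
    return cmd


if __name__ == "__main__":
    # Print the command instead of running it; "build-output" is a placeholder.
    print(" ".join(conda_build_command("3.7", "1.17", "build-output")))
```

The returned list can be passed to subprocess.run(cmd, check=True) from inside the cloned sdc directory with the conda-build environment active.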
Building with setuptools:
export PYVER=<3.6 or 3.7>
export NUMPYVER=<1.16 or 1.17>
conda create -n sdc-env -q -y -c intel/label/beta -c defaults -c intel -c conda-forge python=$PYVER numpy=$NUMPYVER tbb-devel tbb4py numba=0.54.1 pandas=1.3.4 pyarrow=4.0.1 gcc_linux-64 gxx_linux-64
source activate sdc-env
git clone https://github.com/IntelPython/sdc.git
cd sdc
python setup.py install
In case of issues, reinstalling in a new conda environment is recommended.
Building Intel® SDC on Windows requires Build Tools for Visual Studio 2019 (with component MSVC v140 - VS 2015 C++ build tools (v14.00)):
- Install Build Tools for Visual Studio 2019 (with component MSVC v140 - VS 2015 C++ build tools (v14.00)).
- Install Miniconda for Windows.
- Start 'Anaconda prompt'.
It is possible to build Intel® SDC via conda-build or setuptools. Follow one of the cases below to install Intel® SDC and its dependencies on Windows.
Building with conda-build:
set PYVER=<3.6 or 3.7>
set NUMPYVER=<1.16 or 1.17>
conda create -n conda-build-env -q -y python=%PYVER% conda-build conda-verify vc vs2015_runtime vs2015_win-64
conda activate conda-build-env
git clone https://github.com/IntelPython/sdc.git
cd sdc
conda build --python %PYVER% --numpy %NUMPYVER% --output-folder=<output_folder> -c intel/label/beta -c defaults -c intel -c conda-forge --override-channels conda-recipe
Building with setuptools:
set PYVER=<3.6 or 3.7>
set NUMPYVER=<1.16 or 1.17>
conda create -n sdc-env -c intel/label/beta -c defaults -c intel -c conda-forge python=%PYVER% numpy=%NUMPYVER% tbb-devel tbb4py numba=0.54.1 pandas=1.3.4 pyarrow=4.0.1
conda activate sdc-env
set INCLUDE=%INCLUDE%;%CONDA_PREFIX%\Library\include
set LIB=%LIB%;%CONDA_PREFIX%\Library\lib
git clone https://github.com/IntelPython/sdc.git
cd sdc
python setup.py install
- If the cl compiler throws the error "fatal error LNK1158: cannot run 'rc.exe'", add Windows Kits to your PATH (e.g. C:\Program Files (x86)\Windows Kits\8.0\bin\x86).
- Some errors can be mitigated by running set DISTUTILS_USE_SDK=1.
- For setting up Visual Studio, you might need to go to the registry key HKEY_LOCAL_MACHINE\SOFTWARE\WOW6432Node\Microsoft\VisualStudio\SxS\VS7 and add a string value named 14.0 whose data is C:\Program Files (x86)\Microsoft Visual Studio 14.0\.
- If the conda version or Visual Studio version in use is not the latest, building Intel® SDC can fail with a vague error about a keyword used in a file. Make sure you are using the latest versions.
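The first two items above can be checked before starting a build. The sketch below uses only the Python standard library; the function name windows_build_diagnostics is hypothetical, and it inspects just the two conditions mentioned in the list (whether rc.exe is reachable through PATH and whether DISTUTILS_USE_SDK is set to 1). It does not inspect the registry.

```python
import os
import shutil


def windows_build_diagnostics(environ=None):
    """Report two common Windows build preconditions (hypothetical helper).

    environ defaults to the current process environment; passing a dict
    makes it possible to check a candidate configuration without applying it.
    """
    env = os.environ if environ is None else environ
    search_path = env.get("PATH", "")
    return {
        # rc.exe ships with Windows Kits; the linker needs it on PATH.
        "rc.exe on PATH": shutil.which("rc.exe", path=search_path) is not None,
        # Some build errors are mitigated by setting DISTUTILS_USE_SDK=1.
        "DISTUTILS_USE_SDK=1": env.get("DISTUTILS_USE_SDK") == "1",
    }


if __name__ == "__main__":
    for check, ok in windows_build_diagnostics().items():
        print("%-22s %s" % (check, "ok" if ok else "missing"))
```

Run it from the same Anaconda prompt used for the build so that it sees the same PATH and environment variables.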
Building the Intel® SDC User's Guide documentation requires a pre-installed Intel® SDC package along with a compatible Pandas* version, as well as Sphinx* 2.2.1 or later.
The Intel® SDC documentation includes the output of Intel® SDC examples, which is pasted into the function descriptions in the API Reference.
Use pip to install Sphinx* and extensions:
pip install sphinx sphinxcontrib-programoutput
Currently the build procedure is based on make, located in the ./sdc/docs/ folder. While it is not generally required, we recommend that you clean up any previous documentation build by running:
make clean
To build HTML documentation you will need to run:
make html
The built documentation will be located in the ./sdc/docs/build/html directory. To preview the documentation, open the index.html file.
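As an alternative to locating index.html by hand, the preview step can be scripted. This is a minimal sketch with standard-library modules only; docs_index and preview_docs are hypothetical helper names, and base_dir is assumed to be the directory from which the ./sdc/docs/build/html path given above resolves.

```python
import webbrowser
from pathlib import Path


def docs_index(base_dir="."):
    """Path to the built documentation entry page (hypothetical helper)."""
    return Path(base_dir) / "sdc" / "docs" / "build" / "html" / "index.html"


def preview_docs(base_dir="."):
    """Open the built documentation in the default web browser."""
    index = docs_index(base_dir).resolve()
    if not index.is_file():
        # Nothing to preview yet; the HTML build has to be run first.
        raise FileNotFoundError("documentation not built: %s" % index)
    return webbrowser.open(index.as_uri())
```

Calling preview_docs() before the HTML build has run raises FileNotFoundError, which makes a missing build easy to spot.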
More information about building and adding documentation can be found here.
To run the unit tests, generate the test data first:
python sdc/tests/gen_test_data.py
python -m unittest
Intel® SDC follows ideas and initial code base of High-Performance Analytics Toolkit (HPAT). These academic papers describe ideas and methods behind HPAT: