Installation¶

The easiest way for the majority of users to install pandas is to install it as part of the Anaconda distribution, a cross-platform distribution for data analysis and scientific computing. This is the recommended installation method for most users.

Instructions for installing from source, PyPI, various Linux distributions, or a development version are also provided.

Python version support¶

Officially supported: Python 2.7, 3.4, and 3.5.
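As a rough illustration, the interpreter you are running can be compared against the versions listed above. This is only a sketch: the names `SUPPORTED_VERSIONS` and `is_supported` are made up for this example and are not part of pandas.

```python
import sys

# (major, minor) pairs listed as officially supported in this document.
# SUPPORTED_VERSIONS and is_supported are hypothetical names for this sketch.
SUPPORTED_VERSIONS = {(2, 7), (3, 4), (3, 5)}


def is_supported(version_info=None):
    """Return True if the (major, minor) pair is on the supported list."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) in SUPPORTED_VERSIONS


if __name__ == "__main__":
    current = "%d.%d" % tuple(sys.version_info[:2])
    status = "is" if is_supported() else "is not"
    print("Python %s %s on the officially supported list" % (current, status))
```

A version missing from the list is not guaranteed to fail; it only means this document does not list it as officially supported.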

Installing pandas¶

Trying out pandas, no installation required!¶

The easiest way to start experimenting with pandas doesn’t involve installing pandas at all.

Wakari is a free service that provides a hosted IPython Notebook service in the cloud.

Simply create an account, and within a few minutes you will have access to pandas from your browser via an IPython Notebook.

Installing pandas with Anaconda¶

Installing pandas and the rest of the NumPy and SciPy stack can be a little difficult for inexperienced users.

The simplest way to install not only pandas, but Python and the most popular packages that make up the SciPy stack (IPython, NumPy, Matplotlib, ...) is with Anaconda, a cross-platform (Linux, Mac OS X, Windows) Python distribution for data analytics and scientific computing.

After running a simple installer, the user will have access to pandas and the rest of the SciPy stack without needing to install anything else, and without needing to wait for any software to be compiled.

Installation instructions for Anaconda can be found here.

A full list of the packages available as part of the Anaconda distribution can be found here.

An additional advantage of installing with Anaconda is that you don’t require admin rights to install it. It installs in the user’s home directory, which also makes it trivial to delete Anaconda at a later date (just delete that folder).

Installing pandas with Miniconda¶

The previous section outlined how to get pandas installed as part of the Anaconda distribution. However, this approach means you will install well over one hundred packages and involves downloading the installer, which is a few hundred megabytes in size.

If you want more control over which packages are installed, or have limited internet bandwidth, then installing pandas with Miniconda may be a better solution.

Conda is the package manager that the Anaconda distribution is built upon. It is both cross-platform and language agnostic (it can play a similar role to a pip and virtualenv combination).

Miniconda allows you to create a minimal self-contained Python installation, and then use the Conda command to install additional packages.

First you will need Conda to be installed; downloading and running the Miniconda installer will do this for you. The installer can be found here.

The next step is to create a new conda environment (these are analogous to a virtualenv, but they also allow you to specify precisely which Python version to install). Run the following command from a terminal window:

conda create -n name_of_my_env python

This will create a minimal environment with only Python installed in it. To put yourself inside this environment run:

source activate name_of_my_env

On Windows the command is:

activate name_of_my_env

The final step required is to install pandas. This can be done with the following command:

conda install pandas

To install a specific pandas version:

conda install pandas=0.13.1

To install other packages, IPython for example:

conda install ipython

To install the full Anaconda distribution:

conda install anaconda

If you require any packages that are available to pip but not conda, simply install pip, and use pip to install these packages:

conda install pip
pip install django

Installing from PyPI¶

pandas can be installed via pip from PyPI.

pip install pandas

This will likely require the installation of a number of dependencies, including NumPy, will require a compiler to compile required bits of code, and can take a few minutes to complete.

Installing using your Linux distribution’s package manager.¶

The commands in this table will install pandas for Python 2 from your distribution. To install pandas for Python 3 you may need to use the package python3-pandas.

Distribution	Status	Download / Repository Link	Install method
Debian	stable	official Debian repository	`sudo apt-get install python-pandas`
Debian & Ubuntu	unstable (latest packages)	NeuroDebian	`sudo apt-get install python-pandas`
Ubuntu	stable	official Ubuntu repository	`sudo apt-get install python-pandas`
Ubuntu	unstable (daily builds)	PythonXY PPA; activate by: `sudo add-apt-repository ppa:pythonxy/pythonxy-devel && sudo apt-get update`	`sudo apt-get install python-pandas`
OpenSuse	stable	OpenSuse Repository	`zypper in python-pandas`
Fedora	stable	official Fedora repository	`dnf install python-pandas`
Centos/RHEL	stable	EPEL repository	`yum install python-pandas`

Installing from source¶

See the contributing documentation for complete instructions on building from the git source tree. Further, see creating a development environment if you wish to create a pandas development environment.

Running the test suite¶

pandas is equipped with an exhaustive set of unit tests covering about 97% of the codebase as of this writing. To run it on your machine to verify that everything is working (and you have all of the dependencies, soft and hard, installed), make sure you have nose and run:

>>> import pandas as pd
>>> pd.test()
Running unit tests for pandas
pandas version 0.18.0
numpy version 1.10.2
pandas is installed in pandas
Python version 2.7.11 |Continuum Analytics, Inc.|
   (default, Dec  6 2015, 18:57:58) [GCC 4.2.1 (Apple Inc. build 5577)]
nose version 1.3.7
..................................................................S..............S.........................................................................................................................................
----------------------------------------------------------------------
Ran 9252 tests in 368.339s

OK (SKIP=117)

Dependencies¶

setuptools
NumPy: 1.7.1 or higher
python-dateutil: 1.5 or higher
pytz: Needed for time zone support
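One way to see whether the hard dependencies above are present is to try importing each one and compare its reported version against the stated minimum. The sketch below assumes the import names `dateutil` (provided by python-dateutil) and `pytz`, and assumes each module exposes a `__version__` attribute, which not every package does. The helper names are hypothetical and are not part of pandas.

```python
import importlib

# Hard dependencies named above, as (import name, minimum version or None).
# HARD_DEPENDENCIES, version_tuple and check_dependency are illustrative names.
HARD_DEPENDENCIES = [
    ("setuptools", None),
    ("numpy", "1.7.1"),
    ("dateutil", "1.5"),
    ("pytz", None),
]


def version_tuple(version):
    """Turn '1.10.2' into (1, 10, 2), stopping at the first non-numeric part."""
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def check_dependency(module_name, minimum=None):
    """Return (status, found_version) for one importable module."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return "missing", None
    found = getattr(module, "__version__", None)
    if minimum is None or found is None:
        # No minimum to enforce, or the module does not report a version.
        return "present", found
    if version_tuple(found) >= version_tuple(minimum):
        return "ok", found
    return "too old", found


if __name__ == "__main__":
    for name, minimum in HARD_DEPENDENCIES:
        status, found = check_dependency(name, minimum)
        print("%-12s %-8s found=%s minimum=%s" % (name, status, found, minimum))
```

The version comparison is deliberately simple: it ignores suffixes such as release-candidate tags, so treat a borderline result as a hint, not a verdict.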

Recommended Dependencies¶

numexpr: for accelerating certain numerical operations. numexpr uses multiple cores as well as smart chunking and caching to achieve large speedups. If installed, must be Version 2.1 or higher (excluding a buggy 2.4.4). Version 2.4.6 or higher is highly recommended.
bottleneck: for accelerating certain types of nan evaluations. bottleneck uses specialized cython routines to achieve large speedups.

Note

You are highly encouraged to install these libraries, as they provide large speedups, especially if working with large data sets.
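The numexpr version rule above (2.1 or higher, excluding 2.4.4, with 2.4.6 or higher recommended) can be written out as a small check. This is a sketch with hypothetical function names. It is not how pandas itself performs the check, and it only handles purely numeric, dot-separated version strings.

```python
def parse_version(version):
    """Convert a numeric version string such as '2.4.6' into (2, 4, 6)."""
    return tuple(int(piece) for piece in version.split("."))


def numexpr_version_ok(version):
    """True if the version is 2.1 or higher and is not the buggy 2.4.4."""
    parsed = parse_version(version)
    return parsed >= (2, 1) and parsed != (2, 4, 4)


def numexpr_version_recommended(version):
    """True if the version is 2.4.6 or higher."""
    return parse_version(version) >= (2, 4, 6)


if __name__ == "__main__":
    for candidate in ["2.0.1", "2.1", "2.4.4", "2.4.6"]:
        print(candidate,
              "usable:", numexpr_version_ok(candidate),
              "recommended:", numexpr_version_recommended(candidate))
```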

Optional Dependencies¶

Cython: Only necessary to build development version. Version 0.19.1 or higher.
SciPy: miscellaneous statistical functions
xarray: pandas-like handling for more than 2 dimensions, needed for converting Panels to xarray objects. Version 0.7.0 or higher is recommended.
PyTables: necessary for HDF5-based storage. Version 3.0.0 or higher required, Version 3.2.1 or higher highly recommended.
SQLAlchemy: for SQL database support. Version 0.8.1 or higher recommended. Besides SQLAlchemy, you also need a database specific driver. You can find an overview of supported drivers for each SQL dialect in the SQLAlchemy docs. Some common drivers are:
- psycopg2: for PostgreSQL
- pymysql: for MySQL
- SQLite (the sqlite3 module): for SQLite; this is included in Python’s standard library by default.
matplotlib: for plotting
For Excel I/O:
- xlrd/xlwt: Excel reading (xlrd) and writing (xlwt)
- openpyxl: openpyxl version 1.6.1 or higher (but lower than 2.0.0), or version 2.2 or higher, for writing .xlsx files (xlrd >= 0.9.0)
- XlsxWriter: Alternative Excel writer
Jinja2: Template engine for conditional HTML formatting.
boto: necessary for Amazon S3 access.
blosc: for msgpack compression using blosc
One of PyQt4, PySide, pygtk, xsel, or xclip: necessary to use read_clipboard(). Most package managers on Linux distributions will have xclip and/or xsel immediately available for installation.
Google’s python-gflags (https://github.com/google/python-gflags/), oauth2client, httplib2 and google-api-python-client: Needed for gbq
Backports.lzma: Only for Python 2, for writing to and/or reading from an xz compressed DataFrame in CSV; Python 3 support is built into the standard library.
One of the following combinations of libraries is needed to use the top-level read_html() function:
- BeautifulSoup4 and html5lib (Any recent version of html5lib is okay.)
- BeautifulSoup4 and lxml
- BeautifulSoup4 and html5lib and lxml
- Only lxml, although see HTML reading gotchas for reasons as to why you should probably not take this approach.
Warning
- If you install BeautifulSoup4 you must install either lxml or html5lib or both. read_html() will not work with only BeautifulSoup4 installed.
- You are highly encouraged to read HTML reading gotchas. It explains issues surrounding the installation and usage of the above three libraries.
- You may need to install an older version of BeautifulSoup4: Versions 4.2.1, 4.1.3 and 4.0.2 have been confirmed for 64 and 32-bit Ubuntu/Debian.
- Additionally, if you’re using Anaconda you should definitely read the gotchas about HTML parsing libraries.
Note
- If you’re on a system with apt-get you can do
  sudo apt-get build-dep python-lxml
  to get the necessary dependencies for installation of lxml. This will prevent further headaches down the line.
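The library combinations listed above for read_html() reduce to a small rule: lxml on its own is enough, while BeautifulSoup4 needs html5lib or lxml alongside it. The sketch below encodes that rule using the import names `bs4`, `lxml` and `html5lib`. The helper names are hypothetical. Treating lxml plus html5lib without BeautifulSoup4 as the "only lxml" case is an assumption, since that combination is not listed explicitly.

```python
import importlib


def is_importable(module_name):
    """Return True if the named module can be imported."""
    try:
        importlib.import_module(module_name)
    except ImportError:
        return False
    return True


def read_html_requirements_met(has_bs4, has_lxml, has_html5lib):
    """Apply the combinations listed above.

    lxml alone qualifies; BeautifulSoup4 qualifies only when paired with
    html5lib or lxml. Assumption: lxml plus html5lib without BeautifulSoup4
    is treated like the 'only lxml' case.
    """
    return has_lxml or (has_bs4 and has_html5lib)


if __name__ == "__main__":
    installed = {name: is_importable(name)
                 for name in ("bs4", "lxml", "html5lib")}
    print(installed)
    print("read_html() requirements met:",
          read_html_requirements_met(installed["bs4"],
                                     installed["lxml"],
                                     installed["html5lib"]))
```

Passing this check only means one of the listed combinations is present. It does not address the parser behaviour differences described in HTML reading gotchas.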