MEP11: Third-party dependencies

This MEP attempts to improve the way in which third-party dependencies in matplotlib are handled.

Status

Completed -- needs to be merged

Branches and Pull requests

#1157: Use automatic dependency resolution

#1290: Debundle pyparsing

#1261: Update six to 1.2

Abstract

One of the goals of matplotlib has been to keep it as easy to install as possible. To that end, some third-party dependencies are included in the source tree and, under certain circumstances, installed alongside matplotlib. This MEP aims to resolve some problems with that approach and bring some consistency, while continuing to make installation convenient.

At the time that was initially done, setuptools, easy_install and PyPI were not mature enough to be relied on. However, at present, we should be able to safely leverage the "modern" versions of those tools, distribute and pip.

While matplotlib has dependencies on both Python libraries and C/C++ libraries, this MEP addresses only the Python libraries so as to not confuse the issue. C libraries represent a larger and mostly orthogonal set of problems.

Detailed description

matplotlib depends on the following third-party Python libraries:

  • Numpy

  • dateutil (pure Python)

  • pytz (pure Python)

  • six -- required by dateutil (pure Python)

  • pyparsing (pure Python)

  • PIL (optional)

  • GUI frameworks: pygtk, gobject, tkinter, PySide, PyQt4, wx (all optional, but one is required for an interactive GUI)

Current behavior

When installing from source, a git checkout or pip:

  • setup.py attempts to import numpy. If this fails, the installation fails.

  • For each of dateutil, pytz and six, setup.py attempts to import them (from the top-level namespace). If that fails, matplotlib installs its local copy of the library into the top-level namespace.

  • pyparsing is always installed inside of the matplotlib namespace.
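The checks in the list above can be sketched roughly as follows. This is only an illustration of the described behavior, not the code in matplotlib's setup.py; the function names `require_importable` and `plan_bundled_installs` are hypothetical.

```python
import importlib


def require_importable(name):
    """Abort the install when `name` cannot be imported.

    This mirrors the numpy check: there is no fallback copy.
    """
    try:
        importlib.import_module(name)
    except ImportError:
        raise SystemExit("%s is required but could not be imported" % name)


def plan_bundled_installs(candidates=("dateutil", "pytz", "six")):
    """Return the libraries whose bundled copy would be installed.

    A library is selected only when importing it from the top-level
    namespace fails; libraries that are already present are left alone.
    """
    missing = []
    for name in candidates:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing
```

Note that nothing in this flow asks pip to resolve anything: the decision is made purely from what happens to be importable at install time.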

This behavior is most surprising when used with pip, because no pip dependency resolution is performed, even though it is likely to work for all of these packages.

The fact that pyparsing is installed in the matplotlib namespace has reportedly (#1290) confused some users into thinking it is a matplotlib-related module and importing it from there rather than from the top-level.

When installing using the Windows installer, dateutil, pytz and six are always installed at the top-level, potentially overwriting already installed copies of those libraries.

TODO: Describe behavior with the OS-X installer.

When installing using a package manager (Debian, RedHat, MacPorts etc.), this behavior actually does the right thing, and there are no special patches in the matplotlib packages to deal with the fact that we handle dateutil, pytz and six in this way. However, care should be taken that whatever approach we move to continues to work in that context.

Maintaining these packages in the matplotlib tree and making sure they are up-to-date is a maintenance burden. Advanced new features that may require a third-party pure Python library have a higher barrier to inclusion because of this burden.

Desired behavior

Third-party dependencies are downloaded and installed from their canonical locations by leveraging pip, distribute and PyPI.

dateutil, pytz, and pyparsing should be made into optional dependencies -- though obviously some features would fail if they aren't installed. This will allow the user to decide whether they want to bother installing a particular feature.
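One way to express required and optional dependencies is through the `install_requires` and `extras_require` keywords that setuptools and distribute accept in `setup()`. The sketch below only builds the keyword arguments. The extras names `dates` and `mathtext`, the grouping of libraries under them, and the helper `requirements_for` are assumptions made for illustration and are not taken from this MEP; no version constraints are shown because none are stated here.

```python
# Hypothetical keyword arguments for setup(); not matplotlib's actual setup.py.
setup_kwargs = dict(
    name="matplotlib",
    # Always required. Listing numpy here is an assumption of this sketch.
    install_requires=["numpy"],
    # Optional feature groups (names are illustrative only).
    extras_require={
        "dates": ["python-dateutil", "pytz"],
        "mathtext": ["pyparsing"],
    },
)


def requirements_for(extras, kwargs=setup_kwargs):
    """Return the requirement names for the base install plus `extras`."""
    names = list(kwargs["install_requires"])
    for extra in extras:
        names.extend(kwargs["extras_require"][extra])
    return names
```

six does not appear anywhere in this sketch: it would be pulled in by dateutil's own dependency declaration rather than by matplotlib. Under these assumed names, a user who wants date support would request the `dates` extra when installing.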

Implementation

For installing from source, and assuming the user has all of the C-level compilers and dependencies, this can be accomplished fairly easily using distribute and following the instructions here. The only anticipated change to the matplotlib library code will be to import pyparsing from the top-level namespace rather than from within matplotlib. Note that distribute will also allow us to remove the direct dependency on six, since it is, strictly speaking, only a direct dependency of dateutil.

For binary installations, there are a number of alternatives (here ordered from best/hardest to worst/easiest):

  1. The distutils wininst installer allows a post-install script to run. It might be possible to get this script to run pip to install the other dependencies. (See this thread for someone who has trod that ground before).

  2. Continue to ship dateutil, pytz, six and pyparsing in our installer, but use the post-install-script to install them only if they cannot already be found.

  3. Move all of these packages inside a (new) matplotlib.extern namespace so it is clear for outside users that these are external packages. Add some conditional imports in the core matplotlib codebase so dateutil (at the top-level) is tried first, and failing that matplotlib.extern.dateutil is used.

2 and 3 are undesirable as they still require maintaining copies of these packages in our tree -- and this is exacerbated by the fact that they are used less -- only in the binary installers. None of these 3 approaches address Numpy, which will still have to be manually installed using an installer.
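For reference, the conditional import described in option 3 could look roughly like the sketch below. The helper name `import_with_fallback` is hypothetical, and `matplotlib.extern` is the proposed (not existing) namespace from that option.

```python
import importlib


def import_with_fallback(name, fallback_package="matplotlib.extern"):
    """Import `name` from the top level, else from the bundled namespace.

    The top-level copy is tried first so that a user-installed version
    always takes precedence over the bundled one.
    """
    try:
        return importlib.import_module(name)
    except ImportError:
        # Fall back to the bundled copy, e.g. matplotlib.extern.dateutil.
        return importlib.import_module(fallback_package + "." + name)
```

Core code would then use something like `dateutil = import_with_fallback("dateutil")` wherever the library is needed. The bundled copies still have to be kept up to date, which is the maintenance cost noted above.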

TODO: How does this relate to the Mac OS-X installer?

Backward compatibility

At present, matplotlib can be installed from source on a machine without the third-party dependencies and without an internet connection. After this change, an internet connection (and a working PyPI) will be required to install matplotlib for the first time. (Subsequent matplotlib updates or development work will run without accessing the network.)

Alternatives

Distributing binary eggs doesn't feel like a usable solution. That requires getting easy_install installed first, and Windows users generally prefer the well-known .exe or .msi installer that works out of the box.