MEP10: Docstring consistency#
Status#
Progress
This is still an ongoing effort.
Branches and Pull requests#
Abstract#
matplotlib has a great deal of inconsistency between docstrings. This not only makes the docs harder to read, but is also harder on contributors, because they don't know which specifications to follow. There should be a clear docstring convention that is followed consistently.
The organization of the API documentation is difficult to follow. Some pages, such as pyplot and axes, are enormous and hard to browse. There should instead be short summary tables that link to detailed documentation. In addition, some of the docstrings themselves are quite long and contain redundant information.
Building the documentation takes a long time and uses a make.py script rather than a Makefile.
Detailed description#
A number of new tools and conventions have become available since matplotlib started using Sphinx that make life easier. The following is a list of proposed changes to docstrings, most of which involve these new features.
Numpy docstring format#
Numpy docstring format: This format divides the docstring into clear sections, each having different parsing rules that make the docstring easy to read both as raw text and as HTML. We could consider alternatives, or invent our own, but this is a strong choice, as it is widely used and understood in the Numpy/Scipy community.
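As an illustration of the sectioned layout, here is a minimal sketch of a function documented in this style. The function `scale` and its parameters are hypothetical and are not part of matplotlib; only the section headings (Parameters, Returns, Examples) follow the Numpy convention.

```python
def scale(values, factor=1.0):
    """
    Multiply each value by a constant factor.

    Parameters
    ----------
    values : sequence of float
        The numbers to scale.
    factor : float, default: 1.0
        The multiplier applied to every element.

    Returns
    -------
    list of float
        The scaled values, in the same order as the input.

    Examples
    --------
    >>> scale([1.0, 2.0], factor=3.0)
    [3.0, 6.0]
    """
    # Hypothetical example function; the docstring layout is the point here.
    return [v * factor for v in values]
```

Each section is underlined with dashes, so the docstring stays legible as raw text while still being parseable into structured HTML.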
Cross references#
Most of the docstrings in matplotlib use explicit "roles" when linking to other items, for example: :func:`myfunction`. As of Sphinx 0.4, there is a "default_role" that can be set to "obj", which will polymorphically link to a Python object of any type. This allows one to write `myfunction` instead. This makes docstrings much easier to read and edit as raw text. Additionally, Sphinx allows for setting a current module, so links like `~matplotlib.axes.Axes.set_xlim` could be written as `~axes.Axes.set_xlim`.
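In practice, the default role is a one-line setting in the Sphinx configuration file. The fragment below is a sketch of what that might look like in a project's conf.py; it shows only this one option and assumes nothing else about the configuration.

```python
# conf.py (fragment)

# Treat a bare `name` in backticks as a link to any Python object,
# so docstrings can say `myfunction` instead of :func:`myfunction`.
default_role = "obj"
```

The current module, by contrast, is set in the reStructuredText sources with the `currentmodule` directive rather than in conf.py.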
Overriding signatures#
Many methods in matplotlib use the *args and **kwargs syntax to dynamically handle the keyword arguments that are accepted by the function, or to delegate on to another function. This, however, is often not useful as a signature in the documentation. For this reason, many matplotlib methods include something like:
    def annotate(self, *args, **kwargs):
        """
        Create an annotation: a piece of text referring to a data
        point.

        Call signature::

          annotate(s, xy, xytext=None, xycoords='data',
                   textcoords='data', arrowprops=None, **kwargs)
        """
This can't be parsed by Sphinx, and is rather verbose in raw text. As of Sphinx 1.1, if the autodoc_docstring_signature config value is set to True, Sphinx will extract a replacement signature from the first line of the docstring, allowing this:
    def annotate(self, *args, **kwargs):
        """
        annotate(s, xy, xytext=None, xycoords='data',
                 textcoords='data', arrowprops=None, **kwargs)

        Create an annotation: a piece of text referring to a data
        point.
        """
The explicit signature will replace the actual Python one in the generated documentation.
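To make the difference concrete, the following sketch compares the signature Python reports for a *args/**kwargs function with the signature written on the first line of its docstring. The function `label_point` and the helper `leading_signature` are hypothetical; the helper only illustrates the idea and is not how Sphinx implements the extraction.

```python
import inspect

# In conf.py, the option named above would be enabled with:
#     autodoc_docstring_signature = True


def label_point(*args, **kwargs):
    """label_point(s, xy, xytext=None, **kwargs)

    Create a label: a piece of text referring to a data point.
    """


def leading_signature(func):
    """Return the first docstring line (hypothetical helper, not Sphinx code)."""
    doc = inspect.getdoc(func) or ""
    return doc.splitlines()[0] if doc else ""


# The introspected signature says nothing about the accepted arguments...
print(inspect.signature(label_point))
# ...whereas the first docstring line carries the intended call signature.
print(leading_signature(label_point))
```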
Linking rather than duplicating#
Many of the docstrings include long lists of accepted keywords by interpolating things into the docstring at load time. This makes the docstrings very long. Also, since these tables are the same across many docstrings, it inserts a lot of redundant information in the docs, which is particularly a problem in the printed version.
These tables should be moved to docstrings on functions whose only purpose is to provide help. The docstrings that refer to these tables should link to them, rather than including them verbatim.
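A minimal sketch of this arrangement follows. Both `line_properties` and `draw_line` are hypothetical names invented for illustration, and the table contents are placeholders: one function exists only to carry the keyword table, and the other links to it through the default role instead of embedding the table.

```python
def line_properties():
    """
    Keyword arguments accepted by line-drawing functions.

    This function exists only to hold documentation; calling it does nothing.

    ==========  =====================
    Keyword     Meaning
    ==========  =====================
    color       line color
    linewidth   line width in points
    ==========  =====================
    """


def draw_line(x, y, **kwargs):
    """
    Draw a line through the given points.

    Other Parameters
    ----------------
    **kwargs
        Line properties; see `line_properties` for the full table.
    """
    # Placeholder body: return the inputs so the example is runnable.
    return (list(x), list(y), dict(kwargs))
```

The table then appears once in the built documentation, and every docstring that needs it stays short in both help() output and HTML.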
autosummary extension#
The Sphinx autosummary extension should be used to generate summary tables that link to separate pages of documentation. Some classes that have many methods (e.g. Axes) should be documented with one method per page, whereas smaller classes should have all of their methods together.
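Enabling the extension is a small change to the Sphinx configuration. The fragment below is a sketch of the relevant conf.py lines only; any other extensions a project already lists would stay alongside these.

```python
# conf.py (fragment)

extensions = [
    "sphinx.ext.autodoc",      # pulls documentation from docstrings
    "sphinx.ext.autosummary",  # builds the summary tables
]

# Ask Sphinx to generate the stub pages that the summary tables link to.
autosummary_generate = True
```

The tables themselves are written in the reStructuredText sources with the `autosummary` directive, whose `:toctree:` option names the directory for the generated pages.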
Examples linking to relevant documentation#
The examples, while helpful at illustrating how to use a feature, do not link back to the relevant docstrings. This could be addressed by adding module-level docstrings to the examples, and then including that docstring in the parsed content on the example page. These docstrings could easily include references to any other part of the documentation.
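One way the docstring could be separated from the code of an example is sketched below using the standard library `ast` module. The function `split_example` and the sample source are hypothetical, and this is not the existing generator's code; it only shows that the module docstring can be recovered without importing or running the example.

```python
import ast

# A stand-in for the text of an example script (hypothetical content).
EXAMPLE_SOURCE = '''"""
Annotating a data point
=======================

Demonstrates `matplotlib.axes.Axes.annotate`.
"""
import matplotlib.pyplot as plt
'''


def split_example(source):
    """Return (module docstring, remaining code) for an example script."""
    tree = ast.parse(source)  # parses only; the example is never executed
    docstring = ast.get_docstring(tree) or ""
    body = tree.body
    # If a docstring is present, the code starts after its last line.
    if docstring and body and isinstance(body[0], ast.Expr):
        first_code_line = body[0].end_lineno  # requires Python 3.8+
    else:
        first_code_line = 0
    code = "\n".join(source.splitlines()[first_code_line:]).strip()
    return docstring, code


doc, code = split_example(EXAMPLE_SOURCE)
```

The docstring part could then be emitted as parsed reStructuredText, so its cross references become links, while the code part is emitted as a literal block.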
Documentation using help() vs. a browser#
Using Sphinx markup in the source allows for good-looking docs in your browser, but the markup also makes the raw text returned using help() look terrible. One of the aims of improving the docstrings should be to make both methods of accessing the docs look good.
Implementation#
1. The numpydoc extensions should be turned on for matplotlib. There is an important question as to whether these should be included in the matplotlib source tree, or used as a dependency. Installing Numpy is not sufficient to get the numpydoc extensions; it's a separate install procedure. In any case, to the extent that they require customization for our needs, we should endeavor to submit those changes upstream and not fork them.
2. Manually go through all of the docstrings and update them to the new format and conventions. Updating the cross references (from :func:`myfunc` to `func`) may be able to be semi-automated. This is a lot of busy work, and perhaps this labor should be divided on a per-module basis so no single developer is over-burdened by it.
3. Reorganize the API docs using autosummary and sphinx-autogen. This should hopefully have minimal impact on the narrative documentation.
4. Modify the example page generator (gen_rst.py) so that it extracts the module docstring from the example and includes it in a non-literal part of the example page.
5. Use sphinx-quickstart to generate a new-style Sphinx Makefile. The following features in the current make.py will have to be addressed in some other way:
   - Copying of some static content
   - Specifying a "small" build (only low-resolution PNG files for examples)
Steps 1, 2, and 3 are interdependent. 4 and 5 may be done independently, though 5 has some dependency on 3.
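The semi-automated cross-reference update mentioned above could start from a regular expression along the following lines. This is a sketch under stated assumptions: the function name `strip_explicit_roles` is hypothetical, the list of roles it recognizes is a guess at what the docstrings contain, and any real conversion would still need manual review of the resulting diff.

```python
import re

# Matches an explicit role such as :func:`name` or :meth:`~a.b.c`.
# The lookbehind skips occurrences that directly follow a backtick, which is
# how the role appears when it is quoted inside a double-backtick literal.
# The set of role names is an assumption; extend it as needed.
_ROLE = re.compile(
    r"(?<!`):(?:py:)?(?:func|meth|class|attr|mod|obj|exc|data):`([^`]+)`"
)


def strip_explicit_roles(text):
    """Rewrite :role:`target` as `target`, relying on default_role = 'obj'."""
    return _ROLE.sub(r"`\1`", text)


before = "See :func:`myfunc` and :meth:`~matplotlib.axes.Axes.set_xlim`."
after = strip_explicit_roles(before)
```

Running such a function over one module at a time would fit the per-module division of labor suggested in the list.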
Backward compatibility#
As this mainly involves docstrings, there should be minimal impact on backward compatibility.
Alternatives#
None yet discussed.