Contributing to the documentation#
Contributing to the documentation benefits everyone who uses pandas. We encourage you to help us improve the documentation, and you don’t have to be an expert on pandas to do so! In fact, there are sections of the docs that are worse off after being written by experts. If something in the docs doesn’t make sense to you, updating the relevant section after you figure it out is a great way to ensure it will help the next person. Please visit the issues page for a full list of issues that are currently open regarding the pandas documentation.
About the pandas documentation#
The documentation is written in reStructuredText, which is almost like writing in plain English, and built using Sphinx. The Sphinx Documentation has an excellent introduction to reST. Review the Sphinx docs to perform more complex changes to the documentation as well.
Some other important things to know about the docs:
The pandas documentation consists of two parts: the docstrings in the code itself and the docs in this folder, doc/. The docstrings provide a clear explanation of the usage of the individual functions, while the documentation in this folder consists of tutorial-like overviews per topic together with some other information (what’s new, installation, etc).
The docstrings follow a pandas convention, based on the Numpy Docstring Standard. Follow the pandas docstring guide for detailed instructions on how to write a correct docstring.
The tutorials make heavy use of the IPython directive Sphinx extension. This directive lets you put code in the documentation which will be run during the doc build. For example:
    .. ipython:: python

        x = 2
        x**3
will be rendered as:
    In [1]: x = 2

    In [2]: x**3
    Out[2]: 8
Almost all code examples in the docs are run (and the output saved) during the doc build. This approach means that code examples will always be up to date, but it does make the doc building a bit more complex.
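To make the idea of "code that is run during the build" concrete, here is a minimal sketch that executes a list of source lines and renders them in the In/Out style shown above. It only illustrates the concept and is not how the IPython directive is implemented; the helper name `render_session` is hypothetical.

```python
def render_session(lines):
    """Run each line in a shared namespace and render it IPython-style.

    Hypothetical helper for illustration only; the real directive is part
    of IPython and handles far more (multi-line input, errors, plots).
    """
    namespace = {}
    rendered = []
    for number, line in enumerate(lines, start=1):
        rendered.append(f"In [{number}]: {line}")
        try:
            # Expressions produce a value that is echoed as output.
            value = eval(line, namespace)
        except SyntaxError:
            # Statements such as assignments are executed without output.
            exec(line, namespace)
        else:
            if value is not None:
                rendered.append(f"Out[{number}]: {value!r}")
    return rendered


for row in render_session(["x = 2", "x**3"]):
    print(row)
```

Because the output is produced by actually running the code, it cannot drift out of date the way a hand-typed result could.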
Our API documentation files in doc/source/reference house the auto-generated documentation from the docstrings. For classes, there are a few subtleties around controlling which methods and attributes have pages auto-generated. We have two autosummary templates for classes.
- _templates/autosummary/class.rst. Use this when you want to automatically generate a page for every public method and attribute on the class. The Attributes and Methods sections will be automatically added to the class’ rendered documentation by numpydoc. See DataFrame for an example.
- _templates/autosummary/class_without_autosummary. Use this when you want to pick a subset of methods / attributes to auto-generate pages for. When using this template, you should include an Attributes and Methods section in the class docstring. See CategoricalIndex for an example.
Every method should be included in a toctree in one of the documentation files in doc/source/reference, else Sphinx will emit a warning.
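As a rough illustration of the second template's requirement, the sketch below shows a class whose docstring lists its own Attributes and Methods sections in numpydoc style. The class `Span` and the helper `section_names` are hypothetical and exist only for this example; see the CategoricalIndex docstring for the real layout pandas uses.

```python
import inspect


class Span:
    """
    Hypothetical closed span, used only to illustrate the docstring layout.

    Parameters
    ----------
    left : float
        Lower bound.
    right : float
        Upper bound.

    Attributes
    ----------
    length

    Methods
    -------
    contains
    """

    def __init__(self, left, right):
        self.left = left
        self.right = right

    @property
    def length(self):
        """Distance between the two bounds."""
        return self.right - self.left

    def contains(self, value):
        """Return True if ``value`` lies within the span."""
        return self.left <= value <= self.right


def section_names(obj):
    """List section headers (a title underlined with dashes) in a docstring."""
    lines = inspect.cleandoc(obj.__doc__).splitlines()
    return [
        title
        for title, underline in zip(lines, lines[1:])
        if title and set(underline) == {"-"} and len(underline) == len(title)
    ]


print(section_names(Span))
```

Only the names listed under Attributes and Methods would get their own pages, which is how a subset is selected.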
The utility script scripts/validate_docstrings.py can be used to get a csv summary of the API documentation. It can also validate common errors in the docstring of a specific class, function or method. The summary also compares the list of methods documented in the files in doc/source/reference (which is used to generate the API Reference page) and the actual public methods. This will identify methods documented in doc/source/reference that are not actually class methods, and existing methods that are not documented in doc/source/reference.
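The comparison described above boils down to two set differences. The following sketch shows that idea on a toy class; it is not the logic of scripts/validate_docstrings.py, and `Example`, `public_methods` and `compare_documented` are hypothetical names chosen for illustration.

```python
import inspect


class Example:
    """Hypothetical class standing in for a documented pandas object."""

    def head(self):
        return "head"

    def tail(self):
        return "tail"

    def _private(self):
        return "hidden"


def public_methods(cls):
    """Names of callables on ``cls`` that do not start with an underscore."""
    return {
        name
        for name, _member in inspect.getmembers(cls, callable)
        if not name.startswith("_")
    }


def compare_documented(cls, documented):
    """Return (documented but not on the class, public but undocumented)."""
    actual = public_methods(cls)
    documented = set(documented)
    return sorted(documented - actual), sorted(actual - documented)


# "sample" is listed in the (hypothetical) reference files but does not exist;
# "tail" exists but is not listed.
missing, undocumented = compare_documented(Example, ["head", "sample"])
print(missing, undocumented)
```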
Updating a pandas docstring#
When improving a single function or method’s docstring, it is not always necessary to build the full documentation (see next section). However, there is a script that checks a docstring (for example for the DataFrame.mean method):
    python scripts/validate_docstrings.py pandas.DataFrame.mean
This script will indicate some formatting errors if present, and will also run and test the examples included in the docstring. Check the pandas docstring guide for a detailed guide on how to format the docstring.
The examples in the docstring (‘doctests’) must be valid Python code that returns the presented output deterministically and that can be copied and run by users. This can be checked with the script above, and is also tested on Travis. A failing doctest will be a blocker for merging a PR. Check the examples section in the docstring guide for some tips and tricks to get the doctests passing.
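For a feel of what a passing doctest looks like, here is a small self-contained sketch using only the standard library `doctest` module. The function `cube` and the helper `run_docstring_examples` are hypothetical, and this is not what the validation script does internally; it only shows examples whose output is deterministic and can be copied and run as-is.

```python
import doctest


def cube(x):
    """
    Return the cube of a number.

    Parameters
    ----------
    x : int or float
        Value to raise to the third power.

    Returns
    -------
    int or float
        ``x`` raised to the third power.

    Examples
    --------
    >>> cube(2)
    8
    >>> cube(-1.5)
    -3.375
    """
    return x**3


def run_docstring_examples(func):
    """Run the examples in ``func``'s docstring; return (failed, attempted)."""
    finder = doctest.DocTestFinder()
    runner = doctest.DocTestRunner(verbose=False)
    # Provide the function itself so the examples can call it by name.
    tests = finder.find(func, name=func.__name__, globs={func.__name__: func})
    for test in tests:
        runner.run(test)
    results = runner.summarize(verbose=False)
    return results.failed, results.attempted


failed, attempted = run_docstring_examples(cube)
print(f"{attempted} examples run, {failed} failed")
```

Examples that depend on random numbers, the current time or dictionary ordering of unordered data tend to break this kind of check, which is why determinism matters.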
When opening a PR with a docstring update, it is good to post the output of the validation script in a comment on GitHub.
How to build the pandas documentation#
Requirements#
First, you need to have a development environment to be able to build pandas (see the docs on creating a development environment).
Building the documentation#
So how do you build the docs? Navigate to your local doc/ directory in the console and run:
    python make.py html
Then you can find the HTML output in the folder doc/build/html/.
The first time you build the docs, it will take quite a while because it has to run all the code examples and build all the generated docstring pages. In subsequent invocations, Sphinx will try to only build the pages that have been modified.
If you want to do a full clean build, do:
    python make.py clean
    python make.py html
Tip
If python make.py html exits with an error status, try running the command python make.py html --num-jobs=1 to identify the cause of the error.
You can tell make.py to compile only a single section of the docs, greatly reducing the turn-around time for checking your changes.
    # omit autosummary and API section
    python make.py clean
    python make.py --no-api

    # compile the docs with only a single section, relative to the "source" folder.
    # For example, compiling only this guide (doc/source/development/contributing.rst)
    python make.py clean
    python make.py --single development/contributing.rst

    # compile the reference docs for a single function
    python make.py clean
    python make.py --single pandas.DataFrame.join

    # compile whatsnew and API section (to resolve links in the whatsnew)
    python make.py clean
    python make.py --whatsnew
For comparison, a full documentation build may take 15 minutes, but a single section may take 15 seconds. Subsequent builds, which only process portions you have changed, will be faster.
The build will automatically use the number of cores available on your machine to speed up the documentation build. You can override this:
    python make.py html --num-jobs 4
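If you prefer to drive these builds from a script, the command lines shown in this section can be assembled as argument lists. The sketch below uses only the targets and flags that appear above (html, clean, --single, --no-api, --whatsnew, --num-jobs); the helper name `make_command` is hypothetical and is not part of make.py.

```python
import sys


def make_command(target=None, *, single=None, no_api=False, whatsnew=False,
                 num_jobs=None):
    """Build an argument list for make.py, meant to be run from doc/.

    Hypothetical convenience helper; it only mirrors the invocations
    shown in this guide and does not validate flag combinations.
    """
    cmd = [sys.executable, "make.py"]
    if target is not None:
        cmd.append(target)
    if single is not None:
        cmd += ["--single", single]
    if no_api:
        cmd.append("--no-api")
    if whatsnew:
        cmd.append("--whatsnew")
    if num_jobs is not None:
        cmd += ["--num-jobs", str(num_jobs)]
    return cmd


# A clean build of a single page, expressed as two commands.
steps = [
    make_command("clean"),
    make_command(single="development/contributing.rst"),
]
for step in steps:
    print(" ".join(step[1:]))
```

Each list could then be passed to `subprocess.run(step, cwd="doc", check=True)`, assuming the script is started from the repository root.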
Open the following file in a web browser to see the full documentation you just built: doc/build/html/index.html.
And you’ll have the satisfaction of seeing your new and improved documentation!
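Opening the built index page can also be done from Python with the standard library. This is a small sketch assuming it is run from the repository root; the function names `built_docs_index` and `open_built_docs` are hypothetical.

```python
from pathlib import Path
import webbrowser


def built_docs_index(repo_root="."):
    """Absolute path to the HTML entry point produced by the build."""
    return Path(repo_root).resolve() / "doc" / "build" / "html" / "index.html"


def open_built_docs(repo_root="."):
    """Open the built documentation in the default browser, if it exists."""
    index = built_docs_index(repo_root)
    if not index.is_file():
        raise FileNotFoundError(f"{index} not found; build the docs first")
    # as_uri() needs an absolute path, which resolve() above guarantees.
    return webbrowser.open(index.as_uri())


print(built_docs_index())
```

Calling `open_built_docs()` after a successful build should bring up the same page you would reach by opening the file manually.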
Building main branch documentation#
When pull requests are merged into the pandas main branch, the main parts of the documentation are also built by Travis-CI. These docs are then hosted here; see also the Continuous Integration section.
Previewing changes#
Once the pull request is submitted, GitHub Actions will automatically build the documentation. To view the built site:
1. Wait for the CI / Web and docs check to complete.
2. Click Details next to it.
3. From the Artifacts drop-down, click docs or website to download the site as a ZIP file.
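Once the ZIP file is downloaded, it needs to be unpacked before you can browse it. The sketch below does this with the standard library and looks for an index.html to open. The layout of the archive is an assumption (it is searched rather than hard-coded), and the function name `extract_site` and the file names in the usage comment are hypothetical.

```python
from pathlib import Path
import zipfile


def extract_site(zip_path, dest_dir):
    """Unpack a downloaded site ZIP and return the shallowest index.html.

    Returns None if the archive contains no index.html. The archive layout
    is not assumed; the extracted tree is searched instead.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
    # Prefer the index.html closest to the top of the extracted tree.
    candidates = sorted(dest.rglob("index.html"), key=lambda p: len(p.parts))
    return candidates[0] if candidates else None


# Usage (file names are placeholders for whatever you downloaded):
# index = extract_site("docs.zip", "pr-preview")
# print(index)
```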