Contributing to the code base#
Code standards#
Writing good code is not just about what you write. It is also about how you write it. During Continuous Integration testing, several tools will be run to check your code for stylistic errors. Generating any warnings will cause the test to fail. Thus, good style is a requirement for submitting code to pandas.
There are a couple of tools in pandas to help contributors verify their changes before contributing to the project:

./ci/code_checks.sh: a script that validates the doctests, formatting in docstrings, and imported modules. It is possible to run the checks independently by using the parameters docstrings, code, and doctests (e.g. ./ci/code_checks.sh doctests);

pre-commit, which we go into detail on in the next section.
In addition, because a lot of people use our library, it is important that we do not make sudden changes to the code that could have the potential to break a lot of user code as a result, that is, we need it to be as backwards compatible as possible to avoid mass breakages.
Pre-commit#
Additionally, Continuous Integration will run code formatting checks like ruff, isort, clang-format, and more using pre-commit hooks. Any warnings from these checks will cause the Continuous Integration to fail; therefore, it is helpful to run the check yourself before submitting code. This can be done by installing pre-commit (which should already have happened if you followed the instructions in Setting up your development environment) and then running:
pre-commit install
from the root of the pandas repository. Now all of the styling checks will be run each time you commit changes without your needing to run each one manually. In addition, using pre-commit will also allow you to more easily remain up-to-date with our code checks as they change.
Note that if needed, you can skip these checks with git commit --no-verify.
If you don't want to use pre-commit as part of your workflow, you can still use it to run its checks with one of the following:
pre-commit run --files <files you have modified>
pre-commit run --from-ref=upstream/main --to-ref=HEAD --all-files
without needing to have done pre-commit install beforehand.
Finally, we also have some slow pre-commit checks, which don't run on each commit but which do run during continuous integration. You can trigger them manually with:
pre-commit run --hook-stage manual --all-files
Note
You may want to periodically run pre-commit gc, to clean up repos which are no longer used.
Note
If you have conflicting installations of virtualenv, then you may get an error - see here.

Also, due to a bug in virtualenv, you may run into issues if you're using conda. To solve this, you can downgrade virtualenv to version 20.0.33.
Note
If you have recently merged in main from the upstream branch, some of the dependencies used by pre-commit may have changed. Make sure to update your development environment.
Optional dependencies#
Optional dependencies (e.g. matplotlib) should be imported with the private helper pandas.compat._optional.import_optional_dependency. This ensures a consistent error message when the dependency is not met.
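For illustration, a helper inside pandas might use it roughly like this (my_plot_helper is a made-up name used only for this sketch, not an actual pandas function):

from pandas.compat._optional import import_optional_dependency


def my_plot_helper():
    # Raises pandas' consistent ImportError message if matplotlib is missing;
    # otherwise returns the imported module.
    mpl = import_optional_dependency("matplotlib")
    return mpl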
All methods using an optional dependency should include a test asserting that an ImportError is raised when the optional dependency is not found. This test should be skipped if the library is present.
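A rough sketch of such a test follows; the module name "fancylib" and DataFrame.my_method are hypothetical placeholders, and the skipif condition is one possible way to skip when the library is installed:

import importlib.util

import pytest

import pandas as pd


@pytest.mark.skipif(
    importlib.util.find_spec("fancylib") is not None,
    reason="fancylib is installed",
)
def test_my_method_raises_without_fancylib():
    # The method under test is expected to surface the consistent ImportError.
    with pytest.raises(ImportError, match="fancylib"):
        pd.DataFrame([1]).my_method()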
All optional dependencies should be documented in Optional dependencies and the minimum required version should be set in the pandas.compat._optional.VERSIONS dict.
Backwards compatibility#
Please try to maintain backward compatibility. pandas has lots of users with lots of existing code, so don't break it if at all possible. If you think breakage is required, clearly state why as part of the pull request. Be careful when changing method signatures and add deprecation warnings where needed. Also, add the deprecated sphinx directive to the deprecated functions or methods.
If a function with the same arguments as the one being deprecated exists, you can use pandas.util._decorators.deprecate:
from pandas.util._decorators import deprecate

deprecate('old_func', 'new_func', '1.1.0')
Otherwise, you need to do it manually:
import warnings

from pandas.util._exceptions import find_stack_level


def old_func():
    """Summary of the function.

    .. deprecated:: 1.1.0
       Use new_func instead.
    """
    warnings.warn(
        'Use new_func instead.',
        FutureWarning,
        stacklevel=find_stack_level(),
    )
    new_func()


def new_func():
    pass
You'll also need to:

Write a new test that asserts a warning is issued when calling with the deprecated argument
Update all of pandas' existing tests and code to use the new argument
See Testing a warning for more.
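For instance, a test for the deprecation example above might look roughly like this sketch, using the tm.assert_produces_warning helper described under Testing a warning:

import pandas._testing as tm


def test_old_func_deprecated():
    # old_func from the example above should warn that it is deprecated.
    with tm.assert_produces_warning(FutureWarning, match="Use new_func instead"):
        old_func()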
Type hints#
pandas strongly encourages the use of PEP 484 style type hints. New development should contain type hints and pull requests to annotate existing code are accepted as well!
Style guidelines#
Type imports should follow the from typing import ... convention. Your code may be automatically re-written to use some modern constructs (e.g. using the built-in list instead of typing.List) by the pre-commit checks.
In some cases in the code base, classes may define class variables that shadow builtins. This causes an issue as described in Mypy 1775. The defensive solution here is to create an unambiguous alias of the builtin and use that with your annotation. For example, if you come across a definition like
class SomeClass1:
    str = None
The appropriate way to annotate this would be as follows
str_type = str


class SomeClass2:
    str: str_type = None
In some cases you may be tempted to use cast from the typing module when you know better than the analyzer. This occurs particularly when using custom inference functions. For example
from typing import Union, cast

from pandas.core.dtypes.common import is_number


def cannot_infer_bad(obj: Union[str, int, float]):

    if is_number(obj):
        ...
    else:  # Reasonably only str objects would reach this but...
        obj = cast(str, obj)  # Mypy complains without this!

    return obj.upper()
The limitation here is that while a human can reasonably understand that is_number would catch the int and float types, mypy cannot make that same inference just yet (see mypy #5206). While the above works, the use of cast is strongly discouraged. Where applicable a refactor of the code to appease static analysis is preferable
def cannot_infer_good(obj: Union[str, int, float]):

    if isinstance(obj, str):
        return obj.upper()
    else:
        ...
With custom types and inference this is not always possible so exceptions are made, but every effort should be exhausted to avoid cast before going down such paths.
pandas-specific types#
Commonly used types specific to pandas will appear in pandas._typing and you should use these where applicable. This module is private for now but ultimately this should be exposed to third party libraries who want to implement type checking against pandas.
For example, quite a few functions in pandas accept a dtype argument. This can be expressed as a string like "object", a numpy.dtype like np.int64 or even a pandas ExtensionDtype like pd.CategoricalDtype. Rather than burden the user with having to constantly annotate all of those options, this can simply be imported and reused from the pandas._typing module
from pandas._typing import Dtype


def as_type(dtype: Dtype) -> ...:
    ...
This module will ultimately house types for repeatedly used concepts like "path-like", "array-like", "numeric", etc. and can also hold aliases for commonly appearing parameters like axis. Development of this module is active so be sure to refer to the source for the most up to date list of available types.
Validating type hints#
pandas uses mypy and pyright to statically analyze the code base and type hints. After making any change you can ensure your type hints are consistent by running
pre-commit run --hook-stage manual --all-files mypy
pre-commit run --hook-stage manual --all-files pyright
pre-commit run --hook-stage manual --all-files pyright_reportGeneralTypeIssues
# the following might fail if the installed pandas version does not correspond to your local git version
pre-commit run --hook-stage manual --all-files stubtest
in your python environment.
Warning
Please be aware that the above commands will use the current python environment. If your python packages are older/newer than those installed by the pandas CI, the above commands might fail. This is often the case when the mypy or numpy versions do not match. Please see how to setup the python environment or select a recently succeeded workflow, select the "Docstring validation, typing, and other manual pre-commit hooks" job, then click on "Set up Conda" and "Environment info" to see which versions the pandas CI installs.
Testing type hints in code using pandas#
Warning
pandas is not yet a py.typed library (PEP 561)! The primary purpose of locally declaring pandas as a py.typed library is to test and improve the pandas-builtin type annotations.
Until pandas becomes a py.typed library, it is possible to easily experiment with the type annotations shipped with pandas by creating an empty file named "py.typed" in the pandas installation folder:
python -c "import pandas; import pathlib; (pathlib.Path(pandas.__path__[0]) / 'py.typed').touch()"
The existence of the py.typed file signals to type checkers that pandas is already a py.typed library. This makes type checkers aware of the type annotations shipped with pandas.
Testing with continuous integration#
The pandas test suite will run automatically on GitHub Actions continuous integration services, once your pull request is submitted. However, if you wish to run the test suite on a branch prior to submitting the pull request, then the continuous integration services need to be hooked to your GitHub repository. Instructions are here for GitHub Actions.
A pull-request will be considered for merging when you have an all 'green' build. If any tests are failing, then you will get a red 'X', where you can click through to see the individual failed tests. This is an example of a green build.

Test-driven development#
pandas is serious about testing and strongly encourages contributors to embrace test-driven development (TDD). This development process "relies on the repetition of a very short development cycle: first the developer writes an (initially failing) automated test case that defines a desired improvement or new function, then produces the minimum amount of code to pass that test." So, before actually writing any code, you should write your tests. Often the test can be taken from the original GitHub issue. However, it is always worth considering additional use cases and writing corresponding tests.
We use code coverage to help understand the amount of code which is covered by a test. We recommend striving to ensure code you add or change within pandas is covered by a test. Please see our code coverage dashboard through Codecov for more information.
Adding tests is one of the most common requests after code is pushed to pandas. Therefore, it is worth getting in the habit of writing tests ahead of time so this is never an issue.
Writing tests#
All tests should go into the tests subdirectory of the specific package. This folder contains many current examples of tests, and we suggest looking to these for inspiration.
As a general tip, you can use the search functionality in your integrated development environment (IDE) or the git grep command in a terminal to find test files in which the method is called. If you are unsure of the best location to put your test, take your best guess, but note that reviewers may request that you move the test to a different location.
To use git grep, you can run the following command in a terminal:
git grep "function_name("
This will search through all files in your repository for the text function_name(. This can be a useful way to quickly locate the function in the codebase and determine the best location to add a test for it.
Ideally, there should be one, and only one, obvious place for a test to reside. Until we reach that ideal, these are some rules of thumb for where a test should be located.
1. Does your test depend only on code in pd._libs.tslibs? This test likely belongs in one of:

   tests.tslibs (note: no file in tests.tslibs should import from any pandas modules outside of pd._libs.tslibs)
   tests.scalar
   tests.tseries.offsets
2. Does your test depend only on code in pd._libs? This test likely belongs in one of:

   tests.libs
   tests.groupby.test_libgroupby
3. Is your test for an arithmetic or comparison method? This test likely belongs in one of:

   tests.arithmetic (note: these are intended for tests that can be shared to test the behavior of DataFrame/Series/Index/ExtensionArray using the box_with_array fixture)
   tests.frame.test_arithmetic
   tests.series.test_arithmetic
4. Is your test for a reduction method (min, max, sum, prod, …)? This test likely belongs in one of:

   tests.reductions (note: these are intended for tests that can be shared to test the behavior of DataFrame/Series/Index/ExtensionArray)
   tests.frame.test_reductions
   tests.series.test_reductions
   tests.test_nanops
5. Is your test for an indexing method? This is the most difficult case for deciding where a test belongs, because there are many of these tests, and many of them test more than one method (e.g. both Series.__getitem__ and Series.loc.__getitem__).

   A) Is the test specifically testing an Index method (e.g. Index.get_loc, Index.get_indexer)? This test likely belongs in one of:

      tests.indexes.test_indexing
      tests.indexes.fooindex.test_indexing

      Within those files there should be a method-specific test class, e.g. TestGetLoc. In most cases, neither Series nor DataFrame objects should be needed in these tests.

   B) Is the test for a Series or DataFrame indexing method other than __getitem__ or __setitem__, e.g. xs, where, take, mask, lookup, or insert? This test likely belongs in one of:

      tests.frame.indexing.test_methodname
      tests.series.indexing.test_methodname

   C) Is the test for any of loc, iloc, at, or iat? This test likely belongs in one of:

      tests.indexing.test_loc
      tests.indexing.test_iloc
      tests.indexing.test_at
      tests.indexing.test_iat

      Within the appropriate file, test classes correspond to either types of indexers (e.g. TestLocBooleanMask) or major use cases (e.g. TestLocSetitemWithExpansion). See the note in section D) about tests that test multiple indexing methods.
   D) Is the test for Series.__getitem__, Series.__setitem__, DataFrame.__getitem__, or DataFrame.__setitem__? This test likely belongs in one of:

      tests.series.test_getitem
      tests.series.test_setitem
      tests.frame.test_getitem
      tests.frame.test_setitem

      In many cases such a test may test multiple similar methods, e.g.

      import pandas as pd
      import pandas._testing as tm


      def test_getitem_listlike_of_ints():
          ser = pd.Series(range(5))

          result = ser[[3, 4]]
          expected = pd.Series([3, 4], index=[3, 4])
          tm.assert_series_equal(result, expected)

          result = ser.loc[[3, 4]]
          tm.assert_series_equal(result, expected)
      In cases like this, the test location should be based on the underlying method being tested. Or in the case of a test for a bugfix, the location of the actual bug. So in this example, we know that Series.__getitem__ calls Series.loc.__getitem__, so this is really a test for loc.__getitem__. So this test belongs in tests.indexing.test_loc.

6. Is your test for a DataFrame or Series method?
   A) Is the method a plotting method? This test likely belongs in one of:

      tests.plotting

   B) Is the method an IO method? This test likely belongs in one of:

      tests.io (note: this includes to_string but excludes __repr__, which is tested in tests.frame.test_repr and tests.series.test_repr; other classes often have a test_formats file)

   C) Otherwise, this test likely belongs in one of:

      tests.series.methods.test_mymethod
      tests.frame.methods.test_mymethod

      Note: if a test can be shared between DataFrame/Series using the frame_or_series fixture, by convention it goes in the tests.frame file.

7. Is your test for an Index method, not depending on Series/DataFrame? This test likely belongs in one of:

   tests.indexes
8. Is your test for one of the pandas-provided ExtensionArrays (Categorical, DatetimeArray, TimedeltaArray, PeriodArray, IntervalArray, NumpyExtensionArray, FloatArray, BoolArray, StringArray)? This test likely belongs in one of:

   tests.arrays

9. Is your test for all ExtensionArray subclasses (the “EA Interface”)? This test likely belongs in one of:

   tests.extension
Using pytest#
Test structure#
pandas' existing test structure is mostly class-based, meaning that you will typically find tests wrapped in a class.
class TestReallyCoolFeature:
    def test_cool_feature_aspect(self):
        pass
We prefer a more functional style using the pytest framework, which offers a richer testing toolkit that facilitates testing and development. Thus, instead of writing test classes, we will write test functions like this:
def test_really_cool_feature():
    pass
Preferred pytest idioms#
Functional tests named def test_* and only take arguments that are either fixtures or parameters.

Use a bare assert for testing scalars and truth-testing.

Use tm.assert_series_equal(result, expected) and tm.assert_frame_equal(result, expected) for comparing Series and DataFrame results respectively.

Use @pytest.mark.parametrize when testing multiple cases.

Use pytest.mark.xfail when a test case is expected to fail.

Use pytest.mark.skip when a test case is never expected to pass.

Use pytest.param when a test case needs a particular mark.

Use @pytest.fixture if multiple tests can share a setup object.
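As a rough illustration of several of these idioms (the test name and values are arbitrary), a parametrized, functional-style test might look like:

import pytest

import pandas as pd
import pandas._testing as tm


@pytest.mark.parametrize("values", [[1, 2, 3], [0, -1, 5]])
def test_reverse_round_trips(values):
    # Arbitrary example: reversing a Series twice gives back the original.
    ser = pd.Series(values)

    result = ser[::-1][::-1]
    expected = pd.Series(values)

    assert len(result) == len(values)  # bare assert for scalars/truth-testing
    tm.assert_series_equal(result, expected)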
Warning
Do not use pytest.xfail (which is different than pytest.mark.xfail) since it immediately stops the test and does not check if the test will fail. If this is the behavior you desire, use pytest.skip instead.

If a test is known to fail but the manner in which it fails is not meant to be captured, use pytest.mark.xfail. It is common to use this method for a test that exhibits buggy behavior or a non-implemented feature. If the failing test has flaky behavior, use the argument strict=False. This will make it so pytest does not fail if the test happens to pass. Using strict=False is highly undesirable, please use it only as a last resort.
Prefer the decorator @pytest.mark.xfail and the argument pytest.param over usage within a test so that the test is appropriately marked during the collection phase of pytest. For xfailing a test that involves multiple parameters, a fixture, or a combination of these, it is only possible to xfail during the testing phase. To do so, use the request fixture:
def test_xfail(request):
    mark = pytest.mark.xfail(raises=TypeError, reason="Indicate why here")
    request.applymarker(mark)
xfail is not to be used for tests involving failure due to invalid user arguments. For these tests, we need to verify the correct exception type and error message is being raised, using pytest.raises instead.
Testing a warning#
Use tm.assert_produces_warning as a context manager to check that a block of code raises a warning and specify the warning message using the match argument.
with tm.assert_produces_warning(DeprecationWarning, match="the warning message"):
    pd.deprecated_function()
If a warning should specifically not happen in a block of code, pass False into the context manager.
with tm.assert_produces_warning(False):
    pd.no_warning_function()
If you have a test that would emit a warning, but you aren't actually testing the warning itself (say because it's going to be removed in the future, or because we're matching a 3rd-party library's behavior), then use pytest.mark.filterwarnings to ignore the warning.
@pytest.mark.filterwarnings("ignore:msg:category")
def test_thing(self):
    pass
Testing an exception#
Use pytest.raises as a context manager with the specific exception subclass (i.e. never use Exception) and the exception message in match.
with pytest.raises(ValueError, match="an error"):
    raise ValueError("an error")
Testing involving files#
The temp_file pytest fixture creates a temporary pathlib.Path file object for testing:
def test_something(temp_file):
    pd.DataFrame([1]).to_csv(str(temp_file))
Please reference pytest's documentation for the file retention policy.
Testing involving network connectivity#
A unit test should not access a public data set over the internet due to flakiness of network connections and lack of ownership of the server that is being connected to. To mock this interaction, use the httpserver fixture from the pytest-localserver plugin with synthetic data.
@pytest.mark.network
@pytest.mark.single_cpu
def test_network(httpserver):
    httpserver.serve_content(content="content")
    result = pd.read_html(httpserver.url)
Example#
Here is an example of a self-contained set of tests in a file pandas/tests/test_cool_feature.py that illustrates multiple features that we like to use. Please remember to add the GitHub Issue Number as a comment to a new test.
import pytest
import numpy as np
import pandas as pd
import pandas._testing as tm


@pytest.mark.parametrize('dtype', ['int8', 'int16', 'int32', 'int64'])
def test_dtypes(dtype):
    assert str(np.dtype(dtype)) == dtype


@pytest.mark.parametrize(
    'dtype', ['float32',
              pytest.param('int16', marks=pytest.mark.skip),
              pytest.param('int32', marks=pytest.mark.xfail(
                  reason='to show how it works'))])
def test_mark(dtype):
    assert str(np.dtype(dtype)) == 'float32'


@pytest.fixture
def series():
    return pd.Series([1, 2, 3])


@pytest.fixture(params=['int8', 'int16', 'int32', 'int64'])
def dtype(request):
    return request.param


def test_series(series, dtype):
    # GH <issue_number>
    result = series.astype(dtype)
    assert result.dtype == dtype

    expected = pd.Series([1, 2, 3], dtype=dtype)
    tm.assert_series_equal(result, expected)
A test run of this yields
((pandas) bash-3.2$ pytest test_cool_feature.py -v
=========================== test session starts ===========================
platform darwin -- Python 3.6.2, pytest-3.6.0, py-1.4.31, pluggy-0.4.0
collected 11 items

tester.py::test_dtypes[int8] PASSED
tester.py::test_dtypes[int16] PASSED
tester.py::test_dtypes[int32] PASSED
tester.py::test_dtypes[int64] PASSED
tester.py::test_mark[float32] PASSED
tester.py::test_mark[int16] SKIPPED
tester.py::test_mark[int32] xfail
tester.py::test_series[int8] PASSED
tester.py::test_series[int16] PASSED
tester.py::test_series[int32] PASSED
tester.py::test_series[int64] PASSED
Tests that we have parametrized are now accessible via the test name, for example we could run these with -k int8 to sub-select only those tests which match int8.
((pandas) bash-3.2$ pytest test_cool_feature.py -v -k int8
=========================== test session starts ===========================
platform darwin -- Python 3.6.2, pytest-3.6.0, py-1.4.31, pluggy-0.4.0
collected 11 items

test_cool_feature.py::test_dtypes[int8] PASSED
test_cool_feature.py::test_series[int8] PASSED
Using hypothesis#
Hypothesis is a library for property-based testing. Instead of explicitly parametrizing a test, you can describe all valid inputs and let Hypothesis try to find a failing input. Even better, no matter how many random examples it tries, Hypothesis always reports a single minimal counterexample to your assertions - often an example that you would never have thought to test.
See Getting Started with Hypothesis for more of an introduction, then refer to the Hypothesis documentation for details.
import json
from hypothesis import given, strategies as st

any_json_value = st.deferred(lambda: st.one_of(
    st.none(), st.booleans(), st.floats(allow_nan=False), st.text(),
    st.lists(any_json_value), st.dictionaries(st.text(), any_json_value)
))


@given(value=any_json_value)
def test_json_roundtrip(value):
    result = json.loads(json.dumps(value))
    assert value == result
This test shows off several useful features of Hypothesis, as well as demonstrating a good use-case: checking properties that should hold over a large or complicated domain of inputs.

To keep the pandas test suite running quickly, parametrized tests are preferred if the inputs or logic are simple, with Hypothesis tests reserved for cases with complex logic or where there are too many combinations of options or subtle interactions to test (or think of!) all of them.
Running the test suite#
The tests can then be run directly inside your Git clone (without having to install pandas) by typing:
pytest pandas
Note
If a handful of tests don't pass, it may not be an issue with your pandas installation. Some tests (e.g. some SQLAlchemy ones) require additional setup, others might start failing because a non-pinned library released a new version, and others might be flaky if run in parallel. As long as you can import pandas from your locally built version, your installation is probably fine and you can start contributing!
Often it is worth running only a subset of tests first around your changes before running the entire suite.
The easiest way to do this is with:
pytest pandas/path/to/test.py -k regex_matching_test_name
Or with one of the following constructs:
pytest pandas/tests/[test-module].py
pytest pandas/tests/[test-module].py::[TestClass]
pytest pandas/tests/[test-module].py::[TestClass]::[test_method]
Using pytest-xdist, which is included in our 'pandas-dev' environment, one can speed up local testing on multicore machines. The -n number flag can be specified when running pytest to parallelize a test run across that number of cores, or auto to utilize all the available cores on your machine.
# Utilize 4 cores
pytest -n 4 pandas

# Utilizes all available cores
pytest -n auto pandas
If you'd like to speed things along further, a more advanced use of this command would look like this:
pytest pandas -n 4 -m "not slow and not network and not db and not single_cpu" -rsxX
In addition to the multithreaded performance increase, this improves test speed by skipping some tests using the -m mark flag:
slow: any test taking long (think seconds rather than milliseconds)
network: tests requiring network connectivity
db: tests requiring a database (mysql or postgres)
single_cpu: tests that should run on a single cpu only
You might want to enable the following option if it’s relevant for you:
arm_slow: any test taking long on arm64 architecture
These markers are defined in this toml file, under [tool.pytest.ini_options] in a list called markers, in case you want to check if new ones have been created which are of interest to you.
The -r report flag will display a short summary info (see the pytest documentation). Here we are displaying the number of:
s: skipped tests
x: xfailed tests
X: xpassed tests
The summary is optional and can be removed if you don't need the added information. Using the parallelization option can significantly reduce the time it takes to locally run tests before submitting a pull request.
If you require assistance with the results, which has happened in the past, please set a seed before running the command and opening a bug report; that way we can reproduce it. Here's an example for setting a seed on Windows:
set PYTHONHASHSEED=314159265
pytest pandas -n 4 -m "not slow and not network and not db and not single_cpu" -rsxX
On Unix use
export PYTHONHASHSEED=314159265
pytest pandas -n 4 -m "not slow and not network and not db and not single_cpu" -rsxX
For more, see the pytest documentation.
Furthermore one can run pd.test() with an imported pandas to run tests similarly.
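For example, from an interactive session this could look roughly like the following; the extra_args list mirrors pytest's command-line flags, and the specific arguments shown here are only an illustration:

import pandas as pd

# Run the installed pandas' test suite, skipping the slow tests
# (the argument list is illustrative).
pd.test(extra_args=["-m", "not slow"])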
Running the performance test suite#
Performance matters and it is worth considering whether your code has introduced performance regressions. pandas is in the process of migrating to asv benchmarks to enable easy monitoring of the performance of critical pandas operations. These benchmarks are all found in the pandas/asv_bench directory, and the test results can be found here.
To use all features of asv, you will need either conda or virtualenv. For more details please check the asv installation webpage.
To install asv:
pip install git+https://github.com/airspeed-velocity/asv
If you need to run a benchmark, change your directory to asv_bench/ and run:
asv continuous -f 1.1 upstream/main HEAD
You can replace HEAD with the name of the branch you are working on. This will report benchmarks that changed by more than 10%. The command uses conda by default for creating the benchmark environments. If you want to use virtualenv instead, write:
asv continuous -f 1.1 -E virtualenv upstream/main HEAD
The -E virtualenv option should be added to all asv commands that run benchmarks. The default value is defined in asv.conf.json.
Running the full benchmark suite can be an all-day process, depending on your hardware and its resource utilization. However, usually it is sufficient to paste only a subset of the results into the pull request to show that the committed changes do not cause unexpected performance regressions. You can run specific benchmarks using the -b flag, which takes a regular expression. For example, this will only run benchmarks from a pandas/asv_bench/benchmarks/groupby.py file:
asv continuous -f 1.1 upstream/main HEAD -b ^groupby
If you want to only run a specific group of benchmarks from a file, you can do it using . as a separator. For example:
asv continuous -f 1.1 upstream/main HEAD -b groupby.GroupByMethods
will only run the GroupByMethods benchmark defined in groupby.py.
You can also run the benchmark suite using the version of pandas already installed in your current Python environment. This can be useful if you do not have virtualenv or conda, or are using the setup.py develop approach discussed above; for the in-place build you need to set PYTHONPATH, e.g. PYTHONPATH="$PWD/.." asv [remaining arguments]. You can run benchmarks using an existing Python environment by:
asv run -e -E existing
or, to use a specific Python interpreter:
asv run -e -E existing:python3.6
This will display stderr from the benchmarks, and use your local python that comes from your $PATH.
Information on how to write a benchmark and how to use asv can be found in the asv documentation.
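As a very rough sketch of the shape such a benchmark takes (the class, parameter values, and data here are invented for illustration; see the asv documentation for the authoritative format), asv times each time_* method of a plain class and re-runs setup before measuring:

import numpy as np

import pandas as pd


class GroupByAggSketch:
    # Hypothetical benchmark: asv times each ``time_*`` method, calling
    # ``setup`` beforehand for every parameter combination.
    params = [10_000, 100_000]
    param_names = ["nrows"]

    def setup(self, nrows):
        rng = np.random.default_rng(1234)
        self.df = pd.DataFrame(
            {"key": rng.integers(0, 100, nrows), "value": rng.random(nrows)}
        )

    def time_groupby_sum(self, nrows):
        self.df.groupby("key")["value"].sum()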
Documenting your code#
Changes should be reflected in the release notes located in doc/source/whatsnew/vx.y.z.rst. This file contains an ongoing change log for each release. Add an entry to this file to document your fix, enhancement or (unavoidable) breaking change. Make sure to include the GitHub issue number when adding your entry (using :issue:`1234` where 1234 is the issue/pull request number). Your entry should be written using full sentences and proper grammar.
When mentioning parts of the API, use a Sphinx :func:, :meth:, or :class: directive as appropriate. Not all public API functions and methods have a documentation page; ideally links would only be added if they resolve. You can usually find similar examples by checking the release notes for one of the previous versions.
If your code is a bugfix, add your entry to the relevant bugfix section. Avoid adding to the Other section; only in rare cases should entries go there. Being as concise as possible, the description of the bug should include how the user may encounter it and an indication of the bug itself, e.g. "produces incorrect results" or "incorrectly raises". It may be necessary to also indicate the new behavior.
If your code is an enhancement, it is most likely necessary to add usage examples to the existing documentation. This can be done following the section regarding documentation. Further, to let users know when this feature was added, the versionadded directive is used. The sphinx syntax for that is:
.. versionadded:: 2.1.0
This will put the text New in version 2.1.0 wherever you put the sphinx directive. This should also be put in the docstring when adding a new function or method (example) or a new keyword argument (example).