data-apis/array-api-tests

Test suite for Python array API standard compliance

License

MIT license


Test Suite for Array API Compliance

This is the test suite for array libraries adopting the Python Array API standard.

Keeping full coverage of the spec is an ongoing priority as the Array API evolves. Feedback and contributions are welcome!

Quickstart

Setup

Currently we pin the Array API specification repo array-api as a git submodule. This might change in the future to better support vendoring use cases (see #107), but for now be sure submodules are pulled too, e.g.

$ git submodule update --init

To run the tests, install the testing dependencies.

$ pip install -r requirements.txt

Ensure you have the array library that you want to test installed.

Specifying the array module

You need to specify the array library to test. It can be specified via the ARRAY_API_TESTS_MODULE environment variable, e.g.

$ export ARRAY_API_TESTS_MODULE=array_api_strict

To specify a runtime-defined module, define xp using the exec('...') syntax:

$ export ARRAY_API_TESTS_MODULE="exec('import quantity_array, numpy; xp = quantity_array.quantity_namespace(numpy)')"

Alternately, import/define the xp and xp_name variables in array_api_tests/__init__.py.
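As a rough illustration of the two forms the environment variable can take, the sketch below resolves either a plain module name or an exec('...') string to a namespace. The function name load_array_module is hypothetical, and the standard-library math module only stands in for a real array library; this is not the suite's own loading code.

```python
import importlib


def load_array_module(spec: str):
    """Resolve a module name or an exec('...') string to a namespace.

    Hypothetical helper, for illustration only.
    """
    prefix, suffix = "exec('", "')"
    if spec.startswith(prefix) and spec.endswith(suffix):
        scope = {}
        # Run the embedded statements and pick up the `xp` they define.
        exec(spec[len(prefix):-len(suffix)], scope)
        return scope["xp"]
    # Otherwise treat the value as an importable module name.
    return importlib.import_module(spec)


# `math` is only a stand-in for an array library here.
by_name = load_array_module("math")
by_exec = load_array_module("exec('import math; xp = math')")
print(by_name is by_exec)
```

Both calls end up with the same module object, which is the point of the exec form: it lets the namespace be built at runtime rather than imported by name.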

Specifying the API version

You can specify the API version to use when testing via the ARRAY_API_TESTS_VERSION environment variable, e.g.

$ export ARRAY_API_TESTS_VERSION="2023.12"

Currently this defaults to the array module's __array_api_version__ value, and if that attribute doesn't exist then we fall back to "2021.12".
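The order of precedence described above can be sketched as follows. The function resolve_api_version is a hypothetical name, and the SimpleNamespace objects are stand-ins for array modules; the suite's real logic may differ in detail.

```python
import os
from types import SimpleNamespace


def resolve_api_version(xp, environ=os.environ):
    """Pick the API version: env var, then module attribute, then "2021.12"."""
    explicit = environ.get("ARRAY_API_TESTS_VERSION")
    if explicit:
        return explicit
    return getattr(xp, "__array_api_version__", "2021.12")


# Stand-ins for array modules with and without the version attribute.
new_lib = SimpleNamespace(__array_api_version__="2023.12")
old_lib = SimpleNamespace()

print(resolve_api_version(new_lib, environ={}))
print(resolve_api_version(old_lib, environ={}))
print(resolve_api_version(old_lib, environ={"ARRAY_API_TESTS_VERSION": "2022.12"}))
```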

Run the suite

Simply run pytest against the array_api_tests/ folder to run the full suite.

$ pytest array_api_tests/

The suite tries to logically organise its tests. pytest allows you to only run a specific test case, which is useful when developing functions.

$ pytest array_api_tests/test_creation_functions.py::test_zeros

What the test suite covers

We are interested in array libraries conforming to the spec. Ideally this means that if a library has fully adopted the Array API, the test suite passes. We take great care to not test things which are out-of-scope, so as to not unexpectedly fail the suite.

Primary tests

Every function, including array object methods, has a respective test method¹. We use Hypothesis to generate a diverse set of valid inputs. This means array inputs will cover different dtypes and shapes, as well as contain interesting elements. These examples are generated with interesting arrangements of non-array positional arguments and keyword arguments.

Each test case will cover the following areas if relevant:

Smoking: We pass our generated examples to all functions. As these examples solely consist of valid inputs, we are testing that functions can be called using their documented inputs without raising errors.
Data type: For functions returning/modifying arrays, we assert that output arrays have the correct data types. Most functions type-promote input arrays and some functions have bespoke rules; in both cases we simulate the correct behaviour to find the expected data types.
Shape: For functions returning/modifying arrays, we assert that output arrays have the correct shape. Most functions broadcast input arrays and some functions have bespoke rules; in both cases we simulate the correct behaviour to find the expected shapes.
Values: We assert output values (including the elements of returned/modified arrays) are as expected. Except for manipulation functions or special cases, the spec allows floating-point inputs to have inexact outputs, so with such examples we only assert values are roughly as expected.
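To make "simulating the correct behaviour" concrete for the shape check, here is a minimal sketch of computing the expected result shape of two broadcast inputs. It follows the usual right-aligned broadcasting rule; the function name broadcast_shapes is chosen for illustration and is not claimed to be the suite's own helper.

```python
def broadcast_shapes(shape1, shape2):
    """Return the broadcast shape of two shapes, or raise ValueError."""
    ndim = max(len(shape1), len(shape2))
    result = []
    # Walk the dimensions from the last axis towards the first.
    for i in range(1, ndim + 1):
        d1 = shape1[-i] if i <= len(shape1) else 1
        d2 = shape2[-i] if i <= len(shape2) else 1
        if d1 == 1:
            result.append(d2)
        elif d2 == 1 or d1 == d2:
            result.append(d1)
        else:
            raise ValueError(f"shapes {shape1} and {shape2} are not broadcastable")
    return tuple(reversed(result))


print(broadcast_shapes((3, 1), (4,)))
print(broadcast_shapes((), (5,)))
```

A test can then compare the shape of a function's output against this independently computed expectation, rather than trusting the library under test to broadcast correctly.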

Additional tests

In addition to having one test case for each function, we test other properties of the functions and some miscellaneous things.

Special cases: For functions with special case behaviour, we assert that these functions return the correct values.
Signatures: We assert functions have the correct signatures.
Constants: We assert that constants behave as expected, are roughly the expected value, and that any related functions interact with them correctly.
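As an illustration of what a signature check can look like, the sketch below compares a function against a reference stub using the standard inspect module. The names missing_parameters, zeros_stub and zeros_impl are hypothetical; the real suite compares against stubs generated from the spec and checks more than parameter names.

```python
import inspect


def missing_parameters(func, stub):
    """List parameter names that the stub has but the function lacks."""
    expected = inspect.signature(stub).parameters
    actual = inspect.signature(func).parameters
    return [name for name in expected if name not in actual]


def zeros_stub(shape, *, dtype=None, device=None):
    """Reference signature (stand-in for a spec stub)."""


def zeros_impl(shape, *, dtype=None):
    """Library function under test, missing the `device` keyword."""


print(missing_parameters(zeros_impl, zeros_stub))  # ['device']
```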

Be aware that some aspects of the spec are impractical or impossible to actually test, so they are not covered in the suite.

Interpreting errors

First and foremost, note that most tests have to assume that certain aspects of the Array API have been correctly adopted, as fundamental APIs such as array creation and equalities are hard requirements for many assertions. This means a test case for one function might fail because another function has bugs or even no implementation.

This means adopting libraries at first will result in a vast number of errors due to cascading errors. Generally the nature of the spec means many granular details, such as type promotion, are likely to also fail nearly-conforming functions.

We hope to improve user experience with regard to "noisy" errors in #51. For now, if an error message involves _UndefinedStub, it means an attribute of the array library (including functions) and its objects (e.g. the array) is missing.
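The idea behind such a stub can be sketched as a placeholder object that stands in for a missing attribute and fails loudly only when it is actually used. The class MissingStub below is a hypothetical stand-in written for illustration; it is not the suite's _UndefinedStub and its error text is an assumption.

```python
class MissingStub:
    """Placeholder for an attribute the array library does not provide."""

    def __init__(self, name):
        self.name = name

    def _raise(self, *args, **kwargs):
        raise AssertionError(f"'{self.name}' object missing from namespace")

    # Calling the placeholder fails with a message naming what is missing.
    __call__ = _raise

    def __getattr__(self, attr):
        # Only reached for attributes that do not exist on the placeholder.
        self._raise()


zeros = MissingStub("zeros")
try:
    zeros((2, 3))
except AssertionError as exc:
    print(exc)
```

With this pattern, test collection can still succeed for an incomplete library, and only the tests that touch the missing attribute fail.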

The spec is the suite's source of truth. If the suite appears to assume behaviour different from the spec, or test something that is not documented, this is a bug; please report such issues to us.

Running on CI

See our existing GitHub Actions workflow for array-api-strict for an example of using the test suite on CI. Note array-api-strict is an implementation of the array API that uses NumPy under the hood.

Releases

We recommend pinning against a release tag when running on CI.

We use calendar versioning for the releases. You should expect that any version may be "breaking" compared to the previous one, in that new tests (or improvements to existing tests) may cause a previously passing library to fail.

Configuration

Data-dependent shapes

Use the --disable-data-dependent-shapes flag to skip testing functions which have data-dependent shapes.

Extensions

By default, tests for the optional Array API extensions such as linalg will be skipped if not present in the specified array module. You can purposely skip testing extension(s) via the --disable-extension option.

Skip or XFAIL test cases

Test cases you want to skip can be specified in a skips or XFAILS file. The difference between skip and XFAIL is that XFAIL tests are still run and reported as XPASS if they pass.

By default, the skips and xfails files are skips.txt and xfails.txt in the root of this repository, but any file can be specified with the --skips-file and --xfails-file command line flags.

The files should list the test ids to be skipped/xfailed. Empty lines and lines starting with # are ignored. The test id can be any substring of the test ids to skip/xfail.

# skips.txt or xfails.txt
# Line comments can be denoted with the hash symbol (#)

# Skip specific test case, e.g. when argsort() does not respect relative order
# https://github.com/numpy/numpy/issues/20778
array_api_tests/test_sorting_functions.py::test_argsort

# Skip specific test case parameter, e.g. you forgot to implement in-place adds
array_api_tests/test_add[__iadd__(x1, x2)]
array_api_tests/test_add[__iadd__(x, s)]

# Skip module, e.g. when your set functions treat NaNs as non-distinct
# https://github.com/numpy/numpy/issues/20326
array_api_tests/test_set_functions.py
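The file format and the substring matching rule described above can be sketched in a few lines. The helper names read_test_ids and is_listed are hypothetical, and the suite's own parsing may differ in detail.

```python
def read_test_ids(text):
    """Return the entries of a skips/xfails file, ignoring blanks and comments."""
    ids = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            ids.append(line)
    return ids


def is_listed(test_id, ids):
    """A test is matched if any listed entry is a substring of its id."""
    return any(entry in test_id for entry in ids)


example = """
# Skip module, e.g. when your set functions treat NaNs as non-distinct
array_api_tests/test_set_functions.py

array_api_tests/test_sorting_functions.py::test_argsort
"""

ids = read_test_ids(example)
print(ids)
print(is_listed("array_api_tests/test_set_functions.py::test_unique_all", ids))
print(is_listed("array_api_tests/test_sorting_functions.py::test_sort", ids))
```

Because matching is by substring, a single module path covers every test in that module, while a full test id targets one case.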

Here is an example GitHub Actions workflow file, where the xfails are stored in array-api-tests-xfails.txt in the base of the your-array-library repo.

If you want, you can use -o xfail_strict=True, which causes XPASS tests (XFAIL tests that actually pass) to fail the test suite. However, be aware that XFAILs can be flaky (see below), so this may not be a good idea unless you use some other mitigation of such flakiness.

If you don't want this behavior, you can remove it, or use --skips-file instead of --xfails-file.

# ./.github/workflows/array_api.yml
jobs:
  tests:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: ['3.8', '3.9', '3.10', '3.11']

    steps:
    - name: Checkout <your array library>
      uses: actions/checkout@v3
      with:
        path: your-array-library
    - name: Checkout array-api-tests
      uses: actions/checkout@v3
      with:
        repository: data-apis/array-api-tests
        submodules: 'true'
        path: array-api-tests
    - name: Run the array API test suite
      env:
        ARRAY_API_TESTS_MODULE: your.array.api.namespace
      run: |
        export PYTHONPATH="${GITHUB_WORKSPACE}/your-array-library"
        cd ${GITHUB_WORKSPACE}/array-api-tests
        pytest -v -rxXfE --ci --xfails-file ${GITHUB_WORKSPACE}/your-array-library/array-api-tests-xfails.txt array_api_tests/

Warning
XFAIL tests that use Hypothesis (basically every test in the test suite except those in test_has_names.py) can be flaky, due to the fact that Hypothesis might not always run the test with an input that causes the test to fail. There are several ways to avoid this problem:
Increase the maximum number of examples, e.g., by adding --max-examples 200 to the test command (the default is 20, see below). This will make it more likely that the failing case will be found, but it will also make the tests take longer to run.
Don't use -o xfail_strict=True. This will make it so that if an XFAIL test passes, it will alert you in the test summary but will not cause the test run to register as failed.
Use skips instead of XFAILS. The difference between XFAIL and skip is that a skipped test is never run at all, whereas an XFAIL test is always run but ignored if it fails.
Save the Hypothesis examples database persistently on CI. That way as soon as a run finds one failing example, it will always re-run future runs with that example. But note that the Hypothesis examples database may be cleared when a new version of Hypothesis or the test suite is released.
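A back-of-the-envelope calculation shows why raising the number of examples helps. The sketch assumes, purely for illustration, that each generated example independently triggers the failure with a fixed probability; Hypothesis does not sample this way, so treat the numbers as intuition rather than a prediction.

```python
def xpass_probability(p_fail, max_examples):
    """Chance that no example triggers the failure, under the stated assumption."""
    return (1 - p_fail) ** max_examples


# Hypothetical failure that 5% of generated inputs would trigger.
for n in (20, 200):
    print(n, f"{xpass_probability(0.05, n):.5f}")
```

Under this simplified model a run with 20 examples misses the failure roughly a third of the time, while a run with 200 examples almost never does.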

Max examples

The tests make heavy use of Hypothesis. You can configure how many examples are generated using the --max-examples flag, which defaults to 20. Lower values can be useful for quick checks, and larger values should result in more rigorous runs. For example, --max-examples 10_000 may find bugs where default runs don't, but will take much longer to run.

Skipping Dtypes

The test suite will automatically skip testing of inessential dtypes if they are not present on the array module namespace, but dtypes can also be skipped manually by setting the environment variable ARRAY_API_TESTS_SKIP_DTYPES to a comma separated list of dtypes to skip. For example

ARRAY_API_TESTS_SKIP_DTYPES=uint16,uint32,uint64 pytest array_api_tests/

Note that skipping certain essential dtypes such as bool and the default floating-point dtype is not supported.
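A comma separated value such as the one above might be interpreted along these lines. The function parse_skip_dtypes and the all_dtypes list are assumptions made for the example and do not reflect how the suite handles essential dtypes.

```python
def parse_skip_dtypes(value):
    """Split a comma separated list of dtype names, ignoring stray whitespace."""
    return [name.strip() for name in value.split(",") if name.strip()]


# Illustrative subset of dtype names, not the full set in the spec.
all_dtypes = ["int8", "int16", "uint16", "uint32", "uint64", "float32", "float64"]

skipped = parse_skip_dtypes("uint16,uint32,uint64")
tested = [name for name in all_dtypes if name not in skipped]
print(tested)  # ['int8', 'int16', 'float32', 'float64']
```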

Turning xfails into skips

Keeping a large number of xfails can have drastic effects on the run time. This is due to the way Hypothesis works: when it detects a failure, it does a large amount of work to simplify the failing example. If the run time of the test suite becomes a problem, you can use the ARRAY_API_TESTS_XFAIL_MARK environment variable: setting it to skip skips the entries from the xfails file instead of xfailing them. Anecdotally, we saw speed-ups by a factor of 4-5, which allowed us to use 4-5 times larger values of --max-examples within the same time budget.

Limiting the array sizes

The test suite generates random arrays as inputs to the functions it tests. "Unvectorized" tests iterate over elements of arrays, which might be slow. If the run time becomes a problem, you can limit the maximum number of elements in generated arrays by setting the environment variable ARRAY_API_TESTS_MAX_ARRAY_SIZE to the desired value. By default, it is set to 1024.
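The limit applies to the total number of elements, i.e. the product of the shape's dimensions. A minimal sketch of that check, with hypothetical helper names max_array_size and within_limit:

```python
import math
import os


def max_array_size(environ=os.environ):
    """Read the element limit from the environment, defaulting to 1024."""
    return int(environ.get("ARRAY_API_TESTS_MAX_ARRAY_SIZE", "1024"))


def within_limit(shape, limit):
    """An array fits if the product of its dimensions does not exceed the limit."""
    return math.prod(shape) <= limit


limit = max_array_size(environ={})
print(within_limit((32, 32), limit))  # True: exactly 1024 elements
print(within_limit((33, 32), limit))  # False: 1056 elements
```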

Contributing

Remain in-scope

It is important that every test only uses APIs that are part of the standard. For instance, when creating input arrays you should only use the array creation functions that are documented in the spec. The same goes for testing arrays: you'll find many utilities that parallel NumPy's own test utils in the *_helpers.py files.

Tools

Hypothesis should almost always be used for the primary tests, and can be useful elsewhere. Effort should be made so drawn arguments are labeled with their respective names. For st.data(), draws should be accompanied with the label kwarg, i.e. data.draw(<strategy>, label=<label>).

pytest.mark.parametrize should be used to run tests over multiple arguments. Parameterization should be preferred over using Hypothesis when there are a small number of possible inputs, as this allows better failure reporting. Note using both parametrize and Hypothesis for a single test method is possible and can be quite useful.

Error messages

Any assertion should be accompanied with a descriptive error message, including the relevant values. Error messages should be self-explanatory as to why a given test fails, as one should not need prior knowledge of how the test is implemented.
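For instance, an assertion helper in this spirit names the function under test and reports both the observed and the expected value. The helper assert_shape below is a hypothetical sketch written for this example, and its message wording is an assumption.

```python
def assert_shape(func_name, out_shape, expected):
    """Fail with a message that says what was checked and what was found."""
    assert out_shape == expected, (
        f"{func_name}: out.shape={out_shape}, but should be {expected}"
    )


try:
    assert_shape("zeros", (2, 3), (3, 2))
except AssertionError as exc:
    message = str(exc)
    print(message)  # zeros: out.shape=(2, 3), but should be (3, 2)
```

A reader of the failure can tell what went wrong without opening the test's source.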

Generated files

Some files in the suite are automatically generated from the spec, and should not be edited directly. To regenerate these files, run the script

./generate_stubs.py path/to/array-api

where path/to/array-api is the path to a local clone of the array-api repo. Edit generate_stubs.py to make changes to the generated files.

Release

To make a release, first make an annotated tag with the version, e.g.:

git tag -a 2022.01.01

Be sure to use the calver version number for the tag name. Don't worry too much about the tag message, e.g. just write "2022.01.01".

Versioneer will automatically set the version number of the array_api_tests package based on the git tag. Push the tag to GitHub:

git push --tags upstream 2022.01.01

Then go to the tags page on GitHub and convert the tag into a release. If you want, you can add release notes, which GitHub can generate for you.

¹ The only exceptions to having just one primary test per function are:

asarray(), which is tested by test_asarray_scalars and test_asarray_arrays in test_creation_functions.py. Testing that asarray() works with scalars (and nested sequences of scalars) is fundamental to testing that it works with arrays, as said arrays can only be generated by passing scalar sequences to asarray().
Indexing methods (__getitem__() and __setitem__()), which respectively have both a test for non-array indices and a test for boolean array indices. This is because masking is opt-in (and boolean arrays need to be generated by indexing arrays anyway).
