Contributing#
Getting Started#
If you’re interested in getting involved in the development of Modin, but aren’t sure where to start, take a look at the issues tagged Good first issue or Documentation. These are issues that would be good for getting familiar with the codebase and better understanding some of the more complex components of the architecture. There is documentation here about the architecture that you will want to review in order to get started.
Also, feel free to join the discussions on the developer mailing list.
If you want a quick guide to getting your development environment set up, please use the contributing instructions on GitHub.
Certificate of Origin#
To keep a clear track of who did what, we use a sign-off procedure on patches or pull requests that are being sent (the same requirements for using the signed-off-by process as the Linux kernel has: https://www.kernel.org/doc/html/v4.17/process/submitting-patches.html). The sign-off is a simple line at the end of the explanation for the patch, which certifies that you wrote it or otherwise have the right to pass it on as an open-source patch. The rules are pretty simple: if you can certify the below:
CERTIFICATE OF ORIGIN V 1.1#
“By making a contribution to this project, I certify that:
1.) The contribution was created in whole or in part by me and I have the right to submit it under the open source license indicated in the file; or
2.) The contribution is based upon previous work that, to the best of my knowledge, is covered under an appropriate open source license and I have the right under that license to submit that work with modifications, whether created in whole or in part by me, under the same open source license (unless I am permitted to submit under a different license), as indicated in the file; or
3.) The contribution was provided directly to me by some other person who certified (1), (2) or (3) and I have not modified it.
4.) I understand and agree that this project and the contribution are public and that a record of the contribution (including all personal information I submit with it, including my sign-off) is maintained indefinitely and may be redistributed consistent with this project or the open source license(s) involved.”
then add a sign-off line to the end of every git commit message, for example:
This is my commit message

Signed-off-by: Awesome Developer <developer@example.org>
Code without a proper signoff cannot be merged into the main branch. Note: You must use your real name (sorry, no pseudonyms or anonymous contributions).
The text can either be manually added to your commit body, or you can add either -s or --signoff to your usual git commit commands:
git commit --signoff -m "This is my commit message"
git commit -s -m "This is my commit message"
This will use the name and email from your git configuration (the repository’s .git/config, falling back to your global ~/.gitconfig). To change the global values, you can use the following commands:
git config --global user.name "Awesome Developer"
git config --global user.email "awesome.developer@example.org"
If you have authored a commit that is missing the signed-off-by line, you can amend your commits and push them to GitHub.
git commit --amend --signoff
If you’ve pushed your changes to GitHub already you’ll need to force push your branch after this with git push -f.
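As a quick local check before pushing, the presence of a sign-off line can be verified with a few lines of Python. This is only an illustrative sketch: the helper name `has_signoff` and the regular expression are assumptions made for this example and are not part of Modin or its CI.

```python
import re

# Hypothetical helper (not part of Modin). The pattern is an assumption:
# it matches a line such as
# "Signed-off-by: Awesome Developer <developer@example.org>".
SIGNOFF_RE = re.compile(
    r"^Signed-off-by: .+ <[^<>@\s]+@[^<>@\s]+>$",
    re.MULTILINE,
)


def has_signoff(commit_message: str) -> bool:
    """Return True if the commit message contains a Signed-off-by line."""
    return SIGNOFF_RE.search(commit_message) is not None


if __name__ == "__main__":
    signed = (
        "This is my commit message\n"
        "\n"
        "Signed-off-by: Awesome Developer <developer@example.org>\n"
    )
    unsigned = "This is my commit message\n"
    print(has_signoff(signed))
    print(has_signoff(unsigned))
```

The check only looks at the shape of the line. It does not confirm that the name is real or that the email belongs to the committer.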
Commit Message formatting#
We request that your first commit follow a particular format, and we require that your PR title follow the format. The format is:
FEAT-#9999: Add `DataFrame.rolling` functionality, to enable rolling window operations
The FEAT component represents the type of commit. This component of the commit message can be one of the following:
FEAT: A new feature that is added
DOCS: Documentation improvements or updates
FIX: A bugfix contribution
REFACTOR: Moving or removing code without change in functionality
TEST: Test updates or improvements
PERF: Performance enhancements
The #9999 component of the commit message should be the issue number in the Modin GitHub issue tracker: modin-project/modin#issues. This is important because it links commits to their issues.
The commit message should follow a colon (:) and be descriptive and succinct.
A Modin CI job on GitHub will enforce that your pull request title follows the format we suggest. Note that if you update the PR title, you have to push another commit (even if it’s empty) or amend your last commit for the job to pick up the new PR title. Re-running the job in GitHub Actions won’t work.
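To catch formatting mistakes before opening a pull request, the title format can be approximated with a regular expression. The sketch below is not the check that the Modin CI job runs. The function name `is_valid_title` and the exact pattern, including the single space after the colon, are assumptions based only on the example title and the list of commit types above.

```python
import re

# Commit types listed in this section.
COMMIT_TYPES = ("FEAT", "DOCS", "FIX", "REFACTOR", "TEST", "PERF")

# Assumed pattern: "<TYPE>-#<issue number>: <description>".
# The real CI job may be stricter or more lenient than this.
_TYPES = "|".join(COMMIT_TYPES)
TITLE_RE = re.compile(rf"^(?:{_TYPES})-#\d+: \S.*$")


def is_valid_title(title: str) -> bool:
    """Return True if the title matches the assumed commit/PR format."""
    return TITLE_RE.match(title) is not None


if __name__ == "__main__":
    good = (
        "FEAT-#9999: Add `DataFrame.rolling` functionality, "
        "to enable rolling window operations"
    )
    print(is_valid_title(good))
    print(is_valid_title("Add rolling window operations"))
```

A title without the type prefix or the issue number, such as the second one in the example, is rejected by this pattern.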
General Rules for committers#
Try to write a PR name as descriptive as possible.
Try to keep PRs as small as possible. One PR should be making one semantically atomic change.
Don’t merge your own PRs even if you are technically able to do it.
Development Dependencies#
We recommend doing development in a virtualenv or conda environment, though this decision is ultimately yours. You will want to run the following in order to install all of the required dependencies for running the tests and formatting the code:
conda env create --file environment-dev.yml
# or
pip install -r requirements-dev.txt
Code Formatting and Lint#
We use black for code formatting. Before you submit a pull request, please make sure that you run the following from the project root:
black modin/ asv_bench/benchmarks scripts/doc_checker.py
We also use flake8 to check linting errors. Running the following from the project root will ensure that it passes the lint checks on GitHub Actions:
flake8 modin/ asv_bench/benchmarks scripts/doc_checker.py
We test that this has been run on our GitHub Actions test suite. If you do this and find that the tests are still failing, try updating your version of black and flake8.
Adding a test#
If you find yourself fixing a bug or adding a new feature, don’t forget to add a test to the test suite to verify its correctness! More on testing and the layout of the tests can be found in our testing documentation. We ask that you follow the existing structure of the tests for ease of maintenance.
Running the tests#
To run the entire test suite, run the following from the project root:
pytest modin/pandas/test
The test suite is very large, and may take a long time if you run every test. If you’ve only modified a small amount of code, it may be sufficient to run a single test or some subset of the test suite. In order to run a specific test run:
pytest modin/pandas/test -k test_new_functionality
The entire test suite is automatically run for each pull request.
Performance measurement#
We use the Asv tool for performance tracking of various Modin functionality. The results can be viewed here: Asv dashboard.
More information can be found in the Asv readme.
Building documentation#
To build the documentation, please follow the steps below from the project root:
pip install -r docs/requirements-doc.txt
sphinx-build -b html docs docs/build
To visualize the documentation locally, run the following from the build folder:
python -m http.server <port>
# python -m http.server 1234
Then open the browser at 0.0.0.0:<port> (e.g. 0.0.0.0:1234).
Contributing a new execution framework or in-memory format#
If you are interested in contributing support for a new execution framework or in-memory format, please make sure you understand the architecture of Modin.
The best place to start the discussion for adding a new execution framework or in-memory format is the developer mailing list.
More docs on this coming soon…
