Lifecycle of a pull request

As mentioned before, the Arrow project uses Git for version control and a workflow based on pull requests. That means that you contribute changes, or “patches”, to the code by creating a branch in Git, making changes to the code, pushing the changes to your origin (your fork of the Arrow repository on GitHub), and then creating a pull request against the official Arrow repository, which is saved in your setup as upstream.

You should have Git set up by now, have cloned the repository, have successfully built Arrow and have a GitHub issue to work on.

Before making changes to the code, you should create a new branch in Git.

  1. Update your fork’s main branch with upstream/main. Run this in the shell from the arrow directory.

    $ git checkout main                  # select the main Arrow branch
    $ git fetch upstream                 # check for changes in upstream/main
    $ git pull --ff-only upstream main   # save the changes from upstream/main

    Note: --ff-only applies changes only if they can be fast-forwarded without conflicts or creating merge commits.

  2. Create a new branch

    $ git checkout -b <branch-name>

    or (does the same thing)

    $ git switch --create <branch-name>

Now you can make changes to the code. To see the changes made in the library, use these two commands:

$ git status   # to see what files are changed
$ git diff     # to see code change per file
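The branch setup and inspection steps above can be tried safely in a throwaway sandbox before touching a real checkout. The following is a minimal sketch that assumes only that Git is installed; the repository names (demo-upstream.git, arrow-demo), the file notes.txt, and the branch name my-feature are hypothetical stand-ins and not part of the Arrow setup.

```shell
#!/bin/sh
# Sandbox sketch: all names below are hypothetical stand-ins.
set -e
sandbox=$(mktemp -d)

# A bare repository plays the role of the official "upstream" repository.
git init --quiet --bare "$sandbox/demo-upstream.git"

# A local repository plays the role of your clone.
git init --quiet "$sandbox/arrow-demo"
cd "$sandbox/arrow-demo"
git symbolic-ref HEAD refs/heads/main        # name the initial branch "main"
git config user.email "you@example.com"
git config user.name "Example Contributor"
git remote add upstream "$sandbox/demo-upstream.git"

# Give upstream/main some history to sync with.
echo "first line" > notes.txt
git add notes.txt
git commit --quiet -m "Initial commit"
git push --quiet upstream main

# Step 1: update the local main branch from upstream/main.
git checkout --quiet main
git fetch --quiet upstream
git pull --quiet --ff-only upstream main

# Step 2: create a new branch for the work.
git checkout --quiet -b my-feature

# Make a change and inspect it.
echo "second line" >> notes.txt
git status --short     # lists files that changed
git diff --stat        # summarizes the change per file

current_branch=$(git rev-parse --abbrev-ref HEAD)
changed_files=$(git diff --name-only)
echo "On branch: $current_branch, changed: $changed_files"
```

In a real Arrow checkout only the commands under "Step 1" and "Step 2" are needed; the rest of the script exists to build the sandbox.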

Creating a pull request

Once you are satisfied with the changes, run the tests and linters and then go ahead and commit the changes.

  1. Add and commit the changes

    $ git add <filenames>
    $ git commit -m "<message>"

    Alternatively, you can add and commit in one step, if all the files changed are to be committed (-a to add all, -m for message)

    $ git commit -am "<message>"
  2. Then push your work to your Arrow fork

    $ git push origin <branch-name>

Note

Your work is still under your watchful eye, so it’s not a problem if you see any errors you would like to correct. You can make an additional commit to correct them, and Git has lots of ways to amend, delete, revise, etc. See https://git-scm.com/docs for more information.

Until you make the pull request, nothing is visible on the Arrow repository and you are free to experiment.
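To illustrate the add, commit and push steps, together with one of the ways to revise work mentioned in the note, here is a small sandbox sketch. It assumes only that Git is installed. The bare repository fork.git stands in for your fork on GitHub, and the names work, change.txt and my-feature are hypothetical. Using git commit --amend here is just one possible way to correct a commit that has not been pushed yet.

```shell
#!/bin/sh
# Sandbox sketch: fork.git, work, change.txt and my-feature are hypothetical names.
set -e
sandbox=$(mktemp -d)

# A bare repository plays the role of your fork ("origin").
git init --quiet --bare "$sandbox/fork.git"

git init --quiet "$sandbox/work"
cd "$sandbox/work"
git symbolic-ref HEAD refs/heads/my-feature   # start directly on the work branch
git config user.email "you@example.com"
git config user.name "Example Contributor"
git remote add origin "$sandbox/fork.git"

# Step 1: add and commit the changes.
echo "draft" > change.txt
git add change.txt
git commit --quiet -m "Add change"

# Spotted a mistake before pushing: fix the file and fold the fix
# into the previous commit instead of creating a second one.
echo "final" > change.txt
git commit --quiet -a --amend -m "Add change (corrected)"

# Step 2: push the branch to the fork.
git push --quiet origin my-feature

local_tip=$(git rev-parse my-feature)
remote_tip=$(git --git-dir="$sandbox/fork.git" rev-parse refs/heads/my-feature)
commit_count=$(git rev-list --count my-feature)
echo "Commits on branch: $commit_count"
```

After the push, the branch in the stand-in fork points at the same single, amended commit as the local branch.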

If all is set, you can make the pull request!

  1. Go to https://github.com/<yourusername>/arrow where you will see a box with the name of the branch that you pushed and, next to it, a green button Compare & pull request. After clicking on it, you should add a title and description of the pull request. Underneath, you can check once again the changes you have made.

    See also

    Get more details on naming the pull request in the Arrow repository and other additional information in the Pull request and review section.

Continuous Integration (CI)

Continuous integration (CI) is an automated way to run tests and builds on different environments with the changed code made by a specific pull request. It serves as a stability check before the pull request gets merged or integrated into the main repository of the project.

Once the pull request is created, the CI will trigger checks on the code. Depending on what part of the code was changed (documentation, C++ or other languages, for example), the CI is configured to run the relevant checks.

You will see checks running at the bottom of the pull request page on GitHub. In case of an error, click on the details and research the cause of the failing build.

Figure: CI checks for changes made to the documentation.

Figure: CI checks for changes made to the Python files.

Figure: CI checks for changes made to the R files.

Besides the CI jobs that check the changes in the GitHub repository (on opening or merging of a pull request), we also use CI for nightly builds and releases of the Apache Arrow library.

Also, extended triggering jobs can be used in your pull request. For example, adding a comment with @github-actions crossbow submit python will run PyArrow tests via GitHub Actions. These are mostly used to run tests on environments that are less common and are normally not needed in first-time contributions.

To read more about this topic, visit Continuous Integration.

Reviews and merge of the pull request

When the pull request is submitted, it waits to get reviewed. One of the great things about open source is that your work can get lots of feedback and so it gets perfected. Do not be discouraged by the time it takes for the PR to get merged due to reviews and corrections. It is a process that supports quality, and with it you can learn a lot.

If it still takes too long to get merged, do not hesitate to remind maintainers in the comment section of the pull request and post reminders on the GitHub issue as well.

How to get your pull request to be reviewed?

Arrow maintainers will be notified when a pull request is created and they will get to it as soon as possible. If days pass and it has still not been reviewed, go ahead and mention the reporter of the GitHub issue or a developer that you communicated with via the mailing list or GitHub.

To put a mention in GitHub, insert @ in the comment and select the username from the list.

Commenting on a pull request

When a pull request is open in the repository, you and other developers can comment on the proposed solution.

To create a general comment, navigate to the Conversation tab of your pull request and start writing in the comment box at the bottom of the page.

You can also comment on a section of the file to point out something specific from your code. To do this, navigate to the Files changed tab and select the line you want to comment on. Hovering over the beginning of the line, you will see a blue plus icon. You can click on it, or drag it to select multiple lines and then click the icon to insert the comment.

Resolve conversation

You can resolve a conversation in a pull request review by clicking Resolve conversation in the Files changed tab. This way the conversation will be collapsed and marked as resolved, which will make it easier for you to organize what is done and what still needs to be addressed.

Updating your pull request

The procedure after getting reviews is similar to creating the initial pull request. You need to update your code locally, make a commit, update the branch to sync it with upstream and push your code to origin. It will automatically be updated in your pull request as well.

The steps for updating the pull request would then be as follows:

  1. Updating the code locally and making a commit as before:

    $ git commit -am "<message>"   # if all changed files are to be committed
  2. Important! In case there are commits from other developers on the pull request branch, or if you committed suggestions from GitHub, you need to update your code with origin before rebasing! To do this, run:

    $ git pull origin <branch-name>

    Here we merge the new commits with our local branch and we do not rebase.

  3. Now we have to update the branch to sync with the upstream main Arrow branch. This way the pull request will be able to get merged. We use rebase in this case.

    $ git pull upstream main --rebase

    This will rebase your local commits on top of the tip of upstream/main.

  4. Now you can push the changes by running:

    $ git push origin <branch-name> --force

    Note about force pushing to a branch that is being reviewed: if you want reviewers to look at your updates, please ensure you comment on the PR on GitHub, as simply force pushing does not trigger a notification in the GitHub user interface.
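The four update steps above can be rehearsed end to end in a sandbox. The sketch below assumes only that Git is installed. Two bare repositories (upstream.git and origin.git) stand in for the official repository and your fork, and all file and branch names are hypothetical. The commit pushed to upstream main in the middle of the script simulates work merged by other developers while the pull request was under review.

```shell
#!/bin/sh
# Sandbox sketch: upstream.git, origin.git, work and my-feature are hypothetical names.
set -e
sandbox=$(mktemp -d)
git init --quiet --bare "$sandbox/upstream.git"   # stand-in for the official repo
git init --quiet --bare "$sandbox/origin.git"     # stand-in for your fork
git init --quiet "$sandbox/work"
cd "$sandbox/work"
git symbolic-ref HEAD refs/heads/main
git config user.email "you@example.com"
git config user.name "Example Contributor"
git remote add upstream "$sandbox/upstream.git"
git remote add origin "$sandbox/origin.git"

# Initial history on upstream main.
echo "base" > base.txt
git add base.txt
git commit --quiet -m "Base"
git push --quiet upstream main

# The pull request branch, already pushed to the fork.
git checkout --quiet -b my-feature
echo "feature" > feature.txt
git add feature.txt
git commit --quiet -m "Feature"
git push --quiet origin my-feature

# Simulated: upstream main moves ahead while the PR is in review.
git checkout --quiet main
echo "more" > more.txt
git add more.txt
git commit --quiet -m "Upstream change"
git push --quiet upstream main
git checkout --quiet my-feature

# Step 1: update the code locally and commit.
echo "review fix" >> feature.txt
git commit --quiet -am "Address review"

# Step 2: merge (do not rebase) anything new on the fork's branch.
git pull --quiet --no-rebase origin my-feature

# Step 3: rebase the local commits on top of upstream/main.
git pull --quiet upstream main --rebase

# Step 4: history was rewritten by the rebase, so a force push is needed.
git push --quiet origin my-feature --force

branch_base=$(git merge-base my-feature main)
main_tip=$(git rev-parse main)
local_tip=$(git rev-parse my-feature)
remote_tip=$(git --git-dir="$sandbox/origin.git" rev-parse refs/heads/my-feature)
echo "Branch now starts from main tip: $main_tip"
```

After the rebase, the branch starts from the current tip of main, which is what allows the pull request to be merged cleanly. Git also offers --force-with-lease as a more cautious alternative to --force; whether to prefer it is a choice this guide does not make.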

See also

See more about updating the branch (we use rebase, not merge) and squashing local commits in Local git conventions.

If the review process is successful your pull request will get merged.

Congratulations! 🎉