Participate in Contributing
This guide documents the best way to make various types of contribution to Apache SeaTunnel, including what is required before submitting a code change.
Contributing to SeaTunnel doesn't just mean writing code. Helping new users on the mailing list, testing releases, and improving documentation are also welcome. In fact, proposing significant code changes usually requires first gaining experience and credibility within the community by helping in other ways. This is also a guide to becoming an effective contributor.
So, this guide organizes contributions in the order in which they should probably be considered by new contributors who intend to get involved long-term. Build some track record of helping others, rather than just opening pull requests.
Contributing by helping other users
A great way to contribute to SeaTunnel is to help answer user questions on the dev@seatunnel.apache.org mailing list or on StackOverflow. There are always many new SeaTunnel users; taking a few minutes to help answer a question is a very valuable community service.
Contributors should subscribe to this list and follow it in order to keep up to date on what's happening in SeaTunnel. Answering questions is an excellent and visible way to help the community, which also demonstrates your expertise.
See the Mailing Lists guide for guidelines about how to effectively participate in discussions on the mailing list, as well as forums like ISSUE.
Contributing by testing releases
SeaTunnel's release process is community-oriented, and members of the community can vote on new releases on the dev@seatunnel.apache.org mailing list. SeaTunnel users are invited to subscribe to this list to receive announcements, test their workloads on newer releases, and provide feedback on any performance or correctness issues they find.
Contributing by reviewing changes
Changes to SeaTunnel source code are proposed, reviewed and committed via GitHub pull requests (described later). Anyone can view and comment on active changes here. Reviewing others' changes is a good way to learn how the change process works and gain exposure to activity in various parts of the code. You can help by reviewing the changes and asking questions or pointing out issues -- as simple as typos or small issues of style.
Contributing documentation changes
To propose a change to release documentation (that is, docs that appear under docs), edit the Markdown source files in SeaTunnel's docs directory, whose README file shows how to build the documentation locally to test your changes. The process to propose a doc change is otherwise the same as the process for proposing code changes below.
To propose a change to the rest of the documentation (that is, docs that do not appear under docs), similarly, edit the Markdown in the website and open a pull request.
Contributing bug reports
Ideally, bug reports are accompanied by a proposed code change to fix the bug. This isn't always possible, as those who discover a bug may not have the experience to fix it. A bug may be reported by creating an ISSUE without creating a pull request (see below).
Bug reports are only useful, however, if they include enough information to understand, isolate and ideally reproduce the bug. Simply encountering an error does not mean a bug should be reported; as below, search ISSUE and search and inquire on the SeaTunnel user / dev mailing lists first. Unreproducible bugs, or simple error reports, may be closed.
It's very helpful if the bug report describes how the bug was introduced, and by which commit, so that reviewers can easily understand the bug. It also helps committers decide how far the bug fix should be backported when the pull request is merged. The pull request to fix the bug should narrow down the problem to the root cause.
A performance regression is also a kind of bug. The pull request to fix a performance regression must provide a benchmark to prove the problem is indeed fixed.
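As a rough illustration of what such a benchmark could look like, here is a minimal before/after timing sketch in plain Java. Everything in it is an assumption made for illustration: the class RegressionBenchmarkSketch, the medianNanos helper, and the two stand-in workloads are hypothetical and not part of SeaTunnel. It only shows the general shape (warm up, time several runs, compare the middle sample); for numbers meant to justify a merge, a dedicated harness such as JMH is generally a better choice.

```java
import java.util.Arrays;

/** Hypothetical timing harness for illustration; not a SeaTunnel API. */
final class RegressionBenchmarkSketch {

    // Written to by every task so the JIT is less likely to discard the work as dead code.
    static volatile int sink;

    /** Runs the task untimed `warmups` times, then returns the middle sample of `runs` timed runs (ns). */
    static long medianNanos(Runnable task, int warmups, int runs) {
        if (runs < 1) {
            throw new IllegalArgumentException("runs must be at least 1");
        }
        for (int i = 0; i < warmups; i++) {
            task.run();
        }
        long[] samples = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.run();
            samples[i] = System.nanoTime() - start;
        }
        Arrays.sort(samples);
        return samples[runs / 2];
    }

    /** Stand-in for the code path before the fix: repeated String concatenation. */
    static void before() {
        String s = "";
        for (int i = 0; i < 2_000; i++) {
            s += i;
        }
        sink = s.length();
    }

    /** Stand-in for the code path after the fix: a single StringBuilder. */
    static void after() {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 2_000; i++) {
            sb.append(i);
        }
        sink = sb.length();
    }

    public static void main(String[] args) {
        long beforeNs = medianNanos(RegressionBenchmarkSketch::before, 20, 101);
        long afterNs = medianNanos(RegressionBenchmarkSketch::after, 20, 101);
        System.out.printf("before: %d ns, after: %d ns%n", beforeNs, afterNs);
    }
}
```

Including the measured before and after numbers in the pull request description, along with the machine and JVM used, makes the claim easier for reviewers to check.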
Note that data correctness and data loss bugs are very serious. Make sure the corresponding bug report ISSUE ticket is labeled as correctness or data-loss. If the bug report doesn't get enough attention, please send an email to dev@seatunnel.apache.org to draw more attention.
It is possible to propose new features as well. These are generally not helpful unless accompanied by detail, such as a design document and/or code change. Large new contributions should be discussed on the mailing list first. Feature requests may be rejected, or closed after a long period of inactivity.
Contributing to ISSUE maintenance
Given the sheer volume of issues raised in the Apache SeaTunnel ISSUE, inevitably some issues are duplicates, or become obsolete and eventually fixed otherwise, or can't be reproduced, or could benefit from more detail, and so on. It's useful to help identify these issues and resolve them, either by advancing the discussion or even resolving the ISSUE. Most contributors are able to directly resolve ISSUEs. Use judgment in determining whether you are quite confident the issue should be resolved, although changes can be easily undone. If in doubt, just leave a comment on the ISSUE.
When resolving ISSUEs, observe a few useful conventions:
- Resolve as Fixed if there's a change you can point to that resolved the issue
  - Set Fix Version(s), if and only if the resolution is Fixed
  - Set Assignee to the person who most contributed to the resolution, which is usually the person who opened the PR that resolved the issue.
  - In case several people contributed, prefer to assign to the more 'junior', non-committer contributor
- For issues that can't be reproduced against master as reported, resolve as Cannot Reproduce
  - Fixed is reasonable too, if it's clear what other previous pull request resolved it. Link to it.
- If the issue is the same as or a subset of another issue, resolve as Duplicate
  - Make sure to link to the ISSUE it duplicates
  - Prefer to resolve the issue that has less activity or discussion as the duplicate
- If the issue seems clearly obsolete and applies to issues or components that have changed radically since it was opened, resolve as Not a Problem
- If the issue doesn't make sense (not actionable, for example, a non-SeaTunnel issue), resolve as Invalid
- If it's a coherent issue, but there is a clear indication that there is no support for or interest in acting on it, then resolve as Won't Fix
- Umbrellas are frequently marked Done if they are just container issues that don't correspond to an actionable change of their own
Preparing to contribute code changes
Choosing what to contribute
Review can take hours or days of committer time. Everyone benefits if contributors focus on changes that are useful, clear, easy to evaluate, and already pass basic checks.
Sometimes, a contributor will already have a particular new change or bug in mind. If seeking ideas, consult the list of starter tasks in ISSUE, or ask the dev@seatunnel.apache.org mailing list.
Before proceeding, contributors should evaluate if the proposed change is likely to be relevant, new and actionable:
- Is it clear that code must change? Proposing an ISSUE and pull request is appropriate only when a clear problem or change has been identified. If simply having trouble using SeaTunnel, use the mailing lists first, rather than consider filing an ISSUE or proposing a change. When in doubt, email dev@seatunnel.apache.org first about the possible change
- Search the dev@seatunnel.apache.org mailing list for related discussions. Often, the problem has been discussed before, with a resolution that doesn't require a code change, or recording what kinds of changes will not be accepted as a resolution.
- Search ISSUE for existing issues: ISSUES
- Type SeaTunnel [search terms] at the top right search box. If a logically similar issue already exists, then contribute to the discussion on the existing ISSUE and pull request first, instead of creating a new one.
- Is the scope of the change matched to the contributor's level of experience? Anyone is qualified to suggest a typo fix, but refactoring core scheduling logic requires much more understanding of SeaTunnel. Some changes require building up experience first (see above).
It's worth reemphasizing that changes to the core of SeaTunnel, or to highly complex and important modules, are more difficult to make correctly. They will be subjected to more scrutiny and held to a higher standard of review than changes to less critical code.
Error message guidelines
Exceptions thrown in SeaTunnel should be associated with standardized and actionable error messages.
Error messages should answer the following questions:
- What was the problem?
- Why did the problem happen?
- How can the problem be solved?
When writing error messages, you should:
- Use active voice
- Avoid time-based statements, such as promises of future support
- Use the present tense to describe the error and provide suggestions
- Provide concrete examples if the resolution is unclear
- Avoid sounding accusatory, judgmental, or insulting
- Be direct
- Do not use programming jargon in user-facing errors
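The guidelines above can be sketched in Java as a small helper that forces every message to state the problem, the cause, and the fix. This is only an illustrative sketch: ErrorMessageSketch and its format method are hypothetical names, not part of SeaTunnel, and the table, privilege, and exception type in the example are made up.

```java
import java.util.Objects;

/** Hypothetical helper, not part of SeaTunnel: assembles a three-part error message. */
final class ErrorMessageSketch {

    /** Joins the what / why / how parts into one user-facing message. */
    static String format(String problem, String cause, String fix) {
        Objects.requireNonNull(problem, "problem");
        Objects.requireNonNull(cause, "cause");
        Objects.requireNonNull(fix, "fix");
        return problem + " " + cause + " " + fix;
    }

    public static void main(String[] args) {
        String message = format(
                "The sink cannot write to table 'orders'.",           // What was the problem?
                "The configured user lacks the INSERT privilege.",    // Why did it happen?
                "Grant INSERT on 'orders' to that user, or configure a different user."); // How to solve it?
        try {
            // The exception type is illustrative; a real connector would use its own.
            throw new IllegalStateException(message);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Each part is written in the present tense and active voice, is direct, and names a concrete action, in line with the list above.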
Code review criteria
Before considering how to contribute code, it's useful to understand how code is reviewed, and why changes may be rejected. See the detailed guide for code reviewers from Google's Engineering Practices documentation. Simply put, changes that have many or large positives, and few negative effects or risks, are much more likely to be merged, and merged quickly. Risky and less valuable changes are very unlikely to be merged, and may be rejected outright rather than receive iterations of review.
Positives
- Fixes the root cause of a bug in existing functionality
- Adds functionality or fixes a problem needed by a large number of users
- Simple, targeted
- Easily tested; has tests
- Reduces complexity and lines of code
- Change has already been discussed and is known to committers
Negatives, risks
- Band-aids a symptom of a bug only
- Introduces complex new functionality, especially an API that needs to be supported
- Adds complexity that only helps a niche use case
- Changes a public API or semantics (rarely allowed)
- Adds large dependencies
- Changes versions of existing dependencies
- Adds a large amount of code
- Makes lots of modifications in one "big bang" change
Contributing code changes
Please review the preceding section before proposing a code change. This section documents how to do so.
When you contribute code, you affirm that the contribution is your original work and that you license the work to the project under the project's open source license. Whether or not you state this explicitly, by submitting any copyrighted material via pull request, email, or other means you agree to license the material under the project's open source license and warrant that you have the legal authority to do so.
Cloning the Apache SeaTunnel™ source code
If you are interested in working with the newest under-development code or contributing to Apache SeaTunnel development, you can check out the master branch from Git:
# Master development branch
git clone git@github.com:apache/incubator-seatunnel.git
Once you've downloaded SeaTunnel, you can find instructions for installing and building it on the documentation page.
ISSUE
Generally, SeaTunnel uses ISSUE to track logical issues, including bugs and improvements, and uses GitHub pull requests to manage the review and merge of specific code changes. That is, ISSUEs are used to describe what should be fixed or changed, and high-level approaches, and pull requests describe how to implement that change in the project's source code. For example, major design decisions are discussed in ISSUE.
- Find the existing SeaTunnel ISSUE that the change pertains to.
  - Do not create a new ISSUE if creating a change to address an existing issue in ISSUE; add to the existing discussion and work instead
  - Look for existing pull requests that are linked from the ISSUE, to understand if someone is already working on the ISSUE
- If the change is new, then it usually needs a new ISSUE. However, trivial changes, where what should change is virtually the same as how it should change, do not require an ISSUE. Example: Fix typos in Foo scaladoc
- If required, create a new ISSUE:
  - Provide a descriptive Title. "Update web UI" or "Problem in scheduler" is not sufficient. "Kafka Streaming support fails to handle empty queue in YARN cluster mode" is good.
  - Write a detailed Description. For bug reports, this should ideally include a short reproduction of the problem. For new features, it may include a design document.
  - Set required fields:
    - Issue Type. Generally, Bug, Improvement and New Feature are the only types used in SeaTunnel.
    - Priority. Set to Major or below; higher priorities are generally reserved for committers to set. The main exception is correctness or data-loss issues, which can be flagged as Blockers. ISSUE tends to unfortunately conflate "size" and "importance" in its Priority field values. Their meaning is roughly:
      - Blocker: pointless to release without this change as the release would be unusable to a large minority of users. Correctness and data loss issues should be considered Blockers for their target versions.
      - Critical: a large minority of users are missing important functionality without this, and/or a workaround is difficult
      - Major: a small minority of users are missing important functionality without this, and there is a workaround
      - Minor: a niche use case is missing some support, but it does not affect usage or is easily worked around
      - Trivial: a nice-to-have change but unlikely to be any problem in practice otherwise
    - Component
    - Affects Version. For Bugs, assign at least one version that is known to exhibit the problem or need the change
    - Label. Not widely used, except for the following:
      - correctness: a correctness issue
      - data-loss: a data loss issue
      - release-notes: the change's effects need mention in release notes. The ISSUE or pull request should include detail suitable for inclusion in release notes -- see "Docs Text" below.
      - starter: small, simple change suitable for new contributors
    - Docs Text: For issues that require an entry in the release notes, this should contain the information that the release manager should include in Release Notes. This should include a short summary of what behavior is impacted, and detail on what behavior changed. It can be provisionally filled out when the ISSUE is opened, but will likely need to be updated with final details when the issue is resolved.
  - Do not set the following fields:
    - Fix Version. This is assigned by committers only when resolved.
    - Target Version. This is assigned by committers to indicate a PR has been accepted for possible fix by the target version.
  - Do not include a patch file; pull requests are used to propose the actual change.
- If the change is a large change, consider inviting discussion on the issue at dev@seatunnel.apache.org first before proceeding to implement the change.
Pull request
- Fork the GitHub repository at incubator-seatunnel if you haven't already
- Clone your fork, create a new branch, push commits to the branch.
- Consider whether documentation or tests need to be added or updated as part of the change, and add them as needed.
  - When you add tests, make sure the tests are self-descriptive.
  - Also, consider writing the ISSUE ID in the tests when your pull request aims to fix a specific issue. In practice, it is usually added when the ISSUE type is a bug or when a PR adds a couple of tests to an existing test class. See the examples below:
    - Scala
      test("SeaTunnel-12345: a short description of the test") {
        ...
    - Java
      @Test
      public void testCase() {
        // SeaTunnel-12345: a short description of the test
        ...
The review process
- Other reviewers, including committers, may comment on the changes and suggest modifications. Changes can be added by simply pushing more commits to the same branch.
- Lively, polite, rapid technical debate is encouraged from everyone in the community. The outcome may be a rejection of the entire change.
- Keep in mind that changes to more critical parts of SeaTunnel, like its core components, will be subjected to more review, and may require more testing and proof of their correctness than other changes.
- Reviewers can indicate that a change looks suitable for merging with a comment such as: "I think this patch looks good". SeaTunnel uses the LGTM convention for indicating the strongest level of technical sign-off on a patch: simply comment with the word "LGTM". It specifically means: "I've looked at this thoroughly and take as much ownership as if I wrote the patch myself". If you comment LGTM you will be expected to help with bugs or follow-up issues on the patch. Consistent, judicious use of LGTMs is a great way to gain credibility as a reviewer with the broader community.
- Sometimes, other changes will be merged which conflict with your pull request's changes. The PR can't be merged until the conflict is resolved. This can be resolved by, for example, adding a remote to keep up with upstream changes by git remote add upstream git@github.com:apache/incubator-seatunnel.git, running git fetch upstream followed by git rebase upstream/master and resolving the conflicts by hand, then pushing the result to your branch.
- Try to be responsive to the discussion rather than let days pass between replies
Closing your pull request / ISSUE
- If a change is accepted, it will be merged and the pull request will automatically be closed, along with the associated ISSUE if any
  - Note that in the rare case you are asked to open a pull request against a branch besides master, you will actually have to close the pull request manually
  - The ISSUE will be Assigned to the primary contributor to the change as a way of giving credit. If the ISSUE isn't closed and/or Assigned promptly, comment on the ISSUE.
- If your pull request is ultimately rejected, please close it promptly
  - ... because committers can't close PRs directly
  - Pull requests will be automatically closed by an automated process at Apache after about a week if a committer has made a comment like "mind closing this PR?" This means that the committer is specifically requesting that it be closed.
- If a pull request has gotten little or no attention, consider improving the description or the change itself and ping likely reviewers again after a few days. Consider proposing a change that's easier to include, like a smaller and/or less invasive change.
  - If it has been reviewed but not taken up after weeks, after soliciting review from the most relevant reviewers, or has met with neutral reactions, the outcome may be considered a "soft no". It is helpful to withdraw and close the PR in this case.
- If a pull request is closed because it is deemed not the right approach to resolve an ISSUE, then leave the ISSUE open. However, if the review makes it clear that the issue identified in the ISSUE is not going to be resolved by any pull request (not a problem, won't fix), then also resolve the ISSUE.
If in doubt
If you're not sure about the right style for something, try to follow the style of the existing codebase. Look at whether there are other examples in the code that use your feature. Feel free to ask on the dev@seatunnel.apache.org list as well and/or ask committers.
Code of conduct
The Apache SeaTunnel project follows the Apache Software Foundation Code of Conduct. The code of conduct applies to all spaces managed by the Apache Software Foundation, including IRC, all public and private mailing lists, issue trackers, wikis, blogs, Twitter, and any other communication channel used by our communities. A code of conduct which is specific to in-person events (i.e., conferences) is codified in the published ASF anti-harassment policy.
We expect this code of conduct to be honored by everyone who participates in the Apache community formally or informally, or claims any affiliation with the Foundation, in any Foundation-related activities and especially when representing the ASF, in any role.
This code is not exhaustive or complete. It serves to distill our common understanding of a collaborative, shared environment and goals. We expect it to be followed in spirit as much as in the letter, so that it can enrich all of us and the technical communities in which we participate.
For more information and specific guidelines, refer to the Apache Software Foundation Code of Conduct.
Acknowledgement: This document refers to Spark