Distributed version control


From Wikipedia, the free encyclopedia

Software engineering tool

Figure: Initializing a Git repository. Git is one of the most widely used distributed version control systems.

In software development, distributed version control (also known as distributed revision control) is a form of version control in which the complete codebase, including its full history, is mirrored on every developer's computer.^[1] Compared to centralized version control, this enables automatic management of branching and merging, speeds up most operations (except pushing and fetching), improves the ability to work offline, and does not rely on a single location for backups.^[1]^[2]^[3] Git, the world's most popular version control system,^[4] is a distributed version control system.

Distributed vs. centralized


Distributed version control systems (DVCS) use a peer-to-peer approach to version control, as opposed to the client–server approach of centralized systems. Distributed revision control synchronizes repositories by transferring patches from peer to peer. There is no single central version of the codebase; instead, each user has a working copy and the full change history.
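The peer-to-peer idea can be illustrated with a minimal Python sketch. It is a toy model under stated assumptions, not how any particular system is implemented: the names `Commit`, `Repository`, `clone` and `pull` are hypothetical, commit ids are plain strings rather than content hashes, history is strictly linear, and a pull only fast-forwards.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Commit:
    """One recorded change; `parent` is the id of the preceding commit, if any."""
    id: str
    parent: Optional[str]
    message: str


@dataclass
class Repository:
    """A peer's full local copy: every commit, not just the latest snapshot."""
    owner: str
    commits: Dict[str, Commit] = field(default_factory=dict)
    head: Optional[str] = None

    def commit(self, commit_id: str, message: str) -> Commit:
        # Purely local operation: no server or network is involved.
        new = Commit(commit_id, self.head, message)
        self.commits[commit_id] = new
        self.head = commit_id
        return new

    def clone(self, new_owner: str) -> "Repository":
        # Cloning copies the entire history, so the new peer is self-sufficient.
        return Repository(new_owner, dict(self.commits), self.head)

    def _descends_from(self, descendant: str, ancestor: Optional[str]) -> bool:
        # Walk parent links from `descendant` back toward the root commit.
        current: Optional[str] = descendant
        while current is not None:
            if current == ancestor:
                return True
            current = self.commits[current].parent
        return ancestor is None  # an empty history precedes everything

    def pull(self, peer: "Repository") -> List[str]:
        """Copy commits this repository lacks from `peer`; return their ids."""
        missing = [cid for cid in peer.commits if cid not in self.commits]
        for cid in missing:
            self.commits[cid] = peer.commits[cid]
        # Fast-forward only; diverged histories would need a real merge.
        if peer.head is not None and self._descends_from(peer.head, self.head):
            self.head = peer.head
        return missing


alice = Repository("alice")
alice.commit("c1", "Initial import")
bob = alice.clone("bob")        # bob now holds the full history
bob.commit("c2", "Add parser")  # committed in bob's copy only, no network needed
received = alice.pull(bob)      # the change travels directly from peer to peer
print(received, alice.head)
```

Neither `alice` nor `bob` is a server in this sketch: either side can commit on its own, and either side can pull from the other.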

Advantages of DVCS (compared with centralized systems) include:

Allows users to work productively when not connected to a network.
Common operations (such as commits, viewing history, and reverting changes) are faster for DVCS, because there is no need to communicate with a central server.^[5] With DVCS, communication is necessary only when sharing changes with other peers.
Allows private work, so users can use their changes even for early drafts they do not want to publish.^{[citation needed]}
Working copies effectively function as remote backups, which avoids relying on one physical machine as a single point of failure.^[5]
Allows various development models to be used, such as using development branches or a Commander/Lieutenant model.^[6]
Permits centralized control of the "release version" of the project.^{[citation needed]}
On FOSS projects, it is much easier to create a project fork from a project that is stalled because of leadership conflicts or design disagreements.

Disadvantages of DVCS (compared with centralized systems) include:

Initial checkout of a repository is slower than checkout in a centralized version control system, because all branches and revision history are copied to the local machine by default.
Lacks the locking mechanisms that are part of most centralized VCSs and that still play an important role for non-mergeable binary files, such as graphic assets, or for single-file binary or XML packages that are too complex to merge (e.g. office documents, PowerBI files, SQL Server Data Tools BI packages, etc.).^{[citation needed]}
Additional storage is required for every user to have a complete copy of the codebase history.^[7]
Increased exposure of the code base since every participant has a locally vulnerable copy.^{[citation needed]}

Some originally centralized systems now offer some distributed features. Team Foundation Server and Visual Studio Team Services now host both centralized and distributed version control repositories by hosting Git.

Similarly, some distributed systems now offer features that mitigate the issues of checkout times and storage costs, such as the Virtual File System for Git developed by Microsoft to work with very large codebases,^[8] which exposes a virtual file system that downloads files to local storage only as they are needed.
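The on-demand idea can be sketched in a few lines of Python. This is only an illustration of lazy fetching in general and makes no claim about how the Virtual File System for Git is built; the class `LazyWorkingTree`, its methods, and the `fetch` callback are hypothetical names introduced for the example.

```python
from typing import Callable, Dict, Iterable, List


class LazyWorkingTree:
    """Lists every path in the repository but downloads content on first read."""

    def __init__(self, paths: Iterable[str], fetch: Callable[[str], bytes]):
        self._paths = set(paths)   # known up front, cheap to store
        self._fetch = fetch        # hypothetical callback that contacts the server
        self._local: Dict[str, bytes] = {}

    def listdir(self) -> List[str]:
        # Every file appears to be present, whether downloaded or not.
        return sorted(self._paths)

    def read(self, path: str) -> bytes:
        if path not in self._paths:
            raise FileNotFoundError(path)
        if path not in self._local:
            # First access: fetch the content and keep it in local storage.
            self._local[path] = self._fetch(path)
        return self._local[path]

    def downloaded(self) -> List[str]:
        return sorted(self._local)


# A dictionary stands in for the remote server in this sketch.
remote = {"README": b"hello", "src/main.c": b"int main(void) { return 0; }"}
tree = LazyWorkingTree(remote.keys(), lambda path: remote[path])

print(tree.listdir())      # both paths are visible
print(tree.downloaded())   # nothing has been fetched yet
tree.read("README")
print(tree.downloaded())   # only the file that was actually read
```

The trade-off is that reading a file for the first time needs a network connection again, which gives up some of the offline capability described earlier in exchange for faster checkout and lower storage use.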

Work model


A distributed model is generally better suited to large projects with partly independent developers, such as the Linux kernel. It allows developers to work in independent branches and make changes that can later be committed, audited and merged (or rejected)^[9] by others. This model is more flexible and permits the creation and adaptation of custom source code branches (forks) whose purpose might differ from that of the original project. In addition, developers can clone an existing code repository and work on it locally, with changes tracked and committed to the local repository,^[10] which allows changes to be tracked more closely before they are merged into the master branch of the repository. Such an approach enables developers to work in local and disconnected branches, making it more convenient for larger distributed teams.
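Before two independently developed branches can be merged, a tool needs to know where they diverged. The following Python sketch finds that point for the simplest possible case. It assumes each history is a mapping from commit id to a single parent id (so no merge commits), and the function names `ancestors` and `merge_base` are chosen for this example only.

```python
from typing import Dict, List, Optional

# A history maps each commit id to its parent id (None for the root commit).
History = Dict[str, Optional[str]]


def ancestors(history: History, head: str) -> List[str]:
    """Return `head` and all of its ancestors, newest first."""
    chain: List[str] = []
    current: Optional[str] = head
    while current is not None:
        chain.append(current)
        current = history[current]
    return chain


def merge_base(a: History, head_a: str, b: History, head_b: str) -> Optional[str]:
    """Return the most recent commit shared by both branches, if any."""
    seen = set(ancestors(a, head_a))
    for commit_id in ancestors(b, head_b):
        if commit_id in seen:
            return commit_id
    return None


# Two developers cloned the project at c2 and then worked independently.
upstream = {"c1": None, "c2": "c1", "c3": "c2"}
fork = {"c1": None, "c2": "c1", "f1": "c2", "f2": "f1"}

base = merge_base(upstream, "c3", fork, "f2")
print(base)
```

Everything after the shared commit on either side is what a later merge has to reconcile, and everything up to it is already common to both copies.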

Central and branch repositories


In a truly distributed project, such as Linux, every contributor maintains their own version of the project, with different contributors hosting their own respective versions and pulling in changes from other users as needed, resulting in a general consensus emerging from multiple different nodes. This also makes the process of "forking" easy, as all that is required is for one contributor to stop accepting pull requests from other contributors and let the codebases gradually grow apart.

This arrangement, however, can be difficult to maintain, resulting in many projects choosing to shift to a paradigm in which one contributor's repository is the universal "upstream", from which changes are almost always pulled. Under this paradigm, development is somewhat recentralized, as every project now has a central repository that is informally considered the official repository, managed by the project maintainers collectively. While distributed version control systems make it easy for new developers to "clone" a copy of any other contributor's repository, in a central model, new developers always clone the central repository to create identical local copies of the code base. Under this system, code changes in the central repository are periodically synchronized with the local repository, and once the development is done, the change should be integrated into the central repository as soon as possible.

Organizations using this centralized pattern often choose to host the central repository on a third-party service like GitHub, which not only offers more reliable uptime than self-hosted repositories, but can also add centralized features like issue trackers and continuous integration.

Pull requests


Contributions to a source code repository that uses a distributed version control system are commonly made by means of a pull request, also known as a merge request.^[11] The contributor requests that the project maintainer pull the source code change, hence the name "pull request". The maintainer has to merge the pull request if the contribution should become part of the source base.^[12]

The developer creates a pull request to notify maintainers of a new change; a comment thread is associated with each pull request. This allows for focused discussion of code changes. Submitted pull requests are visible to anyone with repository access. A pull request can be accepted or rejected by maintainers.^[13]

Once the pull request is reviewed and approved, it is merged into the repository. Depending on the established workflow, the code may need to be tested before being included in an official release. Therefore, some projects contain a special branch for merging untested pull requests.^[12]^[14] Other projects run an automated test suite on every pull request, using a continuous integration tool, and the reviewer checks that any new code has appropriate test coverage.
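The review-then-merge flow described above can be modelled as a small state machine. The Python sketch below is illustrative only and does not reflect the API of GitHub, GitLab or any other hosting service; `PullRequest`, `Status`, `merge`, `reject` and the `run_tests` callback are hypothetical names, and a list of strings stands in for the code base.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, List


class Status(Enum):
    OPEN = "open"
    MERGED = "merged"
    REJECTED = "rejected"


@dataclass
class PullRequest:
    author: str
    changes: List[str]                    # stand-in for the proposed commits
    status: Status = Status.OPEN
    approved: bool = False
    comments: List[str] = field(default_factory=list)

    def comment(self, who: str, text: str) -> None:
        # Each pull request carries its own discussion thread.
        self.comments.append(f"{who}: {text}")


def reject(pr: PullRequest) -> None:
    """A maintainer declines the contribution."""
    if pr.status is Status.OPEN:
        pr.status = Status.REJECTED


def merge(pr: PullRequest, mainline: List[str],
          run_tests: Callable[[List[str]], bool]) -> bool:
    """Merge only if the request is open, approved, and the tests pass."""
    if pr.status is not Status.OPEN or not pr.approved:
        return False
    candidate = mainline + pr.changes     # what the code base would become
    if not run_tests(candidate):          # the continuous integration step
        return False
    mainline.extend(pr.changes)
    pr.status = Status.MERGED
    return True


mainline = ["initial import"]
pr = PullRequest(author="dana", changes=["fix typo in README"])

first_try = merge(pr, mainline, run_tests=lambda code: True)   # not yet reviewed
pr.comment("maintainer", "Looks good to me.")
pr.approved = True
second_try = merge(pr, mainline, run_tests=lambda code: True)
print(first_try, second_try, pr.status.value, mainline)
```

Running the tests against `candidate` rather than against the contribution alone mirrors the projects that test every pull request automatically; a project that merges into a special untested branch would instead skip that check and test the branch afterwards.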

History


The first open-source DVCSs included Arch, Monotone, and Darcs. However, open-source DVCSs were never very popular until the release of Git and Mercurial.

BitKeeper was used in the development of the Linux kernel from 2002 to 2005.^[15] The development of Git, now the world's most popular version control system,^[4] was prompted by the decision of the company that made BitKeeper to rescind the free license that Linus Torvalds and some other Linux kernel developers had previously taken advantage of.^[15]

References


[1] Chacon, Scott; Straub, Ben (2014). "About version control". Pro Git (2nd ed.). Apress. Chapter 1.1. Retrieved 4 June 2019.
[2] Spolsky, Joel (17 March 2010). "Distributed Version Control Is Here to Stay, Baby". Joel on Software. Retrieved 4 June 2019.
[3] "Intro to Distributed Version Control (Illustrated)". www.betterexplained.com. Retrieved 7 January 2018.
[4] "Version Control Systems Popularity in 2016". www.rhodecode.com. Retrieved 7 January 2018.
[5] O'Sullivan, Bryan. "Distributed revision control with Mercurial". Retrieved 13 July 2007.
[6] Chacon, Scott; Straub, Ben (2014). "Distributed workflows". Pro Git (2nd ed.). Apress. Chapter 5.1.
[7] "What is version control: centralized vs. DVCS". www.atlassian.com. 14 February 2012. Retrieved 7 January 2018.
[8] Allen, Jonathan (8 February 2017). "How Microsoft Solved Git's Problem with Large Repositories". Retrieved 6 August 2019.
[9] "Submitting patches: the essential guide to getting your code into the kernel". The Linux Kernel documentation. www.kernel.org. Retrieved 22 November 2024.
[10] "Git - Revision Selection". git-scm.com. Retrieved 22 November 2024.
[11] Sijbrandij, Sytse (29 September 2014). "GitLab Flow". GitLab. Retrieved 4 August 2018.
[12] Johnson, Mark (8 November 2013). "What is a pull request?". Oaawatch. Retrieved 27 March 2016.
[13] "Using pull requests". GitHub. Retrieved 27 March 2016.
[14] "Making a Pull Request". Atlassian. Retrieved 27 March 2016.
[15] McAllister, Neil. "Linus Torvalds' BitKeeper blunder". InfoWorld. Retrieved 19 March 2017.

External links


Essay on various revision control systems, especially the section "Centralized vs. Decentralized SCM"
Introduction to distributed version control systems - IBM Developer Works article

Version control software

Years, where available, indicate the date of first stable release. Systems with names in italics are no longer maintained or have planned end-of-life dates.

Local only

Free/open-source	RCS (1982); SCCS (1973)
Proprietary	The Librarian (1969); Panvalet (1970s); PVCS (1985); QVCS (1991)

Client–server

Free/open-source	CVS (1986, 1990 in C); CVSNT (1998); QVCS Enterprise (1998); Subversion (2000)
Proprietary	AccuRev SCM (2002); Azure DevOps Server (via TFVC) (2005); Azure DevOps Services (via TFVC) (2014); ClearCase (1992); CMVC (1994); Dimensions CM (1980s); DSEE (1984); Integrity (2001); Perforce Helix (1995); SCLM (1980s?); Software Change Manager (1970s); StarTeam (1995); Surround SCM (2002); Synergy (1990); Team Concert (2008); Vault (2003); Visual SourceSafe (1994)

Distributed

Free/open-source	BitKeeper (2000); Breezy (2017); Code Co-op (1997); Darcs (2002); DCVS (2002); Fossil (2007); Git (2005); GNU arch (2001); GNU Bazaar (2005); Mercurial (2005); Monotone (2003)
Proprietary	Azure DevOps Server (via Git) (2013); Azure DevOps Services (via Git) (2014); TeamWare (1992); Plastic SCM (2006)