Appendix: Migrating to Mercurial

A common way to test the waters with a new revision control tool is to experiment with switching an existing project, rather than starting a new project from scratch.

In this appendix, we discuss how to import a project's history into Mercurial, and what to look out for if you are used to a different revision control system.

Importing history from another system

Mercurial ships with an extension named convert, which can import project history from most popular revision control systems. At the time this book was written, it could import history from the following systems:

  • Subversion
  • CVS
  • git
  • Darcs
  • Bazaar
  • Monotone
  • GNU Arch
  • Perforce
  • Mercurial

(To see why Mercurial itself is supported as a source, see "Tidying up the tree" below.)

You can enable the extension in the usual way, by editing your ~/.hgrc file.

[extensions]
convert =

This will make an hg convert command available. The command is easy to use. For instance, this command will import the Subversion history for the Nose unit testing framework into Mercurial.

$ hg convert http://python-nose.googlecode.com/svn/trunk

The convert extension operates incrementally. In other words, after you have run hg convert once, running it again will import any new revisions committed after the first run began. Incremental conversion will only work if you run hg convert in the same Mercurial repository that you originally used, because the convert extension saves some private metadata in a non-revision-controlled file named .hg/shamap inside the target repository.

When you want to start making changes using Mercurial, it's best to clone the tree in which you are doing your conversions, and leave the original tree for future incremental conversions. This is the safest way to let you pull and merge future commits from the source revision control system into your newly active Mercurial project.

Converting multiple branches

The hg convert command given above converts only the history of the trunk branch of the Subversion repository. If we instead use the URL http://python-nose.googlecode.com/svn, Mercurial will automatically detect the trunk, tags and branches layout that Subversion projects usually use, and it will import each as a separate Mercurial branch.

By default, each Subversion branch imported into Mercurial is given a branch name. After the conversion completes, you can get a list of the active branch names in the Mercurial repository using hg branches -a. If you would prefer to import the Subversion branches without names, pass the --config convert.hg.usebranchnames=false option to hg convert.

Once you have converted your tree, if you want to follow the usual Mercurial practice of working in a tree that contains a single branch, you can clone that single branch using hg clone -r mybranchname.

Mapping user names

Some revision control tools save only short usernames with commits, and these can be difficult to interpret. The norm with Mercurial is to save a committer's name and email address, which is much more useful for talking to them after the fact.

If you are converting a tree from a revision control system that uses short names, you can map those names to longer equivalents by passing an --authors option to hg convert. This option accepts a file name that should contain entries of the following form.

arist = Aristotle <aristotle@phil.example.gr>
soc = Socrates <socrates@phil.example.gr>

Whenever convert encounters a commit with the username arist in the source repository, it will use the name Aristotle <aristotle@phil.example.gr> in the converted Mercurial revision. If no match is found for a name, it is used verbatim.
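As a small sketch of how the map and the command fit together, the following writes the two entries above to a file and wraps the conversion in a helper. The file name authors.map and the function name convert_with_authors are assumptions made for this example; only the --authors option itself comes from the text.

```shell
# Write a user map; "authors.map" is a hypothetical file name.
cat > authors.map <<'EOF'
arist = Aristotle <aristotle@phil.example.gr>
soc = Socrates <socrates@phil.example.gr>
EOF

# Hypothetical helper: convert SOURCE ($1) into DEST ($2) using the map above.
convert_with_authors() {
    hg convert --authors authors.map "$1" "$2"
}
```

A call might look like convert_with_authors http://python-nose.googlecode.com/svn/trunk nose-hg, where nose-hg is an arbitrary destination directory name.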

Tidying up the tree

Not all projects have pristine history. There may be a directory that should never have been checked in, a file that is too big, or a whole hierarchy that needs to be refactored.

The convert extension supports the idea of a “file map” that can reorganize the files and directories in a project as it imports the project's history. This is useful not only when importing history from other revision control systems, but also to prune or refactor a Mercurial tree.

To specify a file map, use the --filemap option and supply a file name. A file map contains lines of the following forms.

# This is a comment.
# Empty lines are ignored.
include path/to/file
exclude path/to/file
rename from/some/path to/some/other/place

The include directive causes a file, or all files under a directory, to be included in the destination repository. This also excludes all other files and directories not explicitly included. The exclude directive causes files or directories to be omitted, and others not explicitly mentioned to be included.

To move a file or directory from one location to another, use the rename directive. If you need to move a file or directory from a subdirectory into the root of the repository, use . as the second argument to the rename directive.
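To make the directives concrete, here is a minimal sketch of a file map that prunes and reorganizes a tree. Every path in it (build, data/huge.bin, lib/core), the file name filemap.txt, and the helper name prune_tree are hypothetical. Because the map uses no include directive, everything not excluded is kept.

```shell
# Sketch of a file map; every path below is hypothetical.
cat > filemap.txt <<'EOF'
# Drop a directory that should never have been checked in.
exclude build
# Drop one file that is too big.
exclude data/huge.bin
# Move the contents of lib/core up to the root of the repository.
rename lib/core .
EOF

# Hypothetical helper: convert SOURCE ($1) into DEST ($2) through the map.
prune_tree() {
    hg convert --filemap filemap.txt "$1" "$2"
}
```

Since Mercurial is a supported source, prune_tree can also be pointed at an existing Mercurial repository to produce a tidied copy.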

Improving Subversion conversion performance

You will often need several attempts before you hit the perfect combination of user map, file map, and other conversion parameters. Converting a Subversion repository over an access protocol like ssh or http can proceed thousands of times more slowly than Mercurial is capable of actually operating, due to network delays. This can make tuning that perfect conversion recipe very painful.

The svnsync command (http://svn.collab.net/repos/svn/trunk/notes/svnsync.txt) can greatly speed up the conversion of a Subversion repository. It is a read-only mirroring program for Subversion repositories. The idea is that you create a local mirror of your Subversion tree, then convert the mirror into a Mercurial repository.

Suppose we want to convert the Subversion repository for the Apache Continuum project into a Mercurial tree. First, we create a local Subversion repository.

$ svnadmin create continuum-mirror

Next, we set up a Subversion hook that svnsync needs.

$ echo '#!/bin/sh' > continuum-mirror/hooks/pre-revprop-change
$ chmod +x continuum-mirror/hooks/pre-revprop-change

We then initialize svnsync in this repository.

$ svnsync init file://`pwd`/continuum-mirror https://svn.apache.org/repos/asf/continuum

Our next step is to begin the svnsync mirroring process.

$ svnsync sync file://`pwd`/continuum-mirror

Finally, we import the history of our local Subversion mirror into Mercurial.

$ hg convert continuum-mirror

We can use this process incrementally if the Subversion repository is still in use. We run svnsync to pull new changes into our mirror, then hg convert to import them into our Mercurial tree.

There are two advantages to doing a two-stage import with svnsync. The first is that it uses more efficient Subversion network syncing code than hg convert, so it transfers less data over the network. The second is that the import from a local Subversion tree is so fast that you can tweak your conversion setup repeatedly without having to sit through a painfully slow network-based conversion process each time.
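The two-stage import can be collected into one repeatable script. The following is a sketch under stated assumptions, not a definitive recipe. The names run, sync_and_convert and DRY_RUN are invented here, and the sketch uses init, the svnsync subcommand that initializes a mirror. On the first call it creates and initializes the mirror, and on later calls it only syncs and converts, which matches the incremental use described above.

```shell
# Sketch: mirror a Subversion repository locally, then convert the mirror.
# run, sync_and_convert and DRY_RUN are hypothetical names introduced here.
run() {
    # With DRY_RUN set, print the command instead of executing it.
    if [ -n "${DRY_RUN:-}" ]; then echo "$*"; else "$@"; fi
}

sync_and_convert() {
    src_url=$1                          # remote Subversion URL
    mirror=$2                           # local mirror directory
    mirror_url="file://$(pwd)/$mirror"

    if [ ! -d "$mirror" ]; then
        # First run only: create the mirror, install the hook, initialize.
        run svnadmin create "$mirror"
        if [ -z "${DRY_RUN:-}" ]; then
            echo '#!/bin/sh' > "$mirror/hooks/pre-revprop-change"
            chmod +x "$mirror/hooks/pre-revprop-change"
        fi
        run svnsync init "$mirror_url" "$src_url"
    fi

    run svnsync sync "$mirror_url"      # pull new revisions into the mirror
    run hg convert "$mirror"            # import them into Mercurial
}
```

Running DRY_RUN=1 sync_and_convert https://svn.apache.org/repos/asf/continuum continuum-mirror prints the commands without executing them, which is a cheap way to check the sequence before starting a long sync.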

Migrating from Subversion

Subversion is currently the most popular open source revision control system. Although there are many differences between Mercurial and Subversion, making the transition from Subversion to Mercurial is not particularly difficult. The two have similar command sets and generally uniform interfaces.

Philosophical differences

The fundamental difference between Subversion and Mercurial is of course that Subversion is centralized, while Mercurial is distributed. Since Mercurial stores all of a project's history on your local drive, it only needs to perform a network access when you want to explicitly communicate with another repository. In contrast, Subversion stores very little information locally, and the client must thus contact its server for many common operations.

Subversion more or less gets away without a well-defined notion of a branch: which portion of a server's namespace qualifies as a branch is a matter of convention, with the software providing no enforcement. Mercurial treats a repository as the unit of branch management.

Scope of commands

Since Subversion doesn't know what parts of its namespace are really branches, it treats most commands as requests to operate at and below whatever directory you are currently visiting. For instance, if you run svn log, you'll get the history of whatever part of the tree you're looking at, not the tree as a whole.

Mercurial's commands behave differently, by defaulting to operating over an entire repository. Run hg log and it will tell you the history of the entire tree, no matter what part of the working directory you're visiting at the time. If you want the history of just a particular file or directory, simply supply it by name, e.g. hg log src.

From my own experience, this difference in default behaviors is probably the most likely to trip you up if you have to switch back and forth frequently between the two tools.

Multi-user operation and safety

With Subversion, it is normal (though slightly frowned upon) for multiple people to collaborate in a single branch. If Alice and Bob are working together, and Alice commits some changes to their shared branch, Bob must update his client's view of the branch before he can commit. Since at this time he has no permanent record of the changes he has made, he can corrupt or lose his modifications during and after his update.

Mercurial encourages a commit-then-merge model instead. Bob commits his changes locally before pulling changes from, or pushing them to, the server that he shares with Alice. If Alice pushed her changes before Bob tries to push his, he will not be able to push his changes until he pulls hers, merges with them, and commits the result of the merge. If he makes a mistake during the merge, he still has the option of reverting to the commit that recorded his changes.

It is worth emphasizing that these are the common ways of working with these tools. Subversion supports a safer work-in-your-own-branch model, but it is cumbersome enough in practice to not be widely used. Mercurial can support the less safe mode of allowing changes to be pulled in and merged on top of uncommitted edits, but this is considered highly unusual.

Published vs local changes

A Subversion svn commit command immediately publishes changes to a server, where they can be seen by everyone who has read access.

With Mercurial, commits are always local, and must be published via an hg push command afterwards.

Each approach has its advantages and disadvantages. The Subversion model means that changes are published, and hence reviewable and usable, immediately. On the other hand, this means that a user must have commit access to a repository in order to use the software in a normal way, and commit access is not lightly given out by most open source projects.

The Mercurial approach allows anyone who can clone a repository to commit changes without the need for someone else's permission, and they can then publish their changes and continue to participate however they see fit. The distinction between committing and pushing does open up the possibility of someone committing changes to their laptop and walking away for a few days having forgotten to push them, which in rare cases might leave collaborators temporarily stuck.

Quick reference

Subversion              Mercurial                Notes
----------------------  -----------------------  ------------------------------------
svn add                 hg add
svn blame               hg annotate
svn cat                 hg cat
svn checkout            hg clone
svn cleanup             n/a                      No cleanup needed
svn commit              hg commit; hg push       hg push publishes after commit
svn copy                hg clone                 To create a new branch
svn copy                hg copy                  To copy files or directories
svn delete              hg remove
svn diff                hg diff
svn export              hg archive
svn help                hg help
svn import              hg addremove; hg commit
svn info                hg parents               Shows what revision is checked out
svn info                hg summary               Shows combined information
svn info                hg showconfig paths      Shows what URL is checked out
svn list                hg manifest
svn log                 hg log
svn merge               hg merge
svn mkdir               n/a                      Mercurial does not track directories
svn move (svn rename)   hg move (hg rename)
svn resolved            hg resolve -m
svn revert              hg revert
svn status              hg status
svn update              hg pull -u

Table: Subversion commands and Mercurial equivalents

Useful tips for newcomers

Under some revision control systems, printing a diff for a single committed revision can be painful. For instance, with Subversion, to see what changed in revision 104654, you must type svn diff -r104653:104654. Mercurial eliminates the need to type the revision ID twice in this common case. For a plain diff, hg export 104654. For a log message followed by a diff, hg log -r104654 -p.

When you run hg status without any arguments, it prints the status of the entire tree, with paths relative to the root of the repository. This makes it tricky to copy a file name from the output of hg status into the command line. If you supply a file or directory name to hg status, it will print paths relative to your current location instead. So to get tree-wide status from hg status, with paths that are relative to your current directory and not the root of the repository, feed the output of hg root into hg status. You can easily do this as follows on a Unix-like system:

$ hg status `hg root`