[Python-Dev] Fwd: PEP: Migrating the Python CVS to Subversion
Daniel Berlin dberlin at dberlin.org
Mon Aug 15 00:25:02 CEST 2005
On Sun, 2005-08-14 at 23:58 +0200, "Martin v. Löwis" wrote:
> Guido van Rossum wrote:
> > Here's another POV.
>
> I think I agree with Daniel's view, in particular wrt. to performance.
> Whatever the replacement tool, it should perform as well or better
> than CVS currently does; it also shouldn't perform much worse than
> subversion.

Then, in fairness, I should note that annotate is slower on subversion
(and monotone, and anything using binary deltas) than CVS.
This is because you can't generate line-diffs that annotate wants from
binary copy + add diffs. You have to reconstruct the actual revisions
and then line diff them. Thus, CVS is O(N) here, and SVN and other
binary delta users are O(N^2).

You wouldn't really notice the speed difference when you are annotating
a file with 100 revisions. You would if you annotate the 800k changelog
which has 30k trunk revisions. CVS takes 4 seconds, svn takes ~5
minutes, the whole time being spent in doing diffs of those revisions.

I rewrote the blame algorithm recently so that it will only take about 2
minutes on changelog, but it cheats because it knows it can stop early
because it's blamed all the revisions (since our changelog rotates).

For those curious, you also can't directly generate "always-correct"
byte-level differences from the diffs, since their goal is to find the
most space efficient way to transform rev old into rev new, *not* record
actual byte-level changes that occurred between old and new. It may
turn out that doing an add of 2 bytes is cheaper than specifying the
opcode for copy(start,len). Actual diffs are produced by reproducing
the texts and line diffing them. Such is the cost of efficient
storage :).

>
> I've been using git (or, rather, cogito) to keep up-to-date with the
> Linux kernel. While performance of git is really good, storage
> requirements are *quite* high, and initial "checkout" takes a long
> time - even though the Linux kernel repository stores virtual no
> history (there was a strict cut when converting the bitkeeper HEAD).
> So these distributed tools would cause quite some disk consumption
> on client machines. bazaar-ng apparently supports only-remote
> repositories as well, so that might be no concern.

The argument "network and disk is cheap" doesn't work for us when you
are talking 5-10 gigabytes of initial transfer :). However, I doubt
it's more than a hundred meg or so for python, if that.

You may run into these problems in 10 years :)
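
For readers who want to see the "reconstruct the revisions, then line diff
them" step from the annotate discussion in concrete form, here is a minimal
Python sketch. It is not Subversion's actual blame code or its real delta
format. The ("copy", start, length) / ("add", data) tuples and the
apply_delta() and blame() helpers are hypothetical names made up for
illustration, and difflib stands in for whatever line differ a real tool
would use.

```python
import difflib


def apply_delta(base: bytes, delta) -> bytes:
    """Rebuild a full text from a base text plus a copy/add delta.

    The delta format is a made-up stand-in for a binary delta:
    ("copy", start, length) copies bytes from the base text,
    ("add", data) appends literal bytes.
    """
    out = bytearray()
    for op in delta:
        if op[0] == "copy":
            _, start, length = op
            out += base[start:start + length]
        else:  # "add"
            out += op[1]
    return bytes(out)


def blame(revisions):
    """For each line of the last revision, return the index of the
    revision that introduced it, by line-diffing each adjacent pair."""
    if not revisions:
        return []
    origin = [0] * len(revisions[0])
    for rev in range(1, len(revisions)):
        old, new = revisions[rev - 1], revisions[rev]
        matcher = difflib.SequenceMatcher(None, old, new, autojunk=False)
        new_origin = [rev] * len(new)  # assume new until matched
        for tag, i1, i2, j1, j2 in matcher.get_opcodes():
            if tag == "equal":
                # Unchanged lines keep whatever revision they came from.
                new_origin[j1:j2] = origin[i1:i2]
        origin = new_origin
    return origin


# Three revisions stored as one full text plus two deltas.
base = b"a\nb\nc\n"
deltas = [
    [("copy", 0, 2), ("add", b"x\n"), ("copy", 4, 2)],  # replace line "b" with "x"
    [("copy", 0, 6), ("add", b"d\n")],                  # append line "d"
]

texts = [base]
for delta in deltas:
    texts.append(apply_delta(texts[-1], delta))

result = blame([t.splitlines() for t in texts])
print(result)  # [0, 1, 0, 2]
```

Note that the deltas alone never say which lines changed. The blame result
only falls out after every full text has been rebuilt and each adjacent pair
has been line diffed, which is the extra work described above.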