November 8, 2007

Mozilla cvs file "moves" considered harmful.

As most of you know, cvs doesn't know how to move or rename a file in a repository, i.e. if you have a file in mozilla/foo and you want to move it into mozilla/bar, cvs simply doesn't know how to record that. In a project the size of Mozilla, files inevitably need to be moved and renamed now and then; cvs's inability to deal with that is one of the reasons Mozilla development is about to switch from cvs to hg. hg (like most other modern version control systems) knows how to record a file or directory move or rename in the version history, so this becomes a non-issue once Mozilla development switches over to hg full time.

Over time people have come up with a couple of ways to deal with file moves in the Mozilla cvs repository. One way, which I believe was used at some point way back, was to simply move the files (the ,v files) on the cvs server. This brings the history along with the files as they're moved, but it also removes the files from their old location, not just for the current version and forward but for old versions as well. That is obviously a bad idea, as pulling by a date prior to the move will likely result in an unbuildable tree (the build system is likely to expect the files to exist in the old location, etc.). A better alternative is to copy the files on the cvs server and do a cvs remove of the old files. This avoids the above problem, but it also means that pulling by a date in the past will result in the files appearing in the new location as well as in the old one.

The current state of affairs is to use one of two scripts, both of which do more or less the same thing. These scripts basically replay a file's history in the new location, either using the cvs client or directly on the cvs server. This approach avoids both of the above problems, but it also invents changes to the files (the dates and the checkin order by date end up incorrect, etc.).

Now enter the current world of active development happening in cvs, and all that (or the parts relevant for Firefox) being mirrored into our hg repository.

The script that mirrors the cvs tree into hg tracks the changes that go into cvs and figures out which of those changes were part of the same checkin (which is far from trivial, btw), and you get the nice list of change sets you can see in the hg repository. So what happens when we do one of these file moves? You get something like this. Not only is that incorrect history, it's also impossible to get right. The scripts that replay the checkins for the moved files obviously lose the dates of the checkins, and they also replay the history in the wrong order, which in some cases makes it impossible to generate accurate change sets for those checkins. And of course it clutters the history (it brings in old history we've already decided not to clutter the hg repository with) and makes the repository grow unnecessarily.
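To make the change set problem a bit more concrete, here is a minimal sketch of one common way to guess which per-file cvs revisions belonged to the same checkin: group revisions that share an author and a checkin comment and that land close together in time. This is not the actual mirroring script, which I'm not reproducing here; the FileRevision type, the five minute window, and the sample data (authors, file names, dates) are all made up for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class FileRevision:
    """One revision of one file, as cvs records it (hypothetical type)."""
    path: str
    author: str
    message: str
    when: datetime


def group_into_changesets(revisions, window=timedelta(minutes=5)):
    """Group revisions with the same author and message that are close in time.

    The window is an assumed heuristic, not a value taken from any real tool.
    """
    changesets = []
    open_sets = {}  # (author, message) -> change set currently being built
    for rev in sorted(revisions, key=lambda r: r.when):
        key = (rev.author, rev.message)
        current = open_sets.get(key)
        if current is not None and rev.when - current[-1].when <= window:
            current.append(rev)
        else:
            current = [rev]
            open_sets[key] = current
            changesets.append(current)
    return changesets


# Invented data: one original checkin that touched two files.
real = [
    FileRevision("mozilla/foo/a.cpp", "alice", "Fix bug 1", datetime(2005, 3, 1, 12, 0, 0)),
    FileRevision("mozilla/foo/b.cpp", "alice", "Fix bug 1", datetime(2005, 3, 1, 12, 0, 20)),
]

# Invented data: the same checkin after a file-by-file replay in the new
# location. The dates are now the replay dates, and the two files are far apart.
replayed = [
    FileRevision("mozilla/bar/a.cpp", "mover", "Fix bug 1", datetime(2007, 11, 8, 10, 0, 0)),
    FileRevision("mozilla/bar/b.cpp", "mover", "Fix bug 1", datetime(2007, 11, 8, 10, 10, 0)),
]

print(len(group_into_changesets(real)))      # one change set
print(len(group_into_changesets(replayed)))  # two change sets
```

With the original dates the two revisions end up in one change set. After the replay the only dates left are the replay dates, so a heuristic like this has nothing accurate to work from and the single checkin falls apart into two.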

Given that, I think in general we'd be better off not moving files in cvs any more. Simply checking in the files in the new location, with a checkin comment that explicitly states where the files were moved from, and cvs removing the files from the old location should do. No history is lost; it's all there. Getting to it just takes a few more steps, as bonsai is perfectly capable of showing blame and logs for files that have been cvs removed.
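As a rough sketch of what that looks like in practice, the snippet below only builds and prints the commands for such a move; it doesn't run anything. The helper name plan_cvs_move, the file name file.cpp, and the wording of the checkin comment are made up for illustration, and it assumes you're in a checked-out tree where the new directory already exists in cvs.

```python
import shlex


def plan_cvs_move(old_path, new_path):
    """Return the commands for a copy + cvs add + cvs remove style move.

    Nothing is executed; the caller decides what to do with the plan.
    """
    # The checkin comment says where the file came from, so the old
    # history can still be found under the removed path.
    message = f"Moving {old_path} to {new_path}; earlier history is under {old_path}."
    return [
        ["cp", old_path, new_path],            # copy the current contents
        ["cvs", "add", new_path],              # schedule the new file for addition
        ["cvs", "remove", "-f", old_path],     # delete the old working file and schedule removal
        ["cvs", "commit", "-m", message, old_path, new_path],
    ]


for argv in plan_cvs_move("mozilla/foo/file.cpp", "mozilla/bar/file.cpp"):
    print(shlex.join(argv))
```

Committing the add and the remove together, with the same comment, keeps the move together as a single checkin.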

The good news is that we're already in beta for Firefox 3, which means we'll probably need to move or rename fewer and fewer files anyway.

Posted by jst at November 8, 2007 11:05 PM

Comments

Not moving files in CVS sounds right. But why bother with CVS at all these days? Isn't the moment to retire it long overdue already? Why are you guys keeping up with shitty CVS for years now? It was painfully obsolete in 2004 already when the company where I worked migrated to svn and never looked back. Apache did the same thing for their entire code base before that. You guys have been doing insanely complicated release management and branching with hopelessly inadequate tools.

I'd say, release Firefox 3 in a few months; set the last CVS tag and make the repository read only once the tag is set. Switch over to hg permanently. That gives you a few months to prepare tooling & infrastructure. Anything that can't be fixed in that timeframe should be retired as well.

Of course the choice for hg means that tooling is quite primitive. You might be better off migrating cvs to svn and using that as a backend to push updates from hg. Many of the tools that made cvs popular are also available for svn so that is definitely the developer friendly route.

Posted by: Jilles van Gurp at November 9, 2007 3:16 AM

Jilles, that discussion (which repo mgmt system are we moving to) has been long since decided.

http://weblogs.mozillazine.org/preed/2007/04/version_control_system_shootou_1.html

Posted by: Alex Vincent at November 9, 2007 8:53 AM

I have to say, in IRC I was arguing against stopping use of the cvs-move/copy scripts we are using now.

But given this well-thought-out blog post, and the (shown) results of it on an Hg changeset, I'm inclined to agree with your proposed solution (unless someone actually wants to fix the script[s] involved).

Posted by: Callek at November 9, 2007 5:50 PM

I remember this being one of the very first "spirited" IRC arguments I found myself in after I was recently hired at the Corporation.

I was on the side of "This is screwed up, and we need to stop doing it," and I was met with the standard "show me the harm" argument.

The "harm" tends to reveal itself in numerous small, "merely annoying" ways, like import tools not working on our repository or assumptions that are typically valid to make about CVS not being valid for our repository, or one of the few "CVS is horked" bugs in Bugzilla.

But, these tend to be quickly forgotten. I really never understood the problem with doing a delete and an add, as long as the developer annotated where they moved from, so you could dig up the history. I would trade forcing people to run the extra cvs command over corrupted repositories, but I guess I'm in the minority on that (and really... are we surprised by that? ;-)

Posted by: Preed at November 12, 2007 11:02 PM