Patched svndumpfilter2

Update: Simon Tatham has applied a better patch that deals with quotes in paths as well as whitespace (quotes in svn paths, who knew!?). It's available as r7468 of svndumpfilter2.

Sometimes you want to export part of a Subversion repository, leaving the rest behind while keeping the repository history and metadata. The tool for this job is svndumpfilter, which operates on Subversion dumpfiles. But svndumpfilter has a serious flaw: if a file or path was copied from a path you're filtering out to one you're filtering in, svndumpfilter won't be able to fill out the history and the job will fail. Simon Tatham's svndumpfilter2 cleverly fixes this by looking up paths against the source repository the dumpfile was taken from, using svnlook.

In turn, svndumpfilter2 has a tiny bug: if a repository path being checked has whitespace in its name* and is passed to svnlook as is, svnlook only sees the path up to the first whitespace, which results in a "path not found" error. A simple fix does the trick: place all arguments to svnlook in the script inside quotes. As is often the case, most of the work was running down why svndumpfilter doesn't work with copies, why svndumpfilter2 was reporting bad paths to begin with, and documenting what was done (this post). Otherwise svndumpfilter2 comes highly recommended. The repository I was working against is on the large side: "du -sh" on its repo folder comes in at 2.5GB (the checkout is much bigger), with nearly 20,000 commits.
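To make the whitespace problem concrete, here is a minimal Python sketch of the quoting issue. It is not the code from svndumpfilter2 or from the patch: the helper name svnlook_tree_command, the repository location and the sample path are all made up for illustration, and it assumes the command line is handed to a POSIX shell. It uses shlex.split to mimic how such a shell would break the line into arguments.

```python
import shlex

def svnlook_tree_command(repo, rev, path, quote=True):
    """Build a shell command line for `svnlook tree` (hypothetical helper)."""
    args = ["svnlook", "tree", "-r", str(rev), repo, path]
    if quote:
        # shlex.quote wraps any argument containing whitespace or quote
        # characters so that the shell keeps it as a single word.
        return " ".join(shlex.quote(a) for a in args)
    return " ".join(args)

# Hypothetical repository location and in-repository path.
repo = "/srv/svn/repo"
path = "trunk/My Project/docs"

unquoted = svnlook_tree_command(repo, 42, path, quote=False)
quoted = svnlook_tree_command(repo, 42, path)

# shlex.split approximates POSIX shell word splitting; element 5 onwards
# is whatever svnlook would receive as the in-repository path.
print(shlex.split(unquoted)[5:])
print(shlex.split(quoted)[5:])
```

Unquoted, the path arrives as two separate arguments, so the lookup goes wrong; quoted, it arrives whole. Another way to sidestep the issue, if the script's structure allows it, is to pass the arguments as a list to Python's subprocess module without a shell, so no quoting is needed at all.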

You can get a patched file here: svndumpfilter2. Alternatively, "hg clone" the Mercurial repository from http://www.dehora.net/hg/tools/ **.

* for this and many other reasons, avoiding whitespace in file names tends to be a good policy.

** Students of irony are welcome to savour the notion of keeping a Subversion tool inside a Mercurial repository.

April 19, 2007 03:50 PM


