A few people have asked me if I chose the parameters for Git’s repack correctly. Shouldn’t I use a higher --depth value than the default? Why did I pick a --window value of 250? Shouldn’t I have repacked with the default values?
To answer the last question first: no. I did these conversions as well as I could, in order to make a fair comparison. My assumption is that anyone converting their repository to Bazaar, Git or Mercurial knows what he or she is doing. Then why should I settle for less? Repacking a repository as tightly as I did is necessary only once, but it is an important step: git fast-import creates really bad packs in order to be fast, so a repack really helps. The manpage also suggests using a higher --window value than normal.
However, it got me curious about how the parameters (--depth and --window) influence final repository size. First I wanted to see if changing the --depth would have made a difference in final size. I repacked all repositories with a depth value of either 50 (the default) or 100. I varied the window parameter over the values [10, 20, 50, 100, 150, 200, 250].
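The parameter grid described above can be sketched as a small script. This is only an illustration of the sweep, not the script used for these measurements; the function name `repack_commands` is made up for the example, and the flags are the ones mentioned in this post (`-a -d -f`, `--depth`, `--window`).

```python
from itertools import product

# Values taken from the text: two depths, seven window sizes.
DEPTHS = [50, 100]
WINDOWS = [10, 20, 50, 100, 150, 200, 250]


def repack_commands(depths=DEPTHS, windows=WINDOWS):
    """Return one `git repack` argument list per (depth, window) pair.

    Hypothetical helper: it only builds the commands, it does not run them.
    """
    return [
        ["git", "repack", "-a", "-d", "-f", f"--depth={d}", f"--window={w}"]
        for d, w in product(depths, windows)
    ]


if __name__ == "__main__":
    # Print the 14 commands that make up the full grid.
    for cmd in repack_commands():
        print(" ".join(cmd))
```

Each command would be run inside the repository being tested, once per combination.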
First let’s look at how the depth variable influences repository size.
As can be seen from this figure, increasing the repack depth influences repository size only when repacking with a small window. As I used a window size of 250, the depth variable did not influence the results much.
However, it’s also interesting to see how these variables affect other measures. An example of this is repack time.
Repack time still increases with increasing window size. As a repository won’t be packed much tighter with a window of 250 than with a window of 100, you might as well choose a lower value for your window when doing an aggressive repack.
However, there is a more interesting interaction going on: the effect of the window parameter depends on the size of your repository. Let’s look at repositories of different sizes (see “Meet the Candidates” for a description of the repositories):
As can be seen, a higher window value has an effect only on repositories that are actually quite large, like the emacs repository. If you have a small repository, there’s not much use in repacking with anything higher than --window=50, but if your repository is several hundred MB, a higher window can skim off a few more megabytes.
(Please note that the repack times were measured on an Intel iMac Core Duo, 2 GHz with 2 GB RAM, running OS X. Repacks were done with git repack -adf, which means that the repository is completely repacked. If you do a normal, incremental repack, expect to see much faster repacks.)
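For anyone who wants to repeat this kind of measurement on their own machine, here is a minimal sketch of timing a full repack and reading back the resulting pack size. It assumes a non-bare repository (packs under `.git/objects/pack`); the helper names `pack_size_bytes` and `timed_repack` are invented for this example and are not the tooling behind the numbers above.

```python
import subprocess
import time
from pathlib import Path


def pack_size_bytes(repo):
    """Sum the sizes of all *.pack files (assumes a non-bare repository)."""
    pack_dir = Path(repo) / ".git" / "objects" / "pack"
    return sum(p.stat().st_size for p in pack_dir.glob("*.pack"))


def timed_repack(repo, window, depth=50):
    """Run a full `git repack -a -d -f` and return (seconds, pack bytes)."""
    cmd = ["git", "repack", "-a", "-d", "-f",
           f"--window={window}", f"--depth={depth}"]
    start = time.perf_counter()
    # check=True raises CalledProcessError if git exits with an error.
    subprocess.run(cmd, cwd=repo, check=True)
    elapsed = time.perf_counter() - start
    return elapsed, pack_size_bytes(repo)
```

Calling `timed_repack("/path/to/repo", window=100)` (the path is a placeholder) would return the wall-clock time and the total pack size for that one setting. Timings will of course differ from the ones reported here, depending on hardware and repository.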