The Linux Page

CVS and really large files

Yesterday, I got a nice little surprise...

I inadvertently added two copies of an uncompressed MS-Access database to my CVS repository. The files were over 400Mb (about 434Mb, if I remember correctly). It started okay, then I slowly watched my memory go bananas.

Wow! It took a good 20 minutes to check in. My CVS repository is on my own computer, so it's not the transfer that was slow... The fact is, CVS allocated a buffer for the entire file! My computer was on its knees!
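
If you want to see that for yourself, you can watch the cvs process while a big check-in runs. A quick sketch (bigfile.mdb is a made-up name standing in for my Access files):

    # start the check-in in the background...
    cvs commit -m "added huge database" bigfile.mdb &
    # ...and watch the cvs process memory (RSS/VSZ) grow
    watch 'ps -o pid,rss,vsz,cmd -C cvs'

You should see the process balloon to something around the size of the file being checked in.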

Okay, so I decided the files should not be in CVS at all, because it would be a nightmare to do anything with them over the Internet (some other people have access to my CVS via the Internet). Imagine 1Gb through my modem! Heh! Heh!

So... I went ahead and typed cvs remove <filenames...>. That took about 3 hours!!! The machine was swapping like crazy. I had more than enough swap space, but it spent those 3 hours swapping and only doing a tad bit of real work every second. Okay, I only have 1Gb of RAM on this machine. But imagine how useful that is?!?
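
For the record, cvs remove by itself only marks the files for removal in your working copy; the heavy repository work (and all that swapping) happens when you commit. The whole dance looks something like this (the filenames are made up):

    rm data1.mdb data2.mdb            # the working copies must be gone first
    cvs remove data1.mdb data2.mdb    # mark them as removed
    cvs commit -m "remove huge Access databases"

(cvs remove -f does the rm and the remove in one step.)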

My guess at what is happening: CVS allocates buffers for both the old and the new version of the file, computes a diff, saves the diff, and finally overwrites the old repository file with the new one (which is the file with all the diffs inside, but anyway...)

Of course, that's probably not directly CVS's fault, since CVS is built on top of RCS.
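
And even then the files are not really gone. CVS keeps all the revisions of a file in a single RCS file (name,v), and a removed file is simply moved into an Attic subdirectory of the repository, history included. You can check it on the server (the paths are examples; adjust to your $CVSROOT and module):

    ls -lh $CVSROOT/myproject/data1.mdb,v          # before the removal
    ls -lh $CVSROOT/myproject/Attic/data1.mdb,v    # after the removal is committed

So my 400Mb are still sitting on the disk. To really reclaim the space you have to delete revisions with cvs admin -o or remove the ,v file by hand on the server.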

Still... if you can, avoid adding large files to your CVS repository. And that advice may very well apply to SVN and Git too.
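
One cheap way to protect yourself is a .cvsignore file, so those database files never get picked up in the first place (the pattern is just my case):

    echo '*.mdb' >> .cvsignore      # ignore MS-Access databases
    cvs add .cvsignore
    cvs commit -m "keep Access databases out of CVS" .cvsignore

It won't stop an explicit cvs add of a single file, but it keeps them out of cvs import and out of the ? noise in cvs update, which is how these things usually sneak in.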