Using 'rsync --exclude-from' to Exclude Files Containing Spaces

A few months ago I wrote a post about escaping filename or directory spaces for rsync. Well, that wasn't the end of rsync giving me problems with spaces.

When I used the --exclude-from rsync option to specify a list of exclusions, I figured that putting single or double quotes around files/directories that contain spaces would be enough to escape them. However, after slogging through hundreds and hundreds of lines of rsync's output, I discovered the excluded directories were still being synced!

When using --exclude-from, file and directory names in the exclude file should not be wrapped in single or double quotes. Escape each space with a backslash instead:

/afs/*
/automount/*
/Users/raam/Documents/Virtual\ Machines/*

Note: A commenter pointed out that this no longer applies to the latest version of rsync. I tested this on Mac OS X 10.9 (Mavericks) and rsync v2.6.9 and confirmed that you no longer need to escape spaces in the exclude file.
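Rather than reading through hundreds of lines of output, a dry run can show whether an exclude file is being honored. The following is a minimal sketch, assuming a reasonably recent rsync where (per the note above) spaces in the exclude file need no escaping. The scratch directories and file names are made up for illustration.

```shell
# Scratch directories standing in for a real source and destination
# (hypothetical names, created under a temporary directory).
work=$(mktemp -d)
mkdir -p "$work/src/Virtual Machines" "$work/src/notes" "$work/dst"
touch "$work/src/Virtual Machines/disk.img" "$work/src/notes/todo.txt"

# One pattern per line, no quotes. A pattern starting with "/" is anchored
# at the root of the transfer (here, $work/src/).
printf '%s\n' '/Virtual Machines/*' > "$work/exclude.txt"

# -n (--dry-run) copies nothing; -i (--itemize-changes) prints one line
# per item that would be transferred.
planned=$(rsync -a -n -i --exclude-from="$work/exclude.txt" "$work/src/" "$work/dst/")
printf '%s\n' "$planned"

# If the excluded file shows up in the plan, the pattern is not matching.
if printf '%s\n' "$planned" | grep -q 'disk\.img'; then
    echo "exclusion NOT applied"
else
    echo "exclusion applied"
fi
```

On an older rsync where the unescaped form fails, swapping the pattern for '/Virtual\ Machines/*' in the printf line should let the same check confirm the fix.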

Escaping Filename or Directory Spaces for rsync

To rsync a file or directory that contains spaces, you must escape the space for both the remote shell and the local shell. I tried doing one or the other and it never worked. Now I know that I need to do both!

So let's say I'm trying to rsync a remote directory to my local machine and the remote directory contains a space (oh so unfortunately common with Windows files). Here's what the command should look like:

rsync '[email protected]:/path/with\ spaces/' /local/path/

The single quotes escape the space for my local shell, and the backslash escapes it for the remote shell.
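The two layers of parsing can be simulated locally. This is only a sketch: sh -c stands in for the remote shell, and count_remote_args is a hypothetical helper, not part of rsync. It counts how many arguments the second shell sees after the first shell has stripped the single quotes.

```shell
count_remote_args() {
    # $1 is the path exactly as the local shell hands it to rsync.
    # The inner shell re-parses it, just as a remote shell would.
    sh -c "set -- $1; echo \$#"
}

# Quoted locally only: the inner shell splits on the space.
count_remote_args '/path/with spaces/'     # prints 2

# Quoted locally and backslash-escaped for the inner shell.
count_remote_args '/path/with\ spaces/'    # prints 1
```

rsync 3.0.0 and later also accept -s (--protect-args), which sends the path to the remote side without letting the remote shell interpret it, so the backslash may not be needed there.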

Accidentally Using an Old Version of rsync

Last night I discovered I had been using an old version of rsync that did not use SSH by default! Something happened during my upgrade to Leopard that switched the default rsync from /usr/bin/rsync (v2.6.9) to an old version that must have been installed by fink in /sw/bin/rsync (v2.5.5)! I discovered the problem after a simple rsync command failed and I ran the command again with the -vv flags (to get a more verbose output). Sure enough, it was running with RSH and not SSH! Even the man page said RSH was the default! (On the upside, the suddenly broken PHP script I had written to help with deploying web projects, and which utilizes rsync over SSH, is not really broken after all.)
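A quick way to catch this kind of mix-up is to list every rsync on the PATH in search order and check the version of the one that actually runs. The sketch below assumes a POSIX-style shell; list_on_path is a hypothetical helper written for this example.

```shell
list_on_path() {
    # Print every executable named "$1" on PATH, in the order the shell searches.
    old_ifs=$IFS
    IFS=:
    for dir in $PATH; do
        if [ -x "${dir:-.}/$1" ]; then
            printf '%s\n' "${dir:-.}/$1"
        fi
    done
    IFS=$old_ifs
}

# Every copy of rsync that could be picked up, first match wins.
list_on_path rsync

# The copy the shell will actually run, and its version banner.
command -v rsync
rsync --version | head -n 1

# To avoid depending on the built-in default, the remote shell can be
# named explicitly (host and paths here are placeholders):
#   rsync -av -e ssh /local/path/ user@example.com:/remote/path/
```

Setting the RSYNC_RSH environment variable to ssh is another way to pin the remote shell without editing each command.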

Using rsync to Mirror two CVS Repositories

I have two personal Linux servers, named Mercury (located in Lowell, MA) and Pluto (located in Cambridge, MA). Monday through Friday I stay in my Cambridge apartment to be close to work and on the weekends I go back to Lowell.

I've been storing all of my projects, both work and personal, in a CVS repository on Mercury. A few weeks ago, however, there was a power outage in Lowell during the middle of the week and Mercury didn't turn back on (probably because I don't have the "PWRON After PWR-Fail" BIOS option set to Former-STS, if it even has that option). So, since the computer wasn't on, I wasn't able to commit or sync any of the projects I was working on. This would normally not be a problem, but I have several staging scripts set up on Mercury which I use frequently to test my work -- so basically I was dead in the water.

After this incident, I realized I needed to mirror my CVS repository to prevent anything like that from happening again. This mirror would not only allow me to access the same CVS repository in the event that I was unable to reach one of my servers, but it would also act as a backup in case I somehow lost all the data on one of the servers.

After a little research using Google, I found this site which basically explains the -a option for rsync:

By far the most useful option is -a (--archive). This acts like the corresponding option to cp; rsync will:

* recurse subdirectories (-r);
* recreate soft links (-l);
* preserve permissions (-p);
* preserve timestamps (-t);
* attempt to copy devices (if invoked as root) (-D);
* preserve group information (-g) (and userid, if invoked as root) (-o).

Using that info, I ran the following command from Mercury (l.rd82.net is the DNS address I have mapped to its public IP, and c.rd82.net is mapped to Pluto's public IP):

rsync -a /home/cvs [email protected]:/home

That's it! After waiting a few minutes (it took several minutes the first time) my entire CVS repository was copied to Pluto, my Cambridge Linux server. Of course, before I ran that command I had to create the CVS repository on Pluto first by running cvs -d /home/cvs init. After the rsync command completed, I added a CVS repository for c.rd82.net in Eclipse and confirmed that all my projects were there.
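One detail worth seeing in action is how the source path's trailing slash changes where the files land. The command above has no trailing slash on /home/cvs, so the cvs directory itself is recreated under /home. Here is a local sketch with made-up scratch directories (no remote host involved) that shows both forms, plus the long spelling of -a:

```shell
# Scratch layout standing in for /home/cvs (hypothetical names).
work=$(mktemp -d)
mkdir -p "$work/cvs/CVSROOT" "$work/mirror_a" "$work/mirror_b"
touch "$work/cvs/CVSROOT/modules"

# No trailing slash on the source: the directory itself is copied,
# giving mirror_a/cvs/CVSROOT/modules (same shape as /home/cvs -> /home).
rsync -a "$work/cvs" "$work/mirror_a"

# Trailing slash on the source: only its contents are copied,
# giving mirror_b/CVSROOT/modules.
rsync -a "$work/cvs/" "$work/mirror_b"

# -a is shorthand for -rlptgoD, so this repeats the first command.
rsync -rlptgoD "$work/cvs" "$work/mirror_a"

# Show where the files ended up.
find "$work/mirror_a" "$work/mirror_b" -type f
```

Note that -a on its own never removes anything from the destination. For a strict mirror, where files deleted on the source also disappear from the copy, rsync's --delete option is the usual addition, and it is worth trying with -n first.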

The only thing left to do is to set up a cron job to run the command every night. Of course, I'll need to set up SSH keys so the rsync command can run without user input, but that's easy.
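A rough sketch of what that could look like follows. The key path, the "user" login, and the 03:00 schedule are all assumptions made for illustration; only /home/cvs, /home, and c.rd82.net come from the setup described above. The script just builds and prints the crontab line, and the one-time key setup is left as comments.

```shell
# Hypothetical values: the key path and the "user" login are placeholders.
key="$HOME/.ssh/cvs_mirror"
dest="user@c.rd82.net:/home"

# One-time setup, run by hand (commented out so nothing here prompts):
#   ssh-keygen -t ed25519 -N '' -f "$key"       # key with an empty passphrase
#   ssh-copy-id -i "$key.pub" user@c.rd82.net   # install the public key on Pluto

# Crontab fields: minute hour day-of-month month day-of-week command.
# This entry runs at 03:00 every night.
cron_line="0 3 * * * rsync -a -e 'ssh -i $key' /home/cvs $dest"
printf '%s\n' "$cron_line"

# To install it alongside any existing entries:
#   (crontab -l 2>/dev/null; printf '%s\n' "$cron_line") | crontab -
```

A key with an empty passphrase is what lets cron run unattended, so it makes sense to use a dedicated key for this job rather than a personal one.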