I recently needed to set up automated synchronization of a large rsync module (CTAN, yes, 'T' not 'P') where I only wanted certain sub-directories to be synchronized. While rsync supports include/exclude filters, the scheme it uses is a little counter-intuitive. Direct paths like /some/dir/I/want/ don't work on their own, because rsync never sees the last directory, want, unless all of its parent directories are included as well. This forces you to include every parent directory and then exclude all other files and directories inside each of them, e.g.:

    + /some/
    + /some/dir/
    + /some/dir/I/
    + /some/dir/I/want/***
    - /some/dir/I/*
    - /some/dir/*
    - /some/*
    - /*

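The trailing /*** here means "that directory plus everything in it, including sub-directories". Older rsync versions (before 2.6.7) do not understand it and need two rules instead: one ending in / for the directory itself and one ending in /** for its contents.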
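If you are stuck with an rsync too old for /***, an existing filter list can be converted mechanically. The following is only a minimal sketch in shell, assuming awk is available; the function name expand_triple_star is my own invention and not part of rsync:

```shell
#!/bin/sh
# Hypothetical helper: rewrite every rule ending in "/***" as the two
# older-style rules "dir/" and "dir/**". All other lines pass through as-is.
expand_triple_star() {
    awk '{
        if ($0 ~ /\/\*\*\*$/) {
            # Drop the 4-character suffix "/***" to get e.g. "+ /a/b"
            base = substr($0, 1, length($0) - 4)
            print base "/"      # the directory itself
            print base "/**"    # everything below it
        } else {
            print
        }
    }'
}
```

Fed the rule "+ /a/b/***" on standard input, it prints "+ /a/b/" followed by "+ /a/b/**"; a rule such as "- /a/*" is left alone.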

I wrote a Perl script for this which builds a hash-of-hash(-of-hashes)* to hold the directory structure. (Actually Data::Diver would be well suited for this, but I decided to write my own loop because it was simple enough and that module is not part of the current installation on the target server.) The hash structure is then processed recursively: for each directory, the directory itself is included first, then its wanted sub-directories, and finally everything else in it is excluded. I sort the keys so that the resulting filter list is sorted and easy to proofread. The script also prints a warning in the two cases where a directory is requested both fully and partially (in either order). In both cases the directory is included fully.

Here is the code:

    #!/usr/bin/perl
    ################################################################################
    # Copyright (c) 2011 Martin Scharrer <martin@scharrer-online.de>
    # This is open source software under the GPL v3 or later.
    ################################################################################
    use strict;
    use warnings;

    my $include = {};

    sub add_include {
        INCLUDE_PATH:
        foreach my $path (@_) {
            chomp $path;
            my @dirs = split (/\//, $path);
            shift @dirs if @dirs and $dirs[0] eq '';
            my $lastdir = pop @dirs;
            my $ref = $include;
            foreach my $dir (@dirs) {
                my $dirref = $ref->{$dir};
                if (defined $dirref) {
                    if (ref $dirref ne 'HASH') {
                        warn "Warning: directory '$dir' of '$path' already fully included!\n";
                        next INCLUDE_PATH;
                    }
                    $ref = $dirref;
                }
                else {
                    my $newdir = {};
                    $ref = $ref->{$dir} = $newdir;
                }
            }
            if (exists $ref->{$lastdir} && $ref->{$lastdir} ne '1') {
                warn "Warning: '$path' now fully included!\n";
            }
            $ref->{$lastdir} = '1';
        }
    }

    sub print_include {
        my $pdir = shift;
        my $h    = shift;

        print "+ $pdir/\n";
        foreach my $dir (sort keys %$h) {
            my $value = $h->{$dir};
            if (ref $value ne 'HASH') {
                print "+ $pdir/$dir/***\n";
                ## For older rsync versions use the following instead:
                #print "+ $pdir/$dir/\n";
                #print "+ $pdir/$dir/**\n";
            }
            else {
                print_include ("$pdir/$dir", $value);
            }
        }
        print "- $pdir/*\n";
    }

    if (@ARGV) {
        add_include (@ARGV);
    }
    else {
        add_include <STDIN>;
    }
    print_include ('', $include);

    __END__

Usage Example
    rsyncfilter.pl /a/b /a/c /a/d/a /a/d/b

gives:

    + /
    + /a/
    + /a/b/***
    + /a/c/***
    + /a/d/
    + /a/d/a/***
    + /a/d/b/***
    - /a/d/*
    - /a/*
    - /*
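To actually use such a list, one option is to write it to a file and hand that file to rsync as a merge filter. This is only a sketch under assumptions: sync_filtered is a helper name I made up, and the host rsync.example.org, the file name ctan.filter and the target /srv/ctan/ in the example call are placeholders. The --archive and --filter='merge FILE' options are standard rsync.

```shell
#!/bin/sh
# Hypothetical helper: run rsync with a filter list read from a file.
sync_filtered() {
    filter_file=$1   # generated rule list, one "+ ..." or "- ..." rule per line
    src=$2           # transfer root; a leading "/" in a rule is anchored here
    dest=$3          # local target directory
    # The merge rule makes rsync read the rules from the file in order;
    # the first matching rule decides whether a path is transferred.
    rsync --archive --filter="merge $filter_file" "$src" "$dest"
}

# Example call (placeholder names, not executed here):
#   rsyncfilter.pl /a/b /a/c /a/d/a /a/d/b > ctan.filter
#   sync_filtered ctan.filter rsync://rsync.example.org/CTAN/ /srv/ctan/
```

Because the rules are anchored at the transfer root, /a/b in the list refers to a/b directly below the module. It is worth running rsync by hand with --dry-run added before trusting a new filter list.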
