Just a few nit-picks to add on top of (well, below) the other replies:
- When the user happens to enter a command-line arg that is not a usable directory name, nothing happens -- not even an error message to the user saying something like "you were supposed to supply a directory path". A little error checking might help.
- If you decide to stick with symlinks (rather than merlyn's suggestion of hard links), you could use the perl-internal "symlink" function, instead of shelling out to "ln -s" -- that can save some time, if you end up making a lot of links. (Update: of course, you first have to use "unlink" on the file being replaced, but I'd expect these two calls together are still cheaper than a whole backtick subprocess.)
- You are doing too many stat calls. You could get by just stat'ing each file once, re-using the stat structure as needed, and keeping info you want to use later; if the loop in the "files()" sub goes like this:
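for $f (grep { ! /^\.{1,2}$/ } readdir(DIR)) {
    # one lstat per entry; grep would finish the whole list before the
    # loop body runs, so the -l test has to sit here for "_" to be current
    next if -l "$path/$f";
    if ( -f _ ) {
        push @files, "$path/$f " . -s _;
    }
    elsif ( -d _ ) {
        push @files, @{&files( "$path/$f" )};
    }
}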
then you have just one stat per file, and the map block in the caller would just be "split" instead of yet another round of stat calls. (perldoc -f stat explains about using the underscore to refer to "the existing stat structure")
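For what it's worth, here is a rough sketch of how the three points above might look in code. The sub names (check_dir_arg, replace_with_symlink, split_record) are made up for illustration and are not from the original script. Also note one assumption: the record is split on its last space with a regex rather than a bare "split", so that paths containing spaces survive.

```perl
use strict;
use warnings;

# Hypothetical helper: complain and quit unless the argument is a usable directory.
sub check_dir_arg {
    my ($dir) = @_;
    die "Usage: $0 <directory>\n"
        unless defined $dir and length $dir;
    die "$0: '$dir' is not a directory\n"
        unless -d $dir;
    return $dir;
}

# Hypothetical helper: replace $dup with a symlink pointing at $target,
# using the built-in unlink and symlink instead of a backtick "ln -s".
sub replace_with_symlink {
    my ($target, $dup) = @_;
    unlink $dup           or die "Cannot unlink '$dup': $!\n";
    symlink $target, $dup or die "Cannot link '$dup' to '$target': $!\n";
    return 1;
}

# Hypothetical helper: take apart a "path size" record as built in the
# loop above. Assumption: the size is the run of digits after the LAST space.
sub split_record {
    my ($rec) = @_;
    my ($path, $size) = $rec =~ /^(.*) (\d+)$/s;
    return ($path, $size);
}
```

Usage would be something like "my $top = check_dir_arg( shift @ARGV );" at the start of the script, and "my ($path, $size) = split_record($_);" inside the caller's map block. One caveat on replace_with_symlink: if the symlink call fails after the unlink succeeded, the duplicate is already gone, so a more careful version might create the link under a temporary name and then rename it over the duplicate.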
Final update: these really are very minor issues -- they could be "optimizations" in some situations, but probably won't make a noticeable difference in how fast this app goes, given that most of the run time will be spent comparing file contents. Still, if it's easier to write code that runs a little faster, why not write it that way?