I think that would be worth testing. When you're talking about hundreds of thousands of 'mv $a $b' in a shell vs. the equivalent number of 'rename $a, $b' in a perl script, the time (and overall cpu resources) saved by the latter could be well worth the time it takes to write the perl script -- especially if the process is going to be repeated at regular intervals.
I have a handy "shell-loop" tool written in perl (posted here: shloop -- execute shell command on a list) which makes it easy to test this, using the standard "time" utility. I happened to have a set of 23 directories ("20*") holding a total of 3782 files, so I created a second set of 23 empty directories ("to*"), and tried the following:
This is on a standard intel desktop box running FreeBSD; I expect the results would be comparable (or more dramatic) on other unix flavors.# rename files from 20* -> to*: $ find 20* -type f | time shloop -e rename -s ^20:to 10.62 real 0.39 user 0.27 sys # now "mv" them back: $ find to* -type f | time shloop -e mv -s ^to:20 18.99 real 0.96 user 6.93 sys
The first case uses the perl-internal "rename" function call to relocate each of the 3782 files to a new directory; in the latter case, shloop opens a shell process ( open(SH, "|/bin/sh")) and prints 3782 successive "mv" commands to that shell. An interesting point to make here is that the first case also had the extra overhead of "growing" the new directories as files were added to them for the first time, whereas the target directories for the "mv" run were already big enough -- but the "mv" run still took almost twice as long (probably because of the overhead involved in creating and destroying all those sub-processes).
This is a bit of a "nonsense" example, of course. Presumably, a shell command like "mv foo/* bar/" (or 23 of them, to handle the above example) would be really quick, because lots of files are moved in a single (compiled) process. But I wrote shloop to handle cases where each individual file needed a name-change as well (e.g. rename "foo/*.bar" to "fub/*.bub"). For this sort of case, a pure shell loop approach has do something like o=`echo $i | sed s/foo(.*)bar/fub$1bub/`; mv $i $o on every iteration, which would take much longer than the "mv" example shown above.
So the moral is: don't underestimate the load imposed by standard shell utilities -- they don't actually scale all that well when invoked in large quantities.
In reply to Re^2: BASH vs Perl performance
by graff
in thread BASH vs Perl performance
by jcoxen
| For: | Use: | ||
| & | & | ||
| < | < | ||
| > | > | ||
| [ | [ | ||
| ] | ] |