in reply to Simple Perl file rename

For the more general case, there's Larry's filename fixer, from the Perl Cookbook (and I'm sure I've seen it in print elsewhere, too):
#!/usr/bin/perl
# -w switch is off bc HERE docs cause erroneous messages to be displayed under Cygwin
# From the Perl Cookbook, Ch. 9.9
# rename - Larry's filename fixer
# The heredoc delimiter is single-quoted so that $_, $1 and $/ in the
# usage text are printed literally instead of being interpolated.
$help = <<'EOF';
Usage: rename expr [files]

This script's first argument is Perl code that alters the filename
(stored in $_) to reflect how you want the file renamed. It can do
this because it uses an eval to do the hard work. It also skips rename
calls when the filename is untouched. This lets you simply use
wildcards like rename EXPR * instead of making long lists of filenames.

Here are five examples of calling the rename program from your shell:

    % rename 's/\.orig$//' *.orig
    % rename 'tr/A-Z/a-z/ unless /^Make/' *
    % rename '$_ .= ".bad"' *.f
    % rename 'print "$_: "; s/foo/bar/ if <STDIN> =~ /^y/i' *
    % find /tmp -name '*~' -print | rename 's/^(.+)~$/.#$1/'

The first shell command removes a trailing ".orig" from each filename.

The second converts uppercase to lowercase. Because a translation is
used rather than the lc function, this conversion won't be
locale-aware. To fix that, you'd have to write:

    % rename 'use locale; $_ = lc($_) unless /^Make/' *

The third appends ".bad" to each Fortran file ending in ".f", something
a lot of us have wanted to do for a long time.

The fourth prompts the user for the change. Each file's name is printed
to standard output and a response is read from standard input. If the
user types something starting with a "y" or "Y", any "foo" in the
filename is changed to "bar".

The fifth uses find to locate files in /tmp that end with a tilde. It
renames these so that instead of ending with a tilde, they start with
a dot and a pound sign. In effect, this switches between two common
conventions for backup files.
EOF

$op = shift or die $help;
chomp(@ARGV = <STDIN>) unless @ARGV;
for (@ARGV) {
    $was = $_;
    eval $op;
    die $@ if $@;
    rename($was,$_) unless $was eq $_;
}
(Notice that about 90% of that is comments and usage.)

Using that script, you can say:

rename 's/ABC //' *.doc
to strip 'ABC ' from the name of every .doc file that contains it.

Re^2: Simple Perl file rename
by Anonymous Monk on Jul 28, 2008 at 17:46 UTC
    Hiya,

    I am trying to understand Larry's filename fixer. I saw it first in the cookbook and am now looking here to find help. I am new to Perl, but I think (I hope) I understand everything the script does. The only thing I don't get is how all the matching filenames end up in @ARGV. So if I call the rename script in a directory with three txt files like so:

    rename 's/foo/bar/' *.txt

    @ARGV would be an Array with four entries like this right?

    s/foo/bar/\n
    file1.txt\n
    file2.txt\n
    file3.txt\n

    After the shift that removes the 's/foo/bar/', I am left with an @ARGV that contains just the file names. The script then loops through all of them and works its magic. So far so good.

    What I do not understand is how the script determines which files to put into @ARGV. At which point is "*.txt" being evaluated? Or is it the shell that tells the script which files in the directory match the pattern *.txt?

    This is probably a daft question but any help is much appreciated, as I said I am very new to this.


    Ta Arian

      In any UNIX-like shell, when you specify a filename glob (i.e. a filename with wildcards) on the shell commandline, it is the shell, and not the program you are calling, which expands the glob.

      So yes, when you call

      rename.pl *.txt
      the shell expands the *.txt to all matching files, and the Perl script finds the already expanded args in its @ARGV.
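
      To see the difference for yourself, here is a minimal sketch, assuming a UNIX-like system with /bin/sh. It starts the same child perl twice: once through the shell and once with the shell bypassed. The helper names count_args_via_shell and count_args_direct and the scratch file names are made up for this demonstration.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper: run a child perl through the shell.
# A single command string containing shell metacharacters is handed to
# /bin/sh, which expands the glob before the child perl starts.
sub count_args_via_shell {
    my ($dir) = @_;
    my $cmd = qq{"$^X" -e 'print scalar \@ARGV' "$dir"/*.txt};
    open my $fh, '-|', $cmd or die "cannot run child: $!";
    my $count = <$fh>;
    close $fh;
    return $count;
}

# Hypothetical helper: run the same child with the list form of open.
# No shell is involved, so the child receives the pattern untouched.
sub count_args_direct {
    my ($dir) = @_;
    open my $fh, '-|', $^X, '-e', 'print scalar @ARGV', "$dir/*.txt"
        or die "cannot run child: $!";
    my $count = <$fh>;
    close $fh;
    return $count;
}

# Scratch directory with three empty .txt files.
my $dir = tempdir(CLEANUP => 1);
for my $name (qw(file1.txt file2.txt file3.txt)) {
    open my $out, '>', "$dir/$name" or die "cannot create $dir/$name: $!";
    close $out;
}

print "via shell: ", count_args_via_shell($dir), "\n";
print "no shell:  ", count_args_direct($dir), "\n";
```

      Under those assumptions the first call should report 3 arguments and the second 1, because only the shell turns the pattern into file names.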

      If you want the script to do the glob expansion, you'd have to enclose the argument in single quotes, i.e. call it like this:

      somescript.pl '*.txt'

      Then the Perl script finds exactly one arg in @ARGV, namely *.txt and you would have to find some way to do the expansion.
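
      One way to do that expansion inside the script is the core File::Glob module. The following is only a sketch: expand_args is a hypothetical helper name, and the check for wildcard characters is a simplification.

```perl
use strict;
use warnings;
use File::Glob qw(bsd_glob);

# Hypothetical helper: expand arguments that still contain wildcards and
# pass everything else through unchanged. bsd_glob is used rather than
# the built-in glob because it does not split a pattern on whitespace.
sub expand_args {
    my @args = @_;
    my @files;
    for my $arg (@args) {
        if ($arg =~ /[*?\[]/) {
            push @files, bsd_glob($arg);    # zero or more matching names
        }
        else {
            push @files, $arg;              # already a plain file name
        }
    }
    return @files;
}

my @files = expand_args(@ARGV);
print "$_\n" for @files;
```

      Called as somescript.pl '*.txt', this prints the matching file names, one per line, and prints nothing if no file matches.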

      With Larry's script, however, I typically feed it ALL the files by matching *; it will only act on those files which match the regexp given as the first argument anyway, and all others are skipped. Of course that requires some care in constructing the regexp.
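
      A dry run makes the skipping visible. This sketch copies the loop from Larry's script but collects the old and new names instead of calling rename; plan_renames and the sample file names are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: apply the expression to every name the way
# Larry's loop does, but return [old, new] pairs instead of renaming.
sub plan_renames {
    my ($op, @names) = @_;
    my @plan;
    for (@names) {
        my $was = $_;
        eval $op;          # the expression edits $_ in place
        die $@ if $@;
        push @plan, [$was, $_] unless $was eq $_;    # untouched names are skipped
    }
    return @plan;
}

my @plan = plan_renames('s/foo/bar/', qw(foo.txt notes.txt food.doc));
print "$_->[0] -> $_->[1]\n" for @plan;
# prints:
# foo.txt -> bar.txt
# food.doc -> bard.doc
```

      notes.txt is left alone, and food.doc shows why the regexp needs care: it is changed as well, which may not be what you intended.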

      @ARGV would be an Array with four entries like this right?
      Yes, except for the newlines (\n). Prove this to yourself with a test script:

      use strict;
      use warnings;

      print "These are the ", scalar @ARGV, " arguments:\n";
      my $i = 0;
      for (@ARGV) {
          print "ARGV[$i] = $_\n";
          $i++;
      }

      prints:

      These are the 4 arguments:
      ARGV[0] = s/foo/bar/
      ARGV[1] = file1.txt
      ARGV[2] = file2.txt
      ARGV[3] = file3.txt

      Alternately, you can use Data::Dumper:

      use strict;
      use warnings;
      use Data::Dumper;

      print Dumper(\@ARGV);

      prints:

      $VAR1 = [
                's/foo/bar/',
                'file1.txt',
                'file2.txt',
                'file3.txt'
              ];

      What I do not understand is how the script determines which files to put into @ARGV. At which point is "*.txt" being evaluated? Or is it the shell that tells the script which files in the directory match the pattern *.txt?
      The shift built-in function removes element 0 from the @ARGV array, shifting the remaining elements [1:3] down to new positions [0:2]. Again, here is a simple test script:

      use strict;
      use warnings;

      my $op = shift;    # same as: my $op = shift @ARGV;
      print "These are the ", scalar @ARGV, " arguments:\n";
      my $i = 0;
      for (@ARGV) {
          print "ARGV[$i] = $_\n";
          $i++;
      }

      prints:

      These are the 3 arguments:
      ARGV[0] = file1.txt
      ARGV[1] = file2.txt
      ARGV[2] = file3.txt

        Hi toolic,

        thank you for your quick reply. I understand what shift does in this script. What I don't understand is how all txt files in the directory end up in @ARGV.

        When I copy and paste your code, what I get is:

        These are the 2 arguments:
        ARGV[0] = s/\.txt\.mex//
        ARGV[1] = *.txt

        And this also makes sense to me, because how would the script be able to know which files in the directory match '*.txt'? Doesn't there have to be another process that matches '*.txt' against all files in the directory first?

        Arian