schnarff has asked for the wisdom of the Perl Monks concerning the following question:

I'm attempting to write a command-line utility that will allow users to search & replace blocks of text over multiple files. The first step in this, obviously, is getting my list of matching files.

Users are directed to input a line such as "replacer -i *.html" to tell the program what base of files to start with. However, when I use getopts() to grab the *.html, it only gives me the first item. After placing single quotes around my command-line argument, I get the literal argument.

This, I'm afraid, is not working well with my method of extracting the proper files out of a directory listing, as I get a regex error when I do a $x =~ /$opt_i/.

What's the best way of extracting the list of files that match a simple expression such as *.html? I'm sure there's some good trick to this, as people must have to use this all the time.

Thanks,
Alex Kirk

Re: Getting files matching pattern (i.e. *.html)
by Zaxo (Archbishop) on May 13, 2002 at 04:40 UTC

    See glob or its cousin the diamond operator <> in perlop.
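
    As a minimal sketch of what glob does with a quoted pattern: matching_files is a made-up helper name, and the pattern is assumed to reach the program as a single string (quoted on the command line). bsd_glob from the core File::Glob module is used rather than plain glob so that a pattern containing spaces is treated as one pattern.

```perl
use strict;
use warnings;
use File::Glob qw(bsd_glob);

# Hypothetical helper: expand one shell-style pattern into a list of paths.
sub matching_files {
    my ($pattern) = @_;
    return bsd_glob($pattern);
}

# Example: print every .html file in the current directory.
print "$_\n" for matching_files('*.html');
```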

    After Compline,
    Zaxo

Re: Getting files matching pattern (i.e. *.html)
by tadman (Prior) on May 13, 2002 at 04:44 UTC
    You are correct that you have to quote the argument to pass the literal file specification "*.html" to your program. Alternatively, let the shell do the expansion for you: drop the '-i' and invoke your program as "replacer *.html", so that the files arrive as an explicit list.

    You can always expand your -i parameter:
    my @opt_i = glob($opt_i);
    This is not recursive, though. For that, you might have to use something like File::Find which means converting your shell-style glob into a regex, or for a quick and dirty hack, just use the output of find.
    my @opt_i = map { chomp; $_ } `find $opt_i`;
    That works as well. Note that it is risky, because you are passing tainted input to the shell. This can be dangerous if the program runs with privileges that the user shouldn't have, such as via a Web page, or as a "suid" script.
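
    The File::Find route can be sketched roughly as follows, without handing user input to the shell. Both helper names (glob_to_regex, find_matching) are invented for illustration, and the translation only understands * and ?; character classes and braces are not handled.

```perl
use strict;
use warnings;
use File::Find;

# Hypothetical helper: translate a simple shell glob (only * and ? are
# handled) into an anchored regex. Every other character is quoted, so
# the "." in "*.html" matches a literal dot.
sub glob_to_regex {
    my ($glob) = @_;
    my $re = join '', map {
          $_ eq '*' ? '[^/]*'
        : $_ eq '?' ? '[^/]'
        :             quotemeta($_)
    } split //, $glob;
    return qr/\A$re\z/;
}

# Hypothetical helper: walk $dir recursively and return every plain
# file whose base name matches the glob.
sub find_matching {
    my ($dir, $glob) = @_;
    my $re = glob_to_regex($glob);
    my @found;
    find(sub {
        # Inside the callback, $_ is the base name and
        # $File::Find::name is the full path.
        push @found, $File::Find::name if -f $_ && $_ =~ $re;
    }, $dir);
    @found = sort { $a cmp $b } @found;
    return @found;
}
```

    With these in place, find_matching('.', $opt_i) would return the matching files under the current directory. The quoting step is also why the translated pattern avoids the regex error that a raw "*.html" produces when used as /$opt_i/.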

    As an alternative, you could just use Perl to do your dirty work for you.
    % perl -pi -e 's/foo/bar/' `find . -name '*.html'`
    This simple substitution method could be turned into a shell script to reduce user error.
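
    The same in-place edit can also be done from inside a Perl program by setting $^I and @ARGV locally. This is only a sketch: replace_in_files is a hypothetical name, the search text is treated as a literal string rather than a regex, and the ".bak" suffix for backups is an arbitrary choice.

```perl
use strict;
use warnings;

# Hypothetical helper: replace every literal occurrence of $from with
# $to in the given files, keeping a ".bak" copy of each original.
sub replace_in_files {
    my ($from, $to, @files) = @_;
    return unless @files;          # with an empty @ARGV, <> would read STDIN

    # Same effect as "perl -pi.bak": <> walks the files in @ARGV and
    # print writes to the file currently being rewritten.
    local ($^I, @ARGV) = ('.bak', @files);
    while (my $line = <>) {
        $line =~ s/\Q$from\E/$to/g;
        print $line;
    }
    return;
}
```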
Re: Getting files matching pattern (i.e. *.html)
by Juerd (Abbot) on May 13, 2002 at 08:35 UTC

    Users are directed to input a line such as "replacer -i *.html" to tell the program what base of files to start with. However, when I use getopts() to grab the *.html, it only gives me the first item.

    This is because your shell interpolates. Try this one to see what happens:

    perl -MData::Dumper -e'print Dumper(\@ARGV)' *
    It is not done by Perl (might be on non-unices, I don't know about that), as you can see when you use echo(1):
    echo *
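
    To see how this affects getopts, here is a small sketch that feeds it the two argument lists a program could receive. parse_args is a made-up helper, and the file names stand in for whatever the shell would have expanded *.html to.

```perl
use strict;
use warnings;
use Getopt::Std;

# Hypothetical helper: parse an argument list the way "replacer" might.
# Returns the value given to -i, followed by whatever is left over.
sub parse_args {
    my @args = @_;
    local @ARGV = @args;           # getopts works on @ARGV
    my %opt;
    getopts('i:', \%opt) or die "usage: replacer -i PATTERN\n";
    return ($opt{i}, @ARGV);
}

# Unquoted: the shell has already expanded the pattern, so -i takes
# only the first name and the rest are left in @ARGV.
my ($first, @rest) = parse_args('-i', 'a.html', 'b.html', 'c.html');
print "-i got: $first; left over: @rest\n";

# Quoted: the program receives the pattern itself.
my ($pattern) = parse_args('-i', '*.html');
print "-i got: $pattern\n";
```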

    - Yes, I reinvent wheels.
    - Spam: Visit eurotraQ.
    

      as a point of reference, for those who might not be able to test on various platforms... solaris shell interpolates, as does cygwin. windows does not. expect difficulties in porting if you use code like this.
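
      One possible way to smooth over that difference is to expand, inside the program, any argument that still looks like a pattern. This is a sketch only: expand_args is an invented name, the metacharacter check is deliberately crude, and it has not been tested on Windows here.

```perl
use strict;
use warnings;
use File::Glob qw(bsd_glob);

# Hypothetical helper: expand any argument that still contains glob
# metacharacters and pass everything else through unchanged.
sub expand_args {
    return map { /[*?\[]/ ? bsd_glob($_) : $_ } @_;
}
```

      Calling @ARGV = expand_args(@ARGV) early in the program would then give the same file list whether or not the shell did the expansion.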

      ~Particle *accelerates*

Re: Getting files matching pattern (i.e. *.html)
by dsheroh (Monsignor) on May 13, 2002 at 15:50 UTC
    My version of this wheel:
    #!/usr/bin/perl -w
    use strict;

    my $cmd = shift;

    while (my $filename = shift) {
        # Perform operation to new file. Exit on error.
        # TODO: On error, report command that caused problem and the resulting
        # error message before dying.
        die if `$cmd $filename 2>&1 >$filename.new`;

        # If the output is different than the input, replace the old version with
        # the new one. If nothing was changed, discard the new version and leave
        # the old one untouched.
        if (`diff $filename.new $filename`) {
            rename "$filename.new", $filename;
        } else {
            unlink "$filename.new";
        }
    }
    I call it 'doall' as in doall "sed s/oldtext/newtext/g" *.html . A bit more flexible than what you appear to be looking for, but, if you hardcode $cmd to your sed command instead of reading it from the command line...
      Wow! I'm overwhelmed by the positive response here. I'm going to be sorting through these methods tonight and figuring out which one is the best. I'm sure one of them will work nicely. :-)

      Thanks for all of your help.

      Alex Kirk