PyrexKidd has asked for the wisdom of the Perl Monks concerning the following question:

I have a Perl script that opens a file and parses through it, replacing a search string. I need to modify the code so that it takes a list of file names and parses each file in turn, instead of taking a single file name from the command line.
#!/usr/bin/perl
use strict;
use warnings;

my($copy) = "$ARGV[0].bak";
copy($ARGV[0], $copy) or die "File cannot be copied. \n";
open(INPUT,"$copy") or die 'Cannot open file: $!\n';
open(OUTPUT,">$ARGV[0]");

my($replacePattern1) = "foo1";
my($searchPattern1) = "bar1";
my($replacePattern2) = "foo2";
my($searchPattern2) = "bar2";
my($replacePattern3) = "foo3";
my($searchPattern3) = "bar3";
my($pattern) = 0;

while (<INPUT>) {
    if ($_ =~ s/$searchPattern1/$replacePattern1/g) {
        $pattern = 1;
    }
    elsif ($_ =~ s/$searchPattern2/$replacePattern2/g) {
        $pattern = 2;
    }
    elsif ($_ =~ s/$searchPattern3/$replacePattern3/g) {
        $pattern = 3;
    }
    print OUTPUT $_;
}
close INPUT;
close OUTPUT;
unlink($copy);
How do I modify this code?

Replies are listed 'Best First'.
Re: Find and Replace from File List
by toolic (Bishop) on May 28, 2010 at 01:08 UTC
    One way is to read your file list into an array, then loop over each file performing the substitutions. UNTESTED:
    use strict;
    use warnings;
    use File::Copy;

    my($replacePattern1) = "foo1";
    my($searchPattern1) = "bar1";
    my($replacePattern2) = "foo2";
    my($searchPattern2) = "bar2";
    my($replacePattern3) = "foo3";
    my($searchPattern3) = "bar3";
    my $pattern = 0;

    my $list = 'list.txt';
    open my $fh, '<', $list or die "can not open $list: $!";
    my @files = <$fh>;
    close $fh;
    chomp @files;

    for my $file (@files) {
        my $copy = "$file.bak";
        copy($file, $copy) or die "File $file cannot be copied: $!\n";
        open my $fhi, '<', $copy or die "can not open $copy: $!";
        open my $fho, '>', $file or die "can not open $file: $!";
        while (<$fhi>) {
            if (s/$searchPattern1/$replacePattern1/g) {
                $pattern = 1;
            }
            elsif (s/$searchPattern2/$replacePattern2/g) {
                $pattern = 2;
            }
            elsif (s/$searchPattern3/$replacePattern3/g) {
                $pattern = 3;
            }
            print $fho $_;
        }
        close $fho;
        close $fhi;
        unlink $copy;
    }
Re: Find and Replace from File List
by dineed (Scribe) on May 28, 2010 at 06:24 UTC

    I find the handling of the files curious:

    1. The original input file ($ARGV[0]) is copied to *.bak
    2. The *.bak file is then opened read-only (input) and the original file is opened in write mode (to be overwritten)
    3. The *.bak file (input) is deleted after the original ($ARGV[0]) file has been modified.

    At the end of the program, you no longer have the original input data. This seems dangerously odd. Is this the correct and intended behavior?

      Actually, the odd thing here is the use of .bak to save the original while working directly on the original file, instead of using original.new as a scratch pad for the new version!

      One consideration is to leave the system, to the extent possible, in a "known state" even if the whole O/S or the program bombs, for example when some function called by the modification program fails and the process is killed.

      So in general, I would recommend making a copy of the original file and modifying that "new copy" until we are happy with it. Leave the original alone! The last thing to do is to close the ".new" copy, unlink the original, then rename the ".new" version to the original name. The idea is to make the "time window" where the machine is in an unknown state (between "old" and "new") as small as possible.
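
The approach described above (work on a ".new" copy, leave the original alone, swap the names last) might look roughly like this. It is a minimal sketch, not code from the thread: the helper name replace_in_file is hypothetical, and it relies on rename replacing the existing file in one step, which holds on POSIX systems but should be checked on other platforms.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): rewrite $file through a
# "$file.new" scratch copy, leaving the original untouched until the end.
sub replace_in_file {
    my ($file, $search, $replace) = @_;
    my $new = "$file.new";

    open my $in,  '<', $file or die "cannot open $file: $!";
    open my $out, '>', $new  or die "cannot open $new: $!";
    while (my $line = <$in>) {
        $line =~ s/$search/$replace/g;
        print {$out} $line;
    }
    close $in;
    close $out or die "cannot close $new: $!";

    # Assumption: rename replaces the existing original in a single step.
    # Where it refuses to overwrite, unlink the original first, as the
    # post above describes.
    rename $new, $file or die "cannot rename $new to $file: $!";
    return;
}

# Process every file named on the command line.
replace_in_file($_, 'bar1', 'foo1') for @ARGV;
```

If the script dies before the rename, the original file is still intact and only the ".new" scratch file needs cleaning up.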

      The "yeah, but"s with this are legion. On some O/S's, I can unlink a file and put a new file with the same name in the directory, and folks that have, say, old.dll open continue to use the old version, because once a file is open the name is irrelevant: the program is using a file handle, not a name in a directory. New programs that start will get new.dll. A good Windows installation program will register what is called a "run-once" program. This program runs if the OS reboots during the install, to clean things up. The installer removes this "run-once" entry once everything it wanted to do has succeeded (so that it doesn't run on the next reboot).

      I've already talked too long about this, but there are some definite "yeah, but"s to consider when modifying files. I would move "old-previously-modified" files to some area where they are just history, of no real consequence after some period of time, and then periodically delete them.

      Hi, which perldoc is that in? The perldocs I've looked at don't say anything about deleting the original file. Thanks.
      the hardest line to type correctly is: stty erase ^H
Re: Find and Replace from File List
by Marshall (Canon) on May 28, 2010 at 03:25 UTC
    I am not sure what you are trying to accomplish. But I do find the following code rather strange. $pattern actually makes no difference here because the value of $pattern is never used in a statement related to output. I figure you could delete all lines involving $pattern.
    my($pattern) = 0;
    while (<INPUT>) {
        if ($_ =~ s/$searchPattern1/$replacePattern1/g) {
            $pattern = 1;
        }
        elsif ($_ =~ s/$searchPattern2/$replacePattern2/g) {
            $pattern = 2;
        }
        elsif ($_ =~ s/$searchPattern3/$replacePattern3/g) {
            $pattern = 3;
        }
        print OUTPUT $_;
    }
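
With $pattern gone, the if/elsif chain could be reduced to a small table of search/replace pairs. The following is only a sketch under that assumption: the names @rules and apply_first_rule are hypothetical, and the loop stops at the first rule that matches so that it behaves like the original elsif chain.

```perl
use strict;
use warnings;

# Search/replace pairs from the original script, in the order they are tried.
my @rules = (
    [ 'bar1', 'foo1' ],
    [ 'bar2', 'foo2' ],
    [ 'bar3', 'foo3' ],
);

# Hypothetical helper: apply the first rule that matches, mirroring the
# if/elsif chain (later rules are skipped once one substitution succeeds).
sub apply_first_rule {
    my ($line) = @_;
    for my $rule (@rules) {
        my ($search, $replace) = @$rule;
        last if $line =~ s/$search/$replace/g;
    }
    return $line;
}

print apply_first_rule("bar1 bar2\n");    # prints "foo1 bar2"
```

Inside the file loop this would replace the whole chain with a single print of apply_first_rule($_) to the output handle. If every pattern should be applied to every line, drop the "last" and run all the substitutions.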

      $pattern was used as part of another script.

      I did remove it as it's unnecessary for this code.

      Thanks for the advice.

Re: Find and Replace from File List
by aquarium (Curate) on May 28, 2010 at 03:32 UTC
    Depending on your requirements, you might even get away with using the -i switch to perl and reading from <>, which will magically supply the contents of any files specified as arguments to your script. When you print without naming a filehandle (normally STDOUT), the output goes to the new version of the file. The relevant documentation is "perldoc perlrun".
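
As a rough illustration of the in-place idea, the same effect as -i can be had inside a script by setting $^I and filling @ARGV from the file list. This is a sketch, not tested against the poster's data: the helper name edit_in_place is hypothetical, the ".bak" suffix is an arbitrary choice, and the list file is assumed to hold one file name per line.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: edit every file named in @$files in place.
sub edit_in_place {
    my ($files, $search, $replace) = @_;
    my @list = @$files;
    return unless @list;          # with an empty @ARGV, <> would read STDIN

    local $^I   = '.bak';         # same effect as -i.bak: keep a backup copy
    local @ARGV = @list;          # <> iterates over the files in @ARGV
    while (my $line = <>) {
        $line =~ s/$search/$replace/g;
        print $line;              # goes to the file being edited
    }
    return;
}

# Usage sketch: perl script.pl list.txt
if (@ARGV) {
    open my $fh, '<', $ARGV[0] or die "cannot open $ARGV[0]: $!";
    chomp(my @files = <$fh>);
    close $fh;
    edit_in_place(\@files, 'bar1', 'foo1');
}
```

Unlike the original script, this leaves a ".bak" copy of each file behind, so the unmodified data is still available afterwards.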
Re: Find and Replace from File List
by baxy77bax (Deacon) on May 28, 2010 at 10:19 UTC
    An alternative, if you prefer to use as few modules as possible:
    #!/usr/bin/perl
    use strict;
    use warnings;
    use File::Copy;    # copy() comes from this core module

    foreach my $in (@ARGV){
        my($copy) = "$in.bak";
        copy($in, $copy) or die "File cannot be copied: $!\n";
        open(INPUT,"$copy") or die "Cannot open file: $!\n";
        open(OUTPUT,">$in") or die "Cannot open file: $!\n";
        my($replacePattern1) = "foo1";
        ...
    }
    Or, if the files are in a directory, use the readdir function to collect all the files from the directory and loop over them as in the previous example:
    opendir(DIR, $ARGV[0]) || die "i died for no reason ;)\n";
    foreach (readdir DIR){
        # repeat like above
        # watch out for defining the right path to the dir, and when looping
        # re-type the full path to your dir, like:
        # open (FILE, ">", "$ARGV[0]/$_") || die "i died for no reason ;)\n"
    }
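
The path warning above matters because readdir returns bare names, including "." and "..". One way to handle that, sketched here with a hypothetical helper name files_in_dir, is to build the full paths once and keep only regular files:

```perl
use strict;
use warnings;

# Hypothetical helper: return the full paths of the plain files in $dir.
sub files_in_dir {
    my ($dir) = @_;
    opendir my $dh, $dir or die "cannot open directory $dir: $!";
    # readdir returns bare names (including '.' and '..'), so prepend $dir
    # and keep only regular files.
    my @files = sort grep { -f $_ } map { "$dir/$_" } readdir $dh;
    closedir $dh;
    return @files;
}

# Usage sketch: perl script.pl some_directory
if (@ARGV) {
    print "$_\n" for files_in_dir($ARGV[0]);
}
```

The returned list can then be fed to the same per-file loop used in the earlier replies.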