ishaqali has asked for the wisdom of the Perl Monks concerning the following question:

I have a bunch of files (about 7000 of them). I have to traverse the directory structure, read each of the files, and identify a section of the text, for example the text between '<!- Begin Replacable Section-->' and '<!- End Replacable Section-->'. The number of lines between them varies. I have to replace the section of each file between the two comments with different text. I am relatively new to Perl. I can traverse the directory. I can construct the replacement section. I can open the file, but I do not know how to replace the section between two comment lines. I would greatly appreciate any help on this.

Replies are listed 'Best First'.
Re: Modifying A Bunch of Files by replacing specific lines
by robartes (Priest) on Feb 20, 2003 at 08:23 UTC
    Here's an example as a first shot at this:
    #!/usr/local/bin/perl
    # repl.pl
    use strict;

    open INPUT, "<input.txt" or die "Blerch: $!\n";
    open OUTPUT, ">output.txt" or die "Hcrelb: $!\n";

    my $replace_string="Hi there.\n";
    my $flag=0;
    my %markers=("BEGIN" => "<!- Begin Replacable Section-->",
                 "END"   => "<!- End Replacable Section-->",
                );

    while (<INPUT>) {
      if ($flag) {
        next unless /$markers{END}/;
        $flag=0;
        print OUTPUT $replace_string;
      }
      print OUTPUT;
      $flag=1 if /$markers{BEGIN}/;
    }
    __END__

    > cat input.txt
    But now I know that twenty centuries of stony sleep
    were vexed to nightmare by a rocking cradle.
    And what rough beast, it's hour come round at last,
    slouches towards Betlehem to be born?
    <!- Begin Replacable Section-->
    A poem
    <!- End Replacable Section-->
    W. B. Yeats

    > repl.pl ; cat output.txt
    But now I know that twenty centuries of stony sleep
    were vexed to nightmare by a rocking cradle.
    And what rough beast, it's hour come round at last,
    slouches towards Betlehem to be born?
    <!- Begin Replacable Section-->
    Hi there.
    <!- End Replacable Section-->
    W. B. Yeats

    CU
    Robartes-

      You'll probably want to do this:

      next unless /\Q$markers{END}\E/;

      and the same for the other pattern match. This ensures that your marker string is matched as a literal string, not interpreted as a pattern itself. That matters especially if the markers contain characters that have special meaning within regexes.
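      To illustrate the difference, here is a minimal sketch. The marker text and both helper subs (has_marker, has_marker_unquoted) are made up for this example, and the marker deliberately contains parentheses and a dot so that the unquoted version goes wrong:

```perl
use strict;
use warnings;

# Hypothetical marker containing regex metacharacters: "(", ")" and ".".
my $marker = '<!-- Begin (v1.0) -->';

# Returns 1 if $line contains $marker as a literal string, else 0.
sub has_marker {
    my ($line, $marker) = @_;
    return $line =~ /\Q$marker\E/ ? 1 : 0;
}

# Same test without \Q...\E: the marker is compiled as a pattern.
sub has_marker_unquoted {
    my ($line, $marker) = @_;
    return $line =~ /$marker/ ? 1 : 0;
}

my $line = "text <!-- Begin (v1.0) --> more text";

# Literal match: succeeds.
print has_marker($line, $marker), "\n";

# Unquoted: "(v1.0)" becomes a capture group, so the pattern looks for
# "Begin v1.0" without the parentheses and does not match this line.
print has_marker_unquoted($line, $marker), "\n";
```

      The markers in the original question happen to contain no metacharacters, so quoting changes nothing there, but it costs nothing and protects you if the markers change later.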

      dave hj~

Re: Modifying A Bunch of Files by replacing specific lines
by BrowserUk (Patriarch) on Feb 20, 2003 at 10:05 UTC

    This assumes that your sections start and end on separate lines. If that's not the case, a slightly different version is required. See perlrun for the details of the switches.

    On a *nix system you don't need the first line of the BEGIN{} block.

    #! perl -snli.org
    use strict;
    use vars qw[$B $E $R $s];

    BEGIN{
        @ARGV= map{glob}@ARGV;  # Not necc. on *nix
        $s=0
    }

    /<!- $B -->/ .. (/<!- $E -->/ && ($s = 1))
        ? ($s and print($R), $s = 0 )
        : print

    sample usage

Re: Modifying A Bunch of Files by replacing specific lines
by dash2 (Hermit) on Feb 20, 2003 at 10:49 UTC
    If you need to know how to do the replacement on the string, look at perlre. You could do s/pattern/string/s; but if so, as your pattern match will span multiple lines, you will need the whole file as a single string, which can use up memory on big files.
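    As a rough sketch of the whole-file approach (not tested against your real files): the sub names replace_section and replace_in_file are made up for this example, and it assumes each file fits comfortably in memory. Because the match runs over the whole content, the markers do not have to sit on lines of their own; text before the start marker and after the end marker is left alone.

```perl
use strict;
use warnings;

# Replace everything between $start and $end with $new_text.
# The markers themselves are kept. Operates on the whole content as one
# string, so the markers may appear in the middle of a line.
sub replace_section {
    my ($content, $start, $end, $new_text) = @_;
    # /s lets "." match newlines; .*? stops at the first end marker;
    # /g handles more than one section per file.
    $content =~ s/(\Q$start\E).*?(\Q$end\E)/$1$new_text$2/gs;
    return $content;
}

# Slurp a file, rewrite it in memory, and write it back.
# No backup copy is made here; returns 1 if the file was changed.
sub replace_in_file {
    my ($file, $start, $end, $new_text) = @_;
    open my $in, '<', $file or die "Cannot read $file: $!";
    my $content = do { local $/; <$in> };   # undef $/ reads the whole file
    close $in;

    my $updated = replace_section($content, $start, $end, $new_text);
    return 0 if $updated eq $content;        # nothing to change

    open my $out, '>', $file or die "Cannot write $file: $!";
    print {$out} $updated;
    close $out or die "Cannot close $file: $!";
    return 1;
}
```

    You would call replace_in_file once per file from whatever directory traversal you already have. Try it on a copy of the tree first, since it overwrites the files in place.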

    The alternative is the suggestion by the other poster - run through each line and use a flag to tell you whether you are in a replaceable section.

    The more awkward problem is how to replace the text in the file. Again, it is perfectly valid to just open an INPUT filehandle and an OUTPUT filehandle, then to rename the new file to overwrite the old one. However, that is quite complex. A neater idea might be to write a shell script and use some of perl's command line switches. (See perlrun for details.) Something like:

    find . -name '*.ext' -print | xargs perl -pi.bk -e '
        if ($flag && /end_match/) { $flag = 0; }
        if ($flag) {
            # Emit the replacement once, then drop the remaining old lines.
            $_ = $printed++ ? q{} : qq{My text to substitute\n};
        }
        if (/start_match/) { $flag++; $printed = 0; }
    '
    That's untested but you get the general idea. Find finds and prints the relevant files. xargs takes the filename and passes it to the end of the perl command. perl reads the filename, prints a copy to filename.bk (you can remove the .bk once you are sure your pattern works), and alters the original file.

    No good if you're on windows though...

    dave hj~

Re: Modifying A Bunch of Files by replacing specific lines
by zby (Vicar) on Feb 20, 2003 at 09:45 UTC
Re: Modifying A Bunch of Files by replacing specific lines
by runrig (Abbot) on Feb 20, 2003 at 17:56 UTC
    This assumes your markers are the only text on the line (or at least the only text you care about). You'll have to adjust and do some extra 's///' if that's not the case:
    use File::Temp qw(tempfile);
    use Cwd;

    sub replace_file {
        my ($file, $start, $end, $replace_text) = @_;
        local (*ARGV, $_, $.);
        @ARGV = $file;
        my ($fh, $tmp_file) = tempfile(DIR=>cwd);
        my $replaced;
        while (<>) {
            my $replace = /\Q$start\E/../\Q$end\E/;
            print $fh $_ and next unless $replace;
            $replaced=1;
            print $fh $replace_text if $replace == 1;
        }
        close $fh;
        return unless $replaced;
        rename $tmp_file, $file or warn "Error replacing $file: $!";
    }
    Update: upon closer inspection, I see my answer is similar to BrowserUk's above, though my answer also deals with actually replacing the file, not just outputting the replaced text, and should work within a File::Find solution.
      Thank you all for the great help. The markers are not the only text on the line. For example, one of the files has these lines: "<\td class="health" height="99%"> <-- #BeginEditable "category" class="category" -->MEDICAL {carriage return} BENEFITS</\td></\tr> <\tr valign="top"> <\td height="99%" class="mainbody"> " (The extra slashes are there to display the HTML code as is.) The beginning marker for this is <-- #BeginEditable "category" class="category" --> and the ending marker is <-- #EndEditable -->. That is, I have to discard everything after the beginning marker, but not the marker itself or anything before it, and I have to discard everything before the end marker, but not the end marker itself or anything after it. I think I will try figuring this out myself. If anybody can find some time to post the code, I would be grateful. Thank you for the great help.