Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I've got the source of an HTML document read into $pagecode. The code below should replace the portion between special comment tags with $content, after which the source is written back to the HTML document. When I run the script, though, nothing between the comment tags changes. Any idea why?
$pagecode =~ s/<!--BeginContent-->(.*)<!--EndContent-->/<!--BeginContent-->$content<!--EndContent-->/;
$pagecode =~ s/<!--BeginPages-->(.*)<!--EndPages-->/<!--BeginPages--><!--EndPages-->/;

Replies are listed 'Best First'.
Re: regexing my brain...
by ysth (Canon) on Nov 17, 2003 at 03:59 UTC
    Just a guess - do you need to add the s flag (e.g. s/foo/bar/s;) so that (.*) will include newline characters?
      YES!!!! thankyouthankyouthankyouthankyouthankyouthankyouthankyou!!!
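    To illustrate the s flag suggestion: by default `.` matches any character except a newline, so `.*` cannot reach across lines. Here is a minimal sketch, assuming the comment markers sit on separate lines. The sample text in `$page` and the variable names `$without_s` and `$with_s` are made up for the demonstration, not taken from the original script.

```perl
use strict;
use warnings;

# Hypothetical sample data: the two marker comments are on different lines.
my $page    = "<!--BeginContent-->\nold text\n<!--EndContent-->\n";
my $content = "new text";

# Without /s, "." does not match "\n", so the pattern never matches
# and the copy is left untouched.
(my $without_s = $page) =~ s/<!--BeginContent-->.*<!--EndContent-->/<!--BeginContent-->$content<!--EndContent-->/;

# With /s, "." also matches "\n", so the whole block is replaced.
(my $with_s = $page) =~ s/<!--BeginContent-->.*<!--EndContent-->/<!--BeginContent-->$content<!--EndContent-->/s;

print "without /s: ", ($without_s eq $page ? "unchanged" : "changed"), "\n";
print "with /s:    ", ($with_s    eq $page ? "unchanged" : "changed"), "\n";
```

    Only the copy processed with /s should come out changed, which matches the symptom described in the question.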
Re: regexing my brain...
by sgifford (Prior) on Nov 17, 2003 at 08:16 UTC
    Something else to keep in mind is that .* is greedy, so if you have:
    <!--BeginContent-->Content 1 Here<!--EndContent--> <!--BeginContent-->Content 2 Here<!--EndContent-->
    it will match all of it, not just the first one. Especially when you're doing multi-line matching, that can bite you sometimes.
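    A small sketch of the greedy behaviour, using the two-block sample above. Changing `.*` to the non-greedy `.*?` is one common way around it (my suggestion here, not something the original script does). The placeholder replacement text `X` and the names `$greedy` and `$lazy` are just for illustration.

```perl
use strict;
use warnings;

# Two separate content blocks, as in the sample above.
my $page = "<!--BeginContent-->Content 1 Here<!--EndContent-->\n"
         . "<!--BeginContent-->Content 2 Here<!--EndContent-->\n";

# Greedy: .* runs to the LAST <!--EndContent-->, swallowing both blocks.
(my $greedy = $page) =~ s/<!--BeginContent-->.*<!--EndContent-->/<!--BeginContent-->X<!--EndContent-->/s;

# Non-greedy: .*? stops at the FIRST <!--EndContent-->, so only block 1 changes.
(my $lazy = $page) =~ s/<!--BeginContent-->.*?<!--EndContent-->/<!--BeginContent-->X<!--EndContent-->/s;

print "greedy:\n$greedy\nnon-greedy:\n$lazy";
```

    With the greedy pattern the second block disappears entirely. With `.*?` it is left alone, and adding the g flag would replace each block separately.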
Re: regexing my brain...
by BUU (Prior) on Nov 17, 2003 at 04:07 UTC
Re: regexing my brain...
by Roger (Parson) on Nov 17, 2003 at 04:15 UTC
    I suspect that the text you want to replace contains '\n' characters. I wrote the following script to test your regexes:

    #!/usr/local/bin/perl -w
    use strict;

    my $pagecode;
    {
        local $/;
        $pagecode = <DATA>;
    }
    my $content = "Roger";

    print "before:\n---------\n$pagecode\n";

    # fix 1
    $pagecode =~ s/<!--BeginContent-->(?:.|\n)*<!--EndContent-->/<!--BeginContent-->$content<!--EndContent-->/;

    # fix 2
    $pagecode =~ s/<!--BeginPages-->.*<!--EndPages-->/<!--BeginPages--><!--EndPages-->/s;

    print "after:\n---------\n$pagecode\n";

    __DATA__
    <!--BeginContent-->Hello World!
    This is a comment line
    <!--EndContent-->
    <!--BeginPages-->Hello World!
    <!--EndPages-->
    Note that there are many ways to fix this problem. By the way, you can drop the capture ( .. ) in your regexp too.

    The first method - replace (.*) in your regexp with (?:.|\n)*.

    The second method - add the 's' switch at the end of the regexp.