The problem is correctly identified in Re^3: Unexpected results from a regex replacement (++). You are running the regexp on the whole of $outdata every time you add a line to it.

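The reason the above regexp works is that it doesn't look for every "<cfmail"; it looks only for a "<cfmail" that isn't already preceded by the "<!--- " comment opener (and, likewise, for a "</cfmail>" that isn't already followed by " --->"). Consider the following: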

    my $outdata_v1 = "";
    my $outdata_v2 = "";
    my $data_offset = tell DATA;
    my $line_count = 1;

    print "First Regexp solution\n";
    print "-"x20, "\n";
    while ( <DATA> ) {
        $outdata_v1 .= $_;
        print "outdata for read of line $line_count before:\n$outdata_v1\n";
        $outdata_v1 =~ s{<cfmail}{<!--- <cfmail}g;
        $outdata_v1 =~ s{</cfmail>}{</cfmail> --->}g;
        print "outdata for read of line $line_count after:\n$outdata_v1\n";
        $line_count++;
    }

    #-- reset it all, start again with the better regexp.
    seek( DATA, $data_offset, 0);
    $line_count = 1;
    print "Second Regexp solution\n";
    print "-"x20, "\n";
    while ( <DATA> ){
        $outdata_v2 .= $_;
        print "outdata for read of line $line_count before:\n$outdata_v2\n";
        $outdata_v2 =~ s{(?<!<!--- )<cfmail}{<!--- <cfmail}g;
        $outdata_v2 =~ s{</cfmail>(?! --->)}{</cfmail> --->}g;
        print "outdata for read of line $line_count after:\n$outdata_v2\n";
        $line_count++;
    }
    __DATA__
    <cfmail to="#to_address#">
    </cfmail>
    <cfmail to="#to_address_2#">

The output is:

    First Regexp solution
    --------------------
    outdata for read of line 1 before:
    <cfmail to="#to_address#">

    outdata for read of line 1 after:
    <!--- <cfmail to="#to_address#">

    outdata for read of line 2 before:
    <!--- <cfmail to="#to_address#">
    </cfmail>

    outdata for read of line 2 after:
    <!--- <!--- <cfmail to="#to_address#">
    </cfmail> --->

    outdata for read of line 3 before:
    <!--- <!--- <cfmail to="#to_address#">
    </cfmail> --->
    <cfmail to="#to_address_2#">

    outdata for read of line 3 after:
    <!--- <!--- <!--- <cfmail to="#to_address#">
    </cfmail> ---> --->
    <!--- <cfmail to="#to_address_2#">

    Second Regexp solution
    --------------------
    outdata for read of line 1 before:
    <cfmail to="#to_address#">

    outdata for read of line 1 after:
    <!--- <cfmail to="#to_address#">

    outdata for read of line 2 before:
    <!--- <cfmail to="#to_address#">
    </cfmail>

    outdata for read of line 2 after:
    <!--- <cfmail to="#to_address#">
    </cfmail> --->

    outdata for read of line 3 before:
    <!--- <cfmail to="#to_address#">
    </cfmail> --->
    <cfmail to="#to_address_2#">

    outdata for read of line 3 after:
    <!--- <cfmail to="#to_address#">
    </cfmail> --->
    <!--- <cfmail to="#to_address_2#">

You can see that your original regexp (as Eimi Metamorphoumai correctly pointed out) is re-applied to everything accumulated in $outdata each time a new line is appended, so every tag that was already commented out gets another comment marker on each pass. The second regexp solution does not add a new comment every time, since its negative lookbehind and lookahead only match cfmail tags that are not already wrapped in a comment.
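
If you would rather remove the cause than guard against it, one option is to run the substitution on each line before appending it, so $outdata itself is never re-scanned. The following is only a minimal sketch under that assumption: the helper name comment_cfmail and the @input array (standing in for the lines of your file) are made up for illustration and are not from your code.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): comment out the cfmail tags
# found in a single line. It works on a copy, so the caller's data is untouched.
sub comment_cfmail {
    my ($line) = @_;
    $line =~ s{<cfmail}{<!--- <cfmail}g;
    $line =~ s{</cfmail>}{</cfmail> --->}g;
    return $line;
}

# Stand-in for the lines read from the input file (assumption).
my @input = (
    qq{<cfmail to="#to_address#">\n},
    qq{</cfmail>\n},
    qq{<cfmail to="#to_address_2#">\n},
);

my $outdata = "";

# Each line is transformed exactly once and then appended, so tags that
# are already commented out are never seen by the regexps again.
$outdata .= comment_cfmail($_) for @input;

print $outdata;
```

With this shape the original, simpler regexps are sufficient. The lookaround version is still the safer choice if the same text might be processed more than once.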


In reply to Re^3: Unexpected results from a regex replacement by jimbojones
in thread Unexpected results from a regex replacement by yacoubean
