The problem is correctly identified in Re^3: Unexpected results from a regex replacement (++). You are running the regexp over the whole of $outdata every time you add a line to it, so text that has already been processed gets processed again.

The above regexp works because it doesn't just look for "<cfmail"; it looks for a "<cfmail" that isn't already preceded by the "<!--- " comment opener (and for a "</cfmail>" that isn't already followed by " --->"). Consider the following:

    my $outdata_v1  = "";
    my $outdata_v2  = "";
    my $data_offset = tell DATA;
    my $line_count  = 1;

    print "First Regexp solution\n";
    print "-"x20, "\n";
    while ( <DATA> ) {
        $outdata_v1 .= $_;
        print "outdata for read of line $line_count before:\n$outdata_v1\n";
        $outdata_v1 =~ s{<cfmail}{<!--- <cfmail}g;
        $outdata_v1 =~ s{</cfmail>}{</cfmail> --->}g;
        print "outdata for read of line $line_count after:\n$outdata_v1\n";
        $line_count++;
    }

    #-- reset it all, start again with the better regexp.
    seek( DATA, $data_offset, 0);
    $line_count = 1;

    print "Second Regexp solution\n";
    print "-"x20, "\n";
    while ( <DATA> ){
        $outdata_v2 .= $_;
        print "outdata for read of line $line_count before:\n$outdata_v2\n";
        $outdata_v2 =~ s{(?<!<!--- )<cfmail}{<!--- <cfmail}g;
        $outdata_v2 =~ s{</cfmail>(?! --->)}{</cfmail> --->}g;
        print "outdata for read of line $line_count after:\n$outdata_v2\n";
        $line_count++;
    }

    __DATA__
    <cfmail to="#to_address#">
    </cfmail>
    <cfmail to="#to_address_2#">

The output is:

    First Regexp solution
    --------------------
    outdata for read of line 1 before:
    <cfmail to="#to_address#">

    outdata for read of line 1 after:
    <!--- <cfmail to="#to_address#">

    outdata for read of line 2 before:
    <!--- <cfmail to="#to_address#">
    </cfmail>

    outdata for read of line 2 after:
    <!--- <!--- <cfmail to="#to_address#">
    </cfmail> --->

    outdata for read of line 3 before:
    <!--- <!--- <cfmail to="#to_address#">
    </cfmail> --->
    <cfmail to="#to_address_2#">

    outdata for read of line 3 after:
    <!--- <!--- <!--- <cfmail to="#to_address#">
    </cfmail> ---> --->
    <!--- <cfmail to="#to_address_2#">

    Second Regexp solution
    --------------------
    outdata for read of line 1 before:
    <cfmail to="#to_address#">

    outdata for read of line 1 after:
    <!--- <cfmail to="#to_address#">

    outdata for read of line 2 before:
    <!--- <cfmail to="#to_address#">
    </cfmail>

    outdata for read of line 2 after:
    <!--- <cfmail to="#to_address#">
    </cfmail> --->

    outdata for read of line 3 before:
    <!--- <cfmail to="#to_address#">
    </cfmail> --->
    <cfmail to="#to_address_2#">

    outdata for read of line 3 after:
    <!--- <cfmail to="#to_address#">
    </cfmail> --->
    <!--- <cfmail to="#to_address_2#">

You can see that your original regexp (as Eimi Metamorphoumai correctly pointed out) is re-applied to everything accumulated so far each time a new line is read, so every cfmail tag already in $outdata picks up another comment marker on each pass. The second regexp solution does not add a new marker every time, because it only matches cfmail tags that are not already wrapped in a comment.
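
Another way to avoid the repeated markers, if you would rather keep the simpler regexp, is to apply the substitution only to the line you just read and then append the result, so that nothing already in $outdata is ever touched again. The following is just a minimal sketch of that idea: the helper name comment_out_cfmail and the @lines array (standing in for your input file) are my own inventions for illustration, not anything from your script.

```perl
use strict;
use warnings;

# Hypothetical helper (name is an assumption): comment out the cfmail
# tags in one chunk of text using the simple, lookaround-free regexps.
sub comment_out_cfmail {
    my ($text) = @_;
    $text =~ s{<cfmail}{<!--- <cfmail}g;
    $text =~ s{</cfmail>}{</cfmail> --->}g;
    return $text;
}

# Stand-in for the lines that would be read from the input file.
my @lines = (
    qq{<cfmail to="#to_address#">\n},
    qq{</cfmail>\n},
    qq{<cfmail to="#to_address_2#">\n},
);

my $outdata = "";
for my $line (@lines) {
    # Transform only the new line, then append it; the text already
    # in $outdata is never run through the regexps a second time.
    $outdata .= comment_out_cfmail($line);
}

print $outdata;
```

Each tag goes through the substitution exactly once here, so the plain regexp is enough. The lookaround version is still the safer choice if the same text might be processed more than once, for example if the script is re-run on its own output.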