in reply to Special Variables in Regular Expression

The ?? at the end of your regex looks a little suspicious. A single question mark means "match the string <?query?> zero or one times"; a second, unescaped question mark makes that quantifier lazy. Do you really want the second question mark unescaped there?

It would have been more helpful if you had provided an example of the $data_in string. Is this what you had in mind? ...

use strict;
use warnings;

my $data_in = '<container><a>bbb</a><b>cccc</b></container><?query?>';
my $data_out;
while ($data_in =~ /<container><a>(.+?)<\/a><b>(.+?)<\/b><\/container>(<\?query\?>)?/ig) {
    print "1=$1\n";
    print "2=$2\n";
    print "3=$3\n";
    $data_out = "$2, $1$3 ";
}

__END__
1=bbb
2=cccc
3=<?query?>

Of course there is probably a much simpler way of achieving this.
I agree with Fletch that you probably would be better off using a CPAN parser than your own regex solution.

Update: Another general note on regexes. You can use alternate delimiters in order to avoid excessive escaping of forward slashes. For example, you can replace // with m{}:

while ($data_in =~ m{<container><a>(.+?)</a><b>(.+?)</b></container>(<\?query\?>)?}ig) {

Re^2: Special Variables in Regular Expression
by Anonymous Monk on Jul 25, 2008 at 15:58 UTC
    The lazy quantifier on the group at the very end of the pattern, the (...)??, means that the final capture is never needed for the overall regex to match. The engine therefore always settles for zero repetitions, and the corresponding capture variable ($3) is never defined.
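
    To see the difference side by side, here is a minimal sketch that runs the same pattern with a greedy (...)? and a lazy (...)?? at the end. The sample string is assumed from the parent post, since the original $data_in was never shown, and third_capture is a hypothetical helper written only for this illustration.

```perl
use strict;
use warnings;

# Sample input assumed from the parent post; the real $data_in was not given.
my $data_in = '<container><a>bbb</a><b>cccc</b></container><?query?>';

# Hypothetical helper: match $data_in against a pattern and describe
# what ended up in the third capture group.
sub third_capture {
    my ($re) = @_;
    return $data_in =~ $re ? (defined $3 ? $3 : 'undef') : 'no match';
}

# Same pattern twice; only the quantifier on the last group differs.
my $greedy = qr{<container><a>(.+?)</a><b>(.+?)</b></container>(<\?query\?>)?}i;
my $lazy   = qr{<container><a>(.+?)</a><b>(.+?)</b></container>(<\?query\?>)??}i;

print "greedy: ", third_capture($greedy), "\n";   # greedy: <?query?>
print "lazy:   ", third_capture($lazy),   "\n";   # lazy:   undef
```

    Both patterns match, but only the greedy one fills $3. If something had to match after the optional group, the lazy form could still be forced to capture; it is the position at the very end of the pattern that leaves it empty.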