- Don't parse (arbitrary) XML with a regex; you're asking for a world of hurt.
- If you're going to parse a relatively fixed XML input (a one-off, or a very static source that's not going to change how it creates its XML output), you most likely want non-greedy quantifiers, as has already been recommended elsewhere in this thread. It's extremely easy to write a greedy pattern that "looks" like it'll work but is brittle.
$ cat foo.pl
use strict;
use warnings;
my $string = qq{<Prompt><![CDATA[Please ... question ). ]]> <![CDATA[[ FAIL. ]]></Prompt>};
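# Greedy: .* runs on to the LAST "]]>", swallowing both CDATA sections.
my( $fail ) = $string =~ m{ \[CDATA\[ (.*) ]]> }xms;
print $fail, "\n";
# Non-greedy: .*? stops at the FIRST "]]", so only the first section is captured.
my( $better ) = $string =~ m{ \[ CDATA \[ (.*?) \]\] }x;
print $better, "\n";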
$ perl foo.pl
Please ... question ). ]]> <![CDATA[[ FAIL.
Please ... question ).
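For the first point, here's a rough sketch of what letting a real parser do the work might look like. It assumes the CPAN module XML::LibXML is installed (it isn't core), and `cdata_sections` is a made-up helper name for illustration, not something from this thread.

```perl
use strict;
use warnings;
use XML::LibXML qw(:libxml);    # CPAN module, not core; assumed to be installed

# Hypothetical helper: return the text of every CDATA section that is a
# direct child of the document's root element.
sub cdata_sections {
    my ($xml) = @_;
    my $doc = XML::LibXML->load_xml( string => $xml );
    return map  { $_->nodeValue }
           grep { $_->nodeType == XML_CDATA_SECTION_NODE }
           $doc->documentElement->childNodes;
}

my $string = q{<Prompt><![CDATA[Please ... question ). ]]> <![CDATA[[ FAIL. ]]></Prompt>};

# Brackets make leading/trailing whitespace in each section visible.
print "[$_]\n" for cdata_sections($string);
```

The parser decides where each CDATA section starts and stops, so there's no greedy/non-greedy question to get wrong, and malformed input makes `load_xml` die rather than quietly matching the wrong thing.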
Addendum: I mean it's not like this topic gets discussed every other week around here or something . . .