First of all, a ++ is in order because you've shown another approach to the problem.
I guess your example might not work as expected, because the regexp is greedy, so if the XML contains more than one CDATA section, only the first will be seen. I made slight adjustments, as shown below...
#!/usr/bin/perl
use strict;
use warnings;
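# Slurp the whole document into a single string
my $cdata = join('', <DATA>);
# Non-greedy (.*?) stops at the first ]]>; each pass strips one CDATA wrapper
while ($cdata =~ s/<!\[CDATA\[(.*?)\]\]>/$1/ms)
{
    print "cdata = $1\n";
}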
__DATA__
<?xml version="1.0" encoding="UTF-8" ?>
<![CDATA[This is the first cdata]]>
<![CDATA[This is the second cdata]]>
<some/>
So that it outputs...
cdata = This is the first cdata
cdata = This is the second cdata
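If the original string should stay untouched, here is a minimal sketch of a variant that uses a global match in list context instead of the substitution. The sub name extract_cdata and the sample document are made up for illustration; the pattern is the same non-greedy one as above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns the body of every CDATA section in $xml
# without modifying $xml.
sub extract_cdata {
    my ($xml) = @_;
    # /g in list context returns every capture; /s lets . match newlines
    return $xml =~ /<!\[CDATA\[(.*?)\]\]>/gs;
}

# Sample input, equivalent to the __DATA__ section above
my $doc = <<'XML';
<?xml version="1.0" encoding="UTF-8" ?>
<![CDATA[This is the first cdata]]>
<![CDATA[This is the second cdata]]>
<some/>
XML

print "cdata = $_\n" for extract_cdata($doc);
```

Since nothing is substituted, the loop cannot be confused by text that a replacement left behind, and $doc can still be used afterwards.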
Also, hating dot star is probably in order here, but I can't think of a better regexp right now :)
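For what it's worth, one possible way to drop the dot star is to spell out what a CDATA body may contain: any run of characters other than ']', plus any ']' that does not begin the terminator ']]>'. This is only a sketch; the names $cdata_re and $sample are made up for illustration, and I have not tried it on anything beyond small strings like the one below.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sketch of a pattern without dot star (hypothetical name $cdata_re)
my $cdata_re = qr{
    <!\[CDATA\[            # opening marker
    (                      # capture the body
        [^\]]*             # anything except a closing bracket
        (?: \] (?!\]>)     # a bracket that does not begin the terminator
            [^\]]*
        )*
    )
    \]\]>                  # closing marker
}x;

# Made-up sample: the first body contains brackets, the second a newline
my $sample = "<x><![CDATA[a]]b]]><![CDATA[second\nline]]></x>";

my @bodies = $sample =~ /$cdata_re/g;
print "cdata = $_\n" for @bodies;
```

The negated character class matches newlines by itself, so no /s is needed, and the body cannot run past the first ]]> no matter how the rest of the pattern backtracks.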
Best regards
-lem, but some call me fokat