in reply to Re: "CDATA Parsing and XML"
in thread "CDATA Parsing and XML"

First of all, a ++ is in order because you showed another approach to the problem.

I guess your example might not work as expected, because the regexp is greedy: if the XML contains more than one CDATA section, they will not be matched separately. I made slight adjustments as shown below...

#!/usr/bin/perl
use strict;
use warnings;

my $cdata = join('', <DATA>);

while ($cdata =~ s/<!\[CDATA\[(.*?)\]\]>/$1/ms) {
    print "cdata = $1\n";
}

__DATA__
<?xml version="1.0" encoding="UTF-8" ?>
<![CDATA[This is the first cdata]]>
<![CDATA[This is the second cdata]]>
<some/>

So that it outputs...

cdata = This is the first cdata
cdata = This is the second cdata
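
To make the greedy vs. non-greedy difference concrete, here is a minimal sketch. The sample string and the variable names ($xml, $greedy, $lazy) are made up for the demonstration and are not taken from the original example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample input with two CDATA sections on one line.
my $xml = '<![CDATA[first]]> <![CDATA[second]]>';

# Greedy: .* runs on to the LAST "]]>" in the string.
my ($greedy) = $xml =~ /<!\[CDATA\[(.*)\]\]>/s;

# Non-greedy: .*? stops at the FIRST "]]>".
my ($lazy) = $xml =~ /<!\[CDATA\[(.*?)\]\]>/s;

print "greedy = $greedy\n";   # greedy = first]]> <![CDATA[second
print "lazy   = $lazy\n";     # lazy   = first
```

The greedy capture swallows the closing marker of the first section and the opening marker of the second, which is why the quantifier needs the trailing question mark.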

Also, some dot star hatred is probably in order, but I can't think of a better regexp right now :) .
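
For what it's worth, one possible way to avoid dot star is to spell out what CDATA content may contain: any run of characters in which a "]" never starts the sequence "]]>". The following is only a sketch under that assumption; the names $CDATA_RE and extract_cdata are hypothetical, and it collects the sections with m//g instead of modifying the string.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical pattern: matches one CDATA section without using dot star.
my $CDATA_RE = qr{
    <!\[CDATA\[            # opening marker
    (                      # capture the content:
        [^\]]*             #   a run of characters other than a closing bracket
        (?: \] (?!\]>)     #   a closing bracket that does not start the end marker
            [^\]]*         #   then another run of non-bracket characters
        )*
    )
    \]\]>                  # closing marker
}x;

# Hypothetical helper: returns the content of every CDATA section, in order.
sub extract_cdata {
    my ($xml) = @_;
    my @sections;
    while ($xml =~ /$CDATA_RE/g) {
        push @sections, $1;
    }
    return @sections;
}

my $sample = join "\n",
    '<?xml version="1.0" encoding="UTF-8" ?>',
    '<![CDATA[This is the first cdata]]>',
    '<![CDATA[This is the second cdata]]>',
    '<some/>';

print "cdata = $_\n" for extract_cdata($sample);
```

Since the negated character class also matches newlines, no /s modifier is needed, and content that ends in a single "]" is still captured whole.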

Best regards

-lem, but some call me fokat