in reply to XML::Twig - literal nodes
This is an XML FAQ: if you want to include unstructured text that can contain anything, including < and & characters, you can use CDATA sections:
    <doc>
      <p>regular text here, &lt; needs to be escaped as &amp;lt;</p>
      <literal><![CDATA[here you can use < and & and whatever you want]]></literal>
      <literal><![CDATA[this is how you include the CDATA end mark ]]]]><![CDATA[> by splitting it into 2 different CDATA sections]]></literal>
    </doc>
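If you generate the XML yourself, the splitting trick can be automated. Here is a minimal sketch in plain Perl; the helper name cdata_wrap is made up for illustration and is not part of XML::Twig:

```perl
use strict;
use warnings;

# hypothetical helper: wrap arbitrary text in one or more CDATA sections
sub cdata_wrap {
    my ($text) = @_;
    # every "]]>" in the text is split: the first section is closed
    # right after "]]" and a new one is opened just before ">"
    $text =~ s/\]\]>/]]]]><![CDATA[>/g;
    return '<![CDATA[' . $text . ']]>';
}

print cdata_wrap('if (a[b[0]]> 1 && c < d)'), "\n";
```

Text without the end mark comes back as a single CDATA section, so the helper should be safe to apply to any string.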
Note that a CDATA section has no effect on the element structure. In fact it is just a convenience that saves you from escaping every single instance of < and & in text (CDATA sections cannot be used in attribute values, where " or ' still have to be escaped).
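To make the "just a convenience" point concrete, here is a small sketch in plain Perl that writes the same text both ways. The helper escape_pcdata is hypothetical, written for this example only; a parser should report the same character data for both elements:

```perl
use strict;
use warnings;

# hypothetical helper: escape text so it can be used as regular PCDATA
sub escape_pcdata {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;    # must come first, or the other entities get re-escaped
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;     # only strictly needed after "]]", but harmless elsewhere
    return $text;
}

my $raw = 'if (a < b && c > d) { ... }';

# same content, once as a CDATA section and once as escaped PCDATA
print '<literal><![CDATA[' . $raw . ']]></literal>', "\n";
print '<literal>' . escape_pcdata($raw) . '</literal>', "\n";
```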
BTW, you will probably want to generate HTML from a CDATA section (which would be your next question ;--), and I don't think browsers support CDATA sections. It is pretty easy: all you have to do is turn them into regular PCDATA and print them. All special characters will then be escaped:
    #!/bin/perl -w
    use strict;
    use XML::Twig;

    my $t= XML::Twig->new( );
    $t->parse( \*DATA);

    # turn every CDATA section into a regular PCDATA node
    foreach my $cdata ( $t->descendants( '#CDATA')) {
        $cdata->set_pcdata( $cdata->cdata);
        $cdata->set_gi( '#PCDATA');
    }

    # special characters are escaped when the twig is printed
    $t->print;

    __DATA__
    <doc>
      <p>regular text here, &lt; needs to be escaped as &amp;lt;</p>
      <literal><![CDATA[here you can use < and & and whatever you want]]></literal>
    </doc>
updated 2005-05-04: a ]]> was missing from the last CDATA. Thanks to ambrus for pointing this out.
Re: Re: XML::Twig - literal nodes
by John M. Dlugosz (Monsignor) on Nov 08, 2001 at 21:12 UTC
by mirod (Canon) on Nov 08, 2001 at 22:10 UTC
by John M. Dlugosz (Monsignor) on Nov 09, 2001 at 00:18 UTC