freakingwildchild has asked for the wisdom of the Perl Monks concerning the following question:

Hey Monks ... Prayers might be too late for me, since the inevitable has happened: after coding my web application with XML::Simple, I came to a full halt. I need CDATA sections to encapsulate user-changeable HTML data in its XML configuration files. Not only CDATA but also namespace support is needed, for languages and iTunes podcasting.

Since I've incorporated XML::Simple into a lot of my code, it would be a major pain to change all of it again to support XML::DOM, XML::Twig or XPath just because of those two missing features in XML::Simple.

I've chosen XML::Twig because it looked the closest to XML::Simple in a way (bear with me here), and I had nice example code from the developer to start from. I am a little bit stuck now, though: I want a CDATA field to be "automagically" CDATA and a normal text field to be normal text, without using DTDs.

I was thinking about two solutions. One would be naming my tag outline.cdata or something like that and detecting the presence of such a tag. The second would be to create or find an "HTML detection routine" that automatically sees whether a value cannot sit between normal tags as well-formed XML without CDATA, and writes such a field as CDATA. Both ways strike me as pretty much spaghetti coding, and probably also a hog.
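
The second idea could start from a heuristic as small as the one below. It is only a sketch: `needs_cdata` is a hypothetical helper, and treating any `<` or `&` as a sign of markup is an assumption that will also flag plain text such as "a < b".

```perl
use strict;
use warnings;

# Hypothetical helper: true when a value contains characters that would have
# to be escaped in ordinary XML text, which suggests storing it as CDATA.
sub needs_cdata {
    my ($text) = @_;
    return 0 unless defined $text;
    return $text =~ /[<&]/ ? 1 : 0;
}

# Try it on a few sample values.
for my $value ('plain text', '<p>user <b>HTML</b></p>', 'AT&T') {
    printf "%-25s => %s\n", $value, needs_cdata($value) ? 'CDATA' : 'text';
}
```

A check like this costs one regex match per value, so it is unlikely to be the hog; the harder part is deciding which false positives are acceptable.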

updated node: What I want is literally quite "XML::Simple": I want XML::Twig to work with a plain nested data structure (a hashref), just like XML::Simple does, instead of manipulating the entire twig. Reading that way is easy with the simplify method, but writing is not so easy (to my knowledge?). Add to that, as the cherry on top, namespace support (any ideas?).

The reasons: less memory, faster, easier, and it's usable with my existing code, which is based on XML::Simple. I am also sure other people might want this when they need CDATA and namespace support, which XML::Twig offers, but not in that "XML::Simple" way ...

    use XML::Twig;
    use Data::Dumper;

    # let's emulate XMLin even more simple (for now)
    sub XMLTwin {
        if (-e $_[0]) {
            return XML::Twig->new->parsefile($_[0])->simplify(keyattr => []);
        }
    }

    # let's emulate XMLout even more simple than XML::Simple (for now)
    sub XMLTwout {
        my $elt;
        my $twig = XML::Twig->new(pretty_print => 'indented', empty_tags => 'html');
        my $xmltag = "opt";

        sub create_element {    # needs to be a internal subroutine only for XMLTwout...
            my $gi   = shift;
            my $data = shift;
            my $t    = XML::Twig::Elt->new($gi);
            if (ref $data) {
                while (my ($k, $v) = each(%$data)) {
                    if ($k ne "outline") {
                        create_element($k, $v)->paste(last_child => $t);
                    }
                    else {
                        $t->insert_new_elt(last_child => $k => { '#CDATA' => 1 }, $v);
                    }
                }
            }
            else {
                $t->set_text($data);
            }
            $t;
        }

        if ($_[1]) { $xmltag = $_[1] };
        $elt = create_element($xmltag => $_[0]);
        $twig->set_root($elt);
        undef($elt);    # let's clear some memory here
        print Dumper($elt);
        return $twig->sprint();
    }

    # some test code ;
    my $file    = 'test.xml';
    my $xmldata = XMLTwin($file);        # let's get it in
    my $xmlout  = XMLTwout($xmldata);    # let's get it out ...
    print $xmlout;

Now, still, my questions remain:
  1. How could I best auto-detect HTML that I do not want escaped, or otherwise declare a field as CDATA, without breaking the "simple data structure"? I thought a ${example}->{'tagname:cdata'} solution would be "best".
  2. Would this also be possible for fields that hold HTML but need multi-language support, e.g. ${main}->{'outline:cdata:en'} = "HTML CODE"? Would I then need to write my own checking routine that goes through ${main} and extracts the tag(s), the optional cdata marker and the optional language field into a separate structure?
  3. How can I make the internal create_element usable only by XMLTwout, given that it also needs to call itself?
  4. Will I ever get a life or eternal happiness? ;)
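
Questions 2 and 3 could be approached along the following lines. This is a minimal sketch, not XML::Twig's API: `parse_key` and `render` are hypothetical names, the `tag:cdata:lang` key convention is the one proposed in the questions, emitting the language as an `xml:lang` attribute is an assumption, and plain string building stands in for the XML::Twig element calls so that the sketch runs with core Perl only. The recursive helper lives in a lexical code reference, so it is visible only inside `render`.

```perl
use strict;
use warnings;

# Split a key such as 'outline:cdata:en' into tag name, CDATA flag and language.
# The 'tag[:cdata][:lang]' convention is an assumption taken from the question.
sub parse_key {
    my ($key) = @_;
    my ($tag, @flags) = split /:/, $key;
    my $cdata  = grep { $_ eq 'cdata' } @flags;    # count of 'cdata' markers
    my ($lang) = grep { $_ ne 'cdata' } @flags;    # first remaining flag, if any
    return ($tag, $cdata ? 1 : 0, $lang);
}

# Hypothetical writer: turns a nested hashref into an XML string.
sub render {
    my ($root, $data) = @_;

    # A lexical code ref is private to render(). It can still recurse because
    # $walk is declared before the sub body is assigned to it.
    my $walk;
    $walk = sub {
        my ($key, $value) = @_;
        my ($tag, $cdata, $lang) = parse_key($key);
        my $attr = defined $lang ? qq{ xml:lang="$lang"} : '';

        my $body;
        if (ref($value) eq 'HASH') {
            # Sorted keys give a stable element order
            $body = join '', map { $walk->($_, $value->{$_}) } sort keys %$value;
        }
        elsif ($cdata) {
            # ']]>' cannot appear inside CDATA, so split it across two sections
            (my $safe = $value) =~ s/\]\]>/]]]]><![CDATA[>/g;
            $body = "<![CDATA[$safe]]>";
        }
        else {
            # Ordinary text: escape the characters XML reserves
            ($body = $value) =~ s/&/&amp;/g;
            $body =~ s/</&lt;/g;
            $body =~ s/>/&gt;/g;
        }
        return "<$tag$attr>$body</$tag>";
    };

    my $xml = $walk->($root, $data);
    undef $walk;    # break the closure's reference to itself
    return $xml;
}

print render(opt => {
    title              => 'Tom & Jerry',
    'outline:cdata:en' => '<b>Hello</b>',
}), "\n";
```

With XML::Twig, the same closure could build and paste elements instead of concatenating strings; the scoping trick for the recursive helper stays the same.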

Replies are listed 'Best First'.
Re: XML::Simple functionality with XML::Twig ?
by saberworks (Curate) on Jun 15, 2006 at 15:42 UTC
    XML::Simple does namespaces quite nicely, just use the NSExpand => 1 option. Also, what makes you think it doesn't handle CDATA fields? I can't find anything in the docs about that.
      I didn't find any reference saying that XMLout supports CDATA and namespaces when writing. Reading is all OK, but writing?

      Also, I tried namespaces with 'author:itunes' as a test, but it didn't really work well. How are namespaces supported, then, in reading (and writing) through XML::Simple? $xml->{'author:itunes'}? (I already tried that and it didn't work.)

      Namespace/CDATA writing support is the main reason I am moving towards XML::Twig. If it turns out not to be needed and indenting is the only "extra" I get for my application, I'd rather stay with XML::Simple and its memory requirements than switch to XML::Twig.

        I've personally only used the namespace support with XMLin, but the docs specify that NSExpand works with XMLout as well. Did you properly set the xmlns attribute for the tag? That's the only way it will work, I guess.

        Re: CDATA, I'm not sure about writing; I actually moved to XML::LibXML for writing because it was easier.
for future reference - old code
by freakingwildchild (Scribe) on Jun 15, 2006 at 17:35 UTC
    here is the older code for future reference
    use XML::Twig;

    # reading routines ; same way as xml::simple; it puts everything in a easy to use arrayref.
    my $xmldata = XML::Twig->new->parsefile('test.xml')->simplify(keyattr => []);

    # writing routines ; same way as xml::simple;
    # write the array to xml code and of'course get the indented for free ;)
    my $twig = XML::Twig->new(pretty_print => 'indented', empty_tags => 'html');
    my $elt  = create_element(xml => $xmldata);
    undef($xmldata);
    $twig->set_root($elt);
    $twig->print();    # this output
    undef($twig);

    sub create_element {
        my $gi   = shift;
        my $data = shift;
        my $t    = XML::Twig::Elt->new($gi);
        if (ref $data) {
            while (my ($k, $v) = each(%$data)) {
                if ($k ne "outline") {
                    # if the field is called "outline" it will be automatically put to
                    # CDATA else it will return normal escaped data.
                    create_element($k, $v)->paste(last_child => $t);
                }
                else {
                    $t->insert_new_elt(last_child => $k => { '#CDATA' => 1 }, $v);
                }
            }
        }
        else {
            $t->set_text($data);
        }
        $t;
    }