rkg has asked for the wisdom of the Perl Monks concerning the following question:

Hi --

I am new to XML::Twig. A neat module indeed.

I have a question about stream vs. in-memory processing.

I am using Twig to change the form of an XML document. The DTD is complex and the document is large; here's an abstraction of the base document.

    #### HAND EDITED, NOT TESTED
    <a>
      <b name="funny words">
        <c name="foo"/>
        <c name="baz"/>
      </b>
      <b name="foods">
        <c name="apple"/>
        <c name="pear"/>
        <c name="cheese"/>
      </b>
    </a>

I want to promote each "C" into its own "B", cloning the parent and making each kid an only-child, yielding something like this

    #### HAND EDITED, NOT TESTED
    <a>
      <b name="foo">
        <c name="foo"/>
      </b>
      <b name="baz">
        <c name="baz"/>
      </b>
      <b name="apple">
        <c name="apple"/>
      </b>
      <b name="pear">
        <c name="pear"/>
      </b>
      <b name="cheese">
        <c name="cheese"/>
      </b>
    </a>

The Twig code I've written works, and is something like this

    #### HAND EDITED
    my $t = XML::Twig->new(
        twig_handlers => { b => \&b, },
        pretty_print  => 'indented',
    );
    $t->parsefile($xml_file);
    $t->flush;

    sub b {
        my ( $t, $x ) = @_;
        my @c = $x->children('c');

        # cut each c out of its parent and group them by name
        my %bs;
        foreach my $c (@c) {
            my $text = $c->att('name');
            $c->cut;
            push( @{ $bs{$text} }, $c );
        }

        # for each name, add a sibling b that copies the parent's attributes
        foreach my $text ( keys %bs ) {
            my $b = $x->insert_new_elt( 'after', 'b', { %{ $x->atts } } );
            $b->set_att( 'name' => $text );
            foreach ( @{ $bs{$text} } ) {
                $b->insert_new_elt( 'first_child', 'c', { %{ $_->atts } } );
            }
        }

        # drop the original, now empty, b
        $x->delete;
    }

My questions: Thanks!

rkg

PS When I said "hand edited" above, I meant that I took working code and working XML and simplified them for this post -- possibly a typo crept in during the simplification. But the original works and changes the original XML appropriately.

Re: XML:Twig -- changing in-mem process to stream
by mirod (Canon) on Oct 15, 2003 at 11:48 UTC

    Wahouh! You sure do things the hard way!

    Below is a version that loads only one b at a time.

    To send the XML to a file just pass a filehandle ref to flush or print: $t->flush( \*FILE).

        #!/usr/bin/perl -w
        use strict;
        use XML::Twig;

        my $t = XML::Twig->new(
            twig_handlers => { b => \&b, },
            pretty_print  => 'indented',
        );
        $t->parse( \*DATA);
        $t->flush;

        sub b {
            my ( $t, $b ) = @_;
            foreach my $c ($b->children('c')) {
                # yep, that does it: wrap a b element around the c
                $c->wrap_in( b => { name => $c->att( 'name') } )
            }
            $b->erase;   # remove the original b
            $t->flush;   # you need to flush here if you want to free the memory
        }

        __DATA__
        <a>
          <b name="funny words">
            <c name="foo"/>
            <c name="baz"/>
          </b>
          <b name="foods">
            <c name="apple"/>
            <c name="pear"/>
            <c name="cheese"/>
          </b>
        </a>
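
For the file-output remark above, here is a minimal sketch of the same handler flushing to a file instead of STDOUT. The output file name `twig_out.xml` and the handler name `promote_children` are made up for illustration, and a lexical filehandle is used in place of the bareword `FILE`; the XML is parsed from a string only to keep the example self-contained.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Twig;

# Sample input, same shape as the document in the question.
my $xml = <<'XML';
<a>
  <b name="funny words">
    <c name="foo"/>
    <c name="baz"/>
  </b>
  <b name="foods">
    <c name="apple"/>
    <c name="pear"/>
    <c name="cheese"/>
  </b>
</a>
XML

# Hypothetical output file name, chosen only for this sketch.
my $out_file = 'twig_out.xml';
open( my $out, '>', $out_file ) or die "cannot open $out_file: $!";

my $t = XML::Twig->new(
    twig_handlers => { b => \&promote_children },
    pretty_print  => 'indented',
);
$t->parse($xml);
$t->flush($out);    # write whatever is left (here, the closing tag of a)
close($out) or die "cannot close $out_file: $!";

sub promote_children {
    my ( $t, $elt ) = @_;

    # wrap each c in its own b, named after the c
    foreach my $c ( $elt->children('c') ) {
        $c->wrap_in( b => { name => $c->att('name') } );
    }
    $elt->erase;        # remove the original b, keeping its children
    $t->flush($out);    # write what has been parsed so far and free it
}
```

Every call to flush has to be given the same filehandle; a call without one would send that part of the document to STDOUT.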
Re: XML:Twig -- changing in-mem process to stream
by rkg (Hermit) on Oct 15, 2003 at 09:49 UTC
    Update: When I said "the DTD is complex", I meant the tags have many attributes not shown here. The structure is really simple and is as above: one "A" element, containing 1+ "B" elems, each with 1+ "C" elems, each "C" elem with no children. No content anywhere, just tags.
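
Since the real tags carry many attributes, a sketch of one way to keep them on the new wrappers may help. It assumes the goal is the same as in the original handler: each new "B" copies its parent's attributes, with name taken from the "C". The `lang` attribute and the handler name `promote_children` are invented for illustration. To keep the example short it builds the result in memory and returns it with sprint; in a streaming version the handler would end with a flush as in the reply above.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Twig;

# "lang" is a made-up attribute standing in for the many real ones.
my $xml = <<'XML';
<a>
  <b name="foods" lang="en">
    <c name="apple"/>
    <c name="pear"/>
  </b>
</a>
XML

my $t = XML::Twig->new(
    twig_handlers => { b => \&promote_children },
    pretty_print  => 'indented',
);
$t->parse($xml);

my $result = $t->sprint;    # whole transformed document as a string
print $result;

sub promote_children {
    my ( $t, $parent ) = @_;

    # copy the parent's attributes; atts may be undef if there are none
    my %parent_atts = %{ $parent->atts || {} };

    foreach my $c ( $parent->children('c') ) {
        # the later "name" key overrides the one copied from the parent
        $c->wrap_in( b => { %parent_atts, name => $c->att('name') } );
    }
    $parent->erase;    # remove the original b, keeping the new wrappers
}
```

Building a fresh hash for each wrapper keeps the new elements from sharing one attribute hash with each other or with the parent.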