in reply to XML::Twig outputting root element start tag twice

First, you're missing the final flush. Before you're really "DONE", you need to call $p->flush(). That emits the closing </foo> tag that is missing from your output. You may not want it right now, but once you fix the other problem, you'll want it back.

Second, it's the twig_print_outside_roots option that's causing the duplicate start tag. Remove it. Instead, pass $outfh as the argument to every flush call (including the new one), so that each flush writes to that file.

That leaves me with:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use XML::Twig;

    open my $outfh, '>', "out.xml" or die ">out.xml:$!";

    my $p = XML::Twig->new(
        #twig_print_outside_roots => $outfh,
        twig_roots => {
            record => sub {
                $_->set_text("altered ".$_->text);
                shift->flush($outfh);
            }
        },
        empty_tags    => 'html',
        keep_encoding => 1,
        keep_spaces   => 1,
    );
    $p->parsefile("in.xml");
    $p->flush($outfh);
    print "DONE\n";

As to why, ... I'm not sure.

Hope that helps,

Update: OK, I see you really want the twig_print_outside_roots feature. It doesn't seem to do what you want it to, though. I'm still curious why the formatting matters: this is XML, after all...

Re: XML::Twig outputting root element start tag twice
by benizi (Hermit) on Apr 18, 2006 at 19:06 UTC

    Explicitly adding the $outfh is part of what I was avoiding, as it's not in the scope of the actual handlers in the real-life example. (XML::Twig has so much DWIMmery, I assumed specifying an output filehandle would be something pretty trivial.)

    As to the formatting, it's because, while I'm using XML::Twig, other people in the project aren't (yet!), and the line-based nature of the format is easier for them to handle. (Plus, I simply prefer the aesthetics of it.)

      Why is the output filehandle not in scope? If it is available when you create the twig, you should be able to use it in the handlers (you can use a closure to pass it to them). You could also use select to send all output to the filehandle, though I would consider that not so good for the maintainability of the code.

      Finally, if you are using the latest version of XML::Twig, you don't need the final flush, it's done automagically.
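
The closure suggestion in the reply above could look roughly like the following. This is only a sketch, not code from the thread: make_record_handler is a hypothetical helper name, the inline XML is made-up sample input, and an in-memory filehandle stands in for out.xml so that the example is self-contained.

```perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical factory: returns a handler that closes over the filehandle,
# so the handler body never needs $fh in its own lexical scope.
sub make_record_handler {
    my ($fh) = @_;
    return sub {
        my ($twig, $elt) = @_;
        $elt->set_text("altered " . $elt->text);
        $twig->flush($fh);    # flush everything parsed so far to $fh
    };
}

# An in-memory filehandle stands in for out.xml (assumption for the sketch).
my $out = '';
open my $outfh, '>', \$out or die "cannot open in-memory handle: $!";

my $twig = XML::Twig->new(
    twig_handlers => { record => make_record_handler($outfh) },
);
$twig->parse('<foo><record>one</record><record>two</record></foo>');
# Older XML::Twig versions may also need a final $twig->flush($outfh) here.
close $outfh or die "close failed: $!";

print $out, "\n";
```

If the handlers live in a separate module, that module could export the factory instead of the handler itself, so the caller decides which filehandle each handler writes to.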

        Hey, sorry for the slow response. I left for a vacation shortly after I posted my question. The filehandle is not in scope because the handlers are defined in a module and passed to the twig like this: (paraphrased)

            ...
            use Handlers::Library qw/:handlers/;
            ...
            open my $filehandle, '>', 'output.xml' or die ">output.xml:$!";
            my $p = XML::Twig->new(
                twig_handlers => { foo => \&foo_handler },
            );

        I'm going to end up doing what I was trying to avoid in the first place: passing a filehandle to all of the ->flush calls. When I asked the original question, I figured this was something XML::Twig would naturally support. (It's wonderfully DWIMmy, and does something similar for defaulting a filehandle with twig_print_outside_roots => $filehandle, but I didn't realize that that was just a select.)
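
For comparison, the select approach mentioned earlier in the thread might be sketched as below. It assumes that flush, when called without a filehandle, prints to whatever handle is currently selected. The handler name, the inline XML, and the in-memory filehandle are all illustrative, not taken from the thread.

```perl
use strict;
use warnings;
use XML::Twig;

# A handler that knows nothing about the output filehandle, much like one
# imported from a separate module.
sub record_handler {
    my ($twig, $elt) = @_;
    $elt->set_text("altered " . $elt->text);
    $twig->flush;    # no argument: assumed to print to the selected handle
}

# An in-memory filehandle stands in for output.xml (assumption for the sketch).
my $out = '';
open my $outfh, '>', \$out or die "cannot open in-memory handle: $!";

my $twig = XML::Twig->new(
    twig_handlers => { record => \&record_handler },
);

my $previous = select($outfh);    # make $outfh the default handle for print
$twig->parse('<foo><record>one</record><record>two</record></foo>');
# Older XML::Twig versions may also need a final $twig->flush here.
select($previous);                # restore the previous default handle
close $outfh or die "close failed: $!";

print $out, "\n";
```

The trade-off is the one raised above: select changes global state, so any other print issued without an explicit handle while $outfh is selected also ends up in the output file.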

        So, long story short, my final code will be something like: (untested)

            package Handlers::Library;

            my $twig_fh;
            BEGIN { $twig_fh = \*STDOUT; }    # default until twig_output() is called

            sub twig_output {
                my $name = shift;
                # open a fresh handle rather than reopening the default STDOUT glob
                open my $fh, '>', $name or die ">$name: $!";
                $twig_fh = $fh;
            }

            sub foo_handler { ... $twig->flush($twig_fh) }
            sub bar_handler { ... $twig->flush($twig_fh) }

            # in the main script
            use Handlers::Library qw/:handlers/;
            twig_output('output.xml');
            my $p = XML::Twig->new(...);