bryank has asked for the wisdom of the Perl Monks concerning the following question:

Hi,

I am trying to use a script that uses XML::Twig to modify specific tags in XML docs (I got the suggestion to use XML::Twig from another monk, btw). The script does what I want, but in addition to updating the XML files, it also spits out a bunch of output as a single line:

<merchant>BOB</merchant><merchant>Jane</merchant></testProduct><merchant>BOB</merchant><merchant>Jane</merchant></testProduct><merchant>BOB</merchant><merchant>Jane</merchant></testProduct><merchant>BOB</merchant><merchant>Jane</merchant></testProduct><merchant>BOB</merchant><merchant>Jane</merchant></testProduct><merchant>BOB</merchant><merchant>Jane</merchant></testProduct><merchant>BOB</merchant><merchant>Jane</merchant></testProduct><merchant>BOB</merchant><merchant>Jane</merchant></testProduct>
I'm not sure why that is -- I thought the script was writing directly to the xml files I was editing? Anyway, if there are any tips on modifying my code to remove the stdout output, or tips on prettifying the data, I'd appreciate it. Here is my script:

#!/usr/bin/perl
use strict;
use warnings;
use File::Find;
use XML::Twig;

@ARGV = ('.') unless @ARGV;
my $dir = shift @ARGV;

find(\&edits, $dir);

sub edits() {
    my $seen = 0;
    my $file = $_;
    if ($file eq 'data.xml') {
        my $orig = 'data.xml';
        use autodie 1.999;
        use POSIX 'strftime';
        my $back = $orig . strftime( '-%Y-%m-%d', localtime );
        rename $orig, $back;
        open my $newfh, '>', $orig;
        my $t = XML::Twig->new(
            twig_roots => {
                'merchant' => sub {
                    my ( $t, $value ) = @_;
                    {
                        my $ra = $value->text;
                        $ra =~ s/_/ /g;
                        $value->set_text($ra);
                    }
                    $value->print($newfh);
                },
            },
            twig_print_outside_roots => $newfh,
        );
        $t->parsefile($back);
        $t->flush;    #don't forget
        undef $t;     # close $newfh;
    }
}

Re: Trying to disable/edit stdout output with XML::Twig
by kennethk (Abbot) on Jun 23, 2009 at 00:06 UTC
    From XML::Twig documentation:

    Processing an XML document chunk by chunk

    One of the strengths of XML::Twig is that it let you work with files that do not fit in memory (BTW storing an XML document in memory as a tree is quite memory-expensive, the expansion factor being often around 10).

    To do this you can define handlers, that will be called once a specific element has been completely parsed. In these handlers you can access the element and process it as you see fit, using the navigation and the cut-n-paste methods, plus lots of convenient ones like prefix. Once the element is completely processed you can then flush it, which will output it and free the memory. You can also purge it if you don't need to output it (if you are just extracting some data from the document for example). The handler will be called again once the next relevant element has been parsed.

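    So, short answer: replace $t->flush; with $t->purge; in your script. Called without a filehandle argument, flush prints whatever is left in the twig to STDOUT, whereas purge frees it without printing anything.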
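
For anyone who wants to try the difference without touching real files, here is a minimal sketch of the flush-versus-purge fix. It rests on a few assumptions that are not in the original post: fix_merchants is a hypothetical helper name, the sample XML is made up, and an in-memory filehandle stands in for data.xml so that nothing is renamed or overwritten. The handler prints each edited element to the output handle and then purges it, so no final flush is needed.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical helper (not from the original post): replace underscores
# with spaces in every <merchant> element and return the new document.
sub fix_merchants {
    my ($xml) = @_;

    # Assumption: collect output in memory instead of writing data.xml.
    open my $out, '>', \my $buffer
        or die "Cannot open in-memory handle: $!";

    my $twig = XML::Twig->new(
        twig_roots => {
            merchant => sub {
                my ( $t, $elt ) = @_;
                ( my $text = $elt->text ) =~ s/_/ /g;
                $elt->set_text($text);
                $elt->print($out);    # write the edited element to $out
                $t->purge;            # free parsed elements, print nothing
            },
        },
        # Everything outside <merchant> is copied to $out as it is parsed.
        twig_print_outside_roots => $out,
    );

    $twig->parse($xml);

    # No final $twig->flush: without a filehandle argument it would print
    # whatever is still in the twig to STDOUT.
    close $out or die "Cannot close in-memory handle: $!";
    return $buffer;
}

print fix_merchants(
    '<testProduct><merchant>BOB_SMITH</merchant></testProduct>'
), "\n";
```

In the original script the same idea would mean keeping $value->print($newfh) in the handler and calling purge instead of flush, as suggested above.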

      Thanks for the help -- I appreciate it.