Good morning!

I am using XML::Twig to conditionally filter out elements in an XML file and then conditionally "duplex" the output to two different output files. I have managed to jury-rig something that gives me the correct output, but I imagine there is a better, more correct way to accomplish the task that does not involve reprocessing the input file multiple times.

In my sample program below, I split <thing> elements whose type attribute is "vegetable" or "fruit" into separate files. <thing> elements whose type attribute is "city" are filtered out and deleted. The <header> and <trailer> elements are duplexed to both output files. Is there a way to conditionally split target elements off into separate files, and duplicate the elements "outside" the target elements to every output file, without having to read the input file multiple times?

#!/usr/bin/perl
use 5.018;
use strict;
use warnings;

use XML::Twig;

{
    my $t;
    my $pos = tell 'DATA'; # save the offset...

    # Process fruit
    open (my $FRUIT, '>', './fruit.xml') or die "./fruit.xml:\n$!\n$^E";
    $t = XML::Twig->new(
        twig_handlers => {
            'thing'     => sub { _filter(@_, 'fruit', $FRUIT); 1; },
            'thing//*'  => sub { 1; },
            '_default_' => sub { $_[0]->flush($FRUIT); 1; },
            '#CDATA'    => sub { 1; },
        },
        pretty_print => 'indented',
        comments     => 'drop',   # remove any comments
        empty_tags   => 'normal', # empty tags = <tag/>
    );
    $t->parse(*DATA);
    close $FRUIT;

    seek 'DATA', $pos, 0; # reset DATA for the second run-through

    # Process vegetables
    open (my $VEG, '>', './veg.xml') or die "./veg.xml:\n$!\n$^E";
    $t = XML::Twig->new(
        twig_handlers => {
            'thing'     => sub { _filter(@_, 'vegetable', $VEG); 1; },
            'thing//*'  => sub { 1; },
            '_default_' => sub { $_[0]->flush($VEG); 1; },
            '#CDATA'    => sub { 1; },
        },
        pretty_print => 'indented',
        comments     => 'drop',   # remove any comments
        empty_tags   => 'normal', # empty tags = <tag/>
    );
    $t->parse(*DATA);
    close $VEG;
}

sub _filter {
    my ($_twig, $thing_element, $keep_me, $PRINT_FILE) = @_;

    # Flush the twig to file if the 'type' attribute matches...
    if ( $thing_element->{att}{type} eq $keep_me ) {
        $_twig->flush($PRINT_FILE);
    }
    # ... otherwise delete the twig
    else {
        $thing_element->delete();
    }
    return 1;
}

__DATA__
<batch>
  <header>
    <foo>1</foo>
    <bar>2</bar>
    <baz>3</baz>
  </header>
  <thing type="fruit" >Im an apple!</thing>
  <thing type="city" >Toronto</thing>
  <thing type="vegetable" >Im a carrot!</thing>
  <thing type="city" >Melrose</thing>
  <thing type="vegetable" >Im a potato!</thing>
  <thing type="fruit" >Im a pear!</thing>
  <thing type="vegetable" >Im a pickle!</thing>
  <thing type="city" >Patna</thing>
  <thing type="fruit" >Im a banana!</thing>
  <thing type="vegetable" >Im an eggplant!</thing>
  <thing type="city" >Taumatawhakatangihangakoauauotamateaturipukakapikimaungahoronukupokaiwhenuakitanatahu</thing>
  <trailer>
    <chrzaszcz>A</chrzaszcz>
    <zdzblo>B</zdzblo>
  </trailer>
</batch>

Thank you for your time.

perl -v
This is perl 5, version 18, subversion 2 (v5.18.2) built for MSWin32-x64-multi-thread
(with 1 registered patch, see perl -V for more detail)

perl -MXML::Twig -E "say $XML::Twig::VERSION;"
3.48
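
For comparison, here is a rough, untested sketch of what a single-pass version might look like. It is not taken from any reply in this thread. It assumes the root element is always <batch>, that every element outside the <thing> elements is either <header> or <trailer>, and that printing each element by hand (instead of calling flush) is acceptable. The %file_for mapping and the _copy_to_all helper are hypothetical names introduced only for this illustration, and the sample XML is shortened and held in a string.

```perl
#!/usr/bin/perl
use 5.010;
use strict;
use warnings;
use XML::Twig;

# Shortened copy of the sample data, kept in a string for this sketch.
my $xml = <<'XML';
<batch>
  <header><foo>1</foo><bar>2</bar></header>
  <thing type="fruit">Im an apple!</thing>
  <thing type="city">Toronto</thing>
  <thing type="vegetable">Im a carrot!</thing>
  <trailer><zdzblo>B</zdzblo></trailer>
</batch>
XML

# Hypothetical mapping: 'type' attribute value => output file name.
my %file_for = ( fruit => './fruit.xml', vegetable => './veg.xml' );

# Open one output handle per wanted type.
my %out_for;
for my $type ( keys %file_for ) {
    open( $out_for{$type}, '>', $file_for{$type} )
        or die "$file_for{$type}: $!\n";
}

my $root_tag;    # remembered so the closing tag can be written after parsing

# Print a shared element (header, trailer) to every output file.
sub _copy_to_all {
    my ( $twig, $elt ) = @_;
    $elt->print($_) for values %out_for;
    $twig->purge;    # release what has been parsed so far
    return 1;
}

my $t = XML::Twig->new(
    start_tag_handlers => {
        # Assumption: the root is <batch>; its start tag goes to every file.
        'batch' => sub {
            my ( $twig, $root ) = @_;
            $root_tag = $root->tag;
            print {$_} $root->start_tag for values %out_for;
            return 1;
        },
    },
    twig_handlers => {
        'header'  => \&_copy_to_all,
        'trailer' => \&_copy_to_all,
        'thing'   => sub {
            my ( $twig, $thing ) = @_;
            my $type = $thing->att('type') // '';
            # Print only to the file registered for this type, if any;
            # other types (for example "city") are simply dropped.
            $thing->print( $out_for{$type} ) if $out_for{$type};
            $twig->purge;
            return 1;
        },
    },
    pretty_print => 'indented',
    comments     => 'drop',
    empty_tags   => 'normal',
);

$t->parse($xml);

# Close the root element in every output file.
for my $fh ( values %out_for ) {
    print {$fh} "\n</$root_tag>\n";
    close $fh or die "close failed: $!\n";
}
```

The idea is that each handler decides which handle(s) an element goes to, so the input is read only once; adding another category should only need another entry in %file_for. For real input, $t->parse($xml) would be replaced with parsefile or a filehandle.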

In reply to [SOLVED] XML::Twig - Filtering and duplexing output to multiple output files by ateague
