Mr.Churka has asked for the wisdom of the Perl Monks concerning the following question:

Hello Monks, I've been getting a lot of use out of XML Twig, but have been sticking primarily to calling handlers based on element tags. Now I have a block of XML where the naming conventions caused duplicate tags for distinct elements. After poking around the CPAN docs for both twig and xpath, the XML::Twig website, and the Perl monks archives, I still can't figure out the appropriate syntax for calling a handler on an attribute/value pairing. Here's a snippet of the XML I'm dealing with.
<us:ItemMaster>
  <us:ItemMasterHeader>
    <oa:ItemID agencyRole="Product_Number">
      <oa:ID>0123456</oa:ID>
    </oa:ItemID>
    <oa:ItemID agencyRole="Prefix_Number">
      <oa:ID>AAA</oa:ID>
    </oa:ItemID>
    <oa:ItemID agencyRole="Stock_Number_Butted">
      <oa:ID>01234</oa:ID>
    </oa:ItemID>
  </us:ItemMasterHeader>
</us:ItemMaster>
I can't simply call a handler on the ItemID tag because of the duplication. The number I'm looking for is a concatenation of the prefix and stock number. Here's what I'm trying.
use strict;
use warnings;
use xml::twig;
use utf8;

my $file= "itemsANDstuff.xml";
my $twig_handlers = {oa:ItemID[my @agencyRole='Prefix_Number']=> \&ITEMnumber};
my $twig= new XML::Twig(TwigRoots => {'us:ItemMasterHeader' => 1},
                        TwigHandlers => $twig_handlers);
$twig->parsefile($file);

sub ITEMnumber{
    my ($twig, $Item)= @_;
    my $prefix=($Item->first_child->text);
    my $stock = (next_sibling->first_child->text);
    my $itemNUMBER = $prefix.$stock;
    print "$itemNUMBER \n";
};
The line raising an exception is
my $twig_handlers = {oa:ItemID[my @agencyRole='Prefix_Number']=> \&ITEMnumber};
I've tried a variety of fixes on this line, including using * between the element tag and the [, escaping the @, using double quotes and single quotes, single quotes and double quotes on the element tag alone, omitting the element tag in favor of an *, praying, and kicking my machine. None of these have helped much. Any help would be greatly appreciated.

Replies are listed 'Best First'.
Re: XML Twig Handler Triggers Syntax
by Jenda (Abbot) on Dec 26, 2007 at 16:58 UTC

    Erm ... why don't you test the value of the attribute within the ITEMnumber and return if it doesn't match whatever you were looking for?
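
    A minimal sketch of that suggestion, assuming the XML from the question is held in a string. The names item_id_handler and %id_for_role are made up for illustration. The handler fires for every oa:ItemID, reads the agencyRole attribute with att(), and returns early for roles it does not need.

```perl
use strict;
use warnings;
use XML::Twig;

# Sample data copied from the question (assumed to be representative).
my $xml = <<'XML';
<us:ItemMaster>
  <us:ItemMasterHeader>
    <oa:ItemID agencyRole="Product_Number"><oa:ID>0123456</oa:ID></oa:ItemID>
    <oa:ItemID agencyRole="Prefix_Number"><oa:ID>AAA</oa:ID></oa:ItemID>
    <oa:ItemID agencyRole="Stock_Number_Butted"><oa:ID>01234</oa:ID></oa:ItemID>
  </us:ItemMasterHeader>
</us:ItemMaster>
XML

# Hypothetical storage: the oa:ID text collected for each role of interest.
my %id_for_role;

sub item_id_handler {
    my ($twig, $item) = @_;

    # Filter inside the handler instead of in the trigger expression.
    my $role = $item->att('agencyRole');
    return 1 unless defined $role
        && ($role eq 'Prefix_Number' || $role eq 'Stock_Number_Butted');

    my $id = $item->first_child('oa:ID');
    $id_for_role{$role} = $id->text if defined $id;
    return 1;
}

my $twig = XML::Twig->new(
    twig_handlers => { 'oa:ItemID' => \&item_id_handler },
);
$twig->parse($xml);

my $item_number = $id_for_role{Prefix_Number} . $id_for_role{Stock_Number_Butted};
print "$item_number\n";
```

    The trigger stays a plain tag name, so there is no XPath-like expression to quote; the cost is that the handler runs for every oa:ItemID, including the ones it ignores.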

    I don't know XPath myself, but I would not expect to see "my" within any XPath expression, nor to see the expression unquoted within a Perl script. Try something like

    my $twig_handlers = { q{oa:ItemID[@agencyRole='Prefix_Number']} => \&ITEMnumber };
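
    A small core-Perl sketch of the quoting options, offered as an illustration rather than as anything XML::Twig requires. The variable names are invented. The point is that the trigger is an ordinary string used as a hash key, so the @ must not be interpolated by Perl.

```perl
use strict;
use warnings;

# Non-interpolating quotes: '@' is taken literally.
my $with_q      = q{oa:ItemID[@agencyRole='Prefix_Number']};
my $with_single = 'oa:ItemID[@agencyRole="Prefix_Number"]';

# Interpolating quotes: '@' has to be escaped, otherwise Perl looks for
# an array named @agencyRole and strict refuses to compile the script.
my $with_double = "oa:ItemID[\@agencyRole='Prefix_Number']";

print "$with_q\n$with_single\n$with_double\n";
```

    The first and third strings are identical; the second differs only in which quote character surrounds the attribute value.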

    Update: Your solution (if it worked), like runrig's first solution, has one problem: it is very picky. It would break as soon as the order of the <oa:ItemID>s changed or (yours only) even if another element were inserted in between. If that's OK with you, fine, but I would feel uneasy about it. runrig's second solution does not have this problem, but it starts to look too complex, especially because of the lexical variables shared by the handlers.
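
    One way to avoid both the order dependence and the shared lexicals, sketched under the assumption that each us:ItemMasterHeader is small enough to inspect once it has been fully parsed. The handler name header_handler and the @item_numbers array are hypothetical. The sample data is deliberately reordered to show that the order of the oa:ItemID elements no longer matters.

```perl
use strict;
use warnings;
use XML::Twig;

# Same elements as in the question, but in a different order.
my $xml = <<'XML';
<us:ItemMaster>
  <us:ItemMasterHeader>
    <oa:ItemID agencyRole="Stock_Number_Butted"><oa:ID>01234</oa:ID></oa:ItemID>
    <oa:ItemID agencyRole="Product_Number"><oa:ID>0123456</oa:ID></oa:ItemID>
    <oa:ItemID agencyRole="Prefix_Number"><oa:ID>AAA</oa:ID></oa:ItemID>
  </us:ItemMasterHeader>
</us:ItemMaster>
XML

my @item_numbers;    # hypothetical: one entry per header processed

sub header_handler {
    my ($twig, $header) = @_;

    # Index the oa:ID text by agencyRole; the hash lives only in this call.
    my %id_for_role;
    for my $item ($header->children('oa:ItemID')) {
        my $role = $item->att('agencyRole');
        my $id   = $item->first_child('oa:ID');
        next unless defined $role && defined $id;
        $id_for_role{$role} = $id->text;
    }

    push @item_numbers,
        $id_for_role{Prefix_Number} . $id_for_role{Stock_Number_Butted};
    return 1;
}

my $twig = XML::Twig->new(
    twig_handlers => { 'us:ItemMasterHeader' => \&header_handler },
);
$twig->parse($xml);

print "$_\n" for @item_numbers;
```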

    I think this can be better solved by XML::Rules ... the code does basically the same, but it doesn't have to worry about keeping the data:

    use strict;
    use XML::Rules;

    my $parser = XML::Rules->new(
        start_rules => {
            # this filters the <oa:ItemID>s we are interested in
            'oa:ItemID' => sub { return $_[1]->{agencyRole} =~ /^(?:Prefix_Number|Stock_Number_Butted)$/ },
        },
        rules => {
            'oa:ID' => 'content',
            'oa:ItemID' => sub { return $_[1]->{agencyRole} => $_[1]->{'oa:ID'} },
            'us:ItemMasterHeader' => sub {
                print "$_[1]->{Prefix_Number}$_[1]->{Stock_Number_Butted} \n";
                return;
            },
        },
    );

    $parser->parse(\*DATA);

    __DATA__
    <us:ItemMaster>
      <us:ItemMasterHeader>
        <oa:ItemID agencyRole="Product_Number">
          <oa:ID>0123456</oa:ID>
        </oa:ItemID>
        <oa:ItemID agencyRole="Prefix_Number">
          <oa:ID>AAA</oa:ID>
        </oa:ItemID>
        <oa:ItemID agencyRole="Stock_Number_Butted">
          <oa:ID>01234</oa:ID>
        </oa:ItemID>
      </us:ItemMasterHeader>
    </us:ItemMaster>
      The syntax that fixed the issue was
      my $twig= new XML::Twig(TwigRoots => {'oa:ItemID[@agencyRole="Prefix_Number"]' => \&STOCKnumber,
      The code you provided produces a warning under strict that an explicit package name is required for @agencyrole. I am still unable to extract the text of the child element, though: Perl now claims that my $Item is undefined.
Re: XML Twig Handler Triggers Syntax
by runrig (Abbot) on Dec 26, 2007 at 18:49 UTC
    This seems to work:
    my $t = XML::Twig->new(
        twig_handlers => {
            'oa:ItemID[@agencyRole="Prefix_Number"]' => \&set_prefix_number,
            'oa:ItemID[@agencyRole="Stock_Number_Butted"]' => \&get_stock_number,
        },
    );
    $t->parse($xml);

    BEGIN {
        my $prefix_number;
        sub set_prefix_number {
            my ($t, $node) = @_;
            $prefix_number = $node->text();
        }
        sub get_stock_number {
            my ($t, $node) = @_;
            my $stock_number = $node->text();
            print "$prefix_number$stock_number\n";
        }
    }
    Another approach:
    my $t = XML::Twig->new(
        start_tag_handlers => {
            'oa:ItemID' => \&set_agency_role,
        },
        twig_handlers => {
            'us:ItemMasterHeader' => \&process_item,
            'oa:ID' => \&item_id,
        },
    );
    $t->parse($xml);

    BEGIN {
        my $agency_role;
        my %id;
        sub set_agency_role {
            my ($t, $node) = @_;
            $agency_role = $node->att('agencyRole');
        }
        sub item_id {
            my ($t, $node) = @_;
            $id{$agency_role} = $node->text();
        }
        sub process_item {
            print "$id{Prefix_Number}$id{Stock_Number_Butted}\n";
        }
    }
      These both function. In my initial code I typed
      my $twig_handlers =...
      instead of
      my $twig_handlers => ...
      I wouldn't have caught the code error without comparing it to yours. Thanks!