You are correct. I cut and pasted, and then entered the populate sub. It is my understanding that Twig sets up handlers that are called for each element in the XML when you parse the file. The XML I'm dealing with is structured in a highly peculiar way.
There's a brief header with information that is irrelevant to what I'm using the data for. All remaining data is under a parent titled "Data." Under that parent are roughly 500 children, each of which is a product with roughly 300 properties set up as children of its own. The problem for me is that those properties aren't uniform: 60 items may have a listing for "number of pages" while others will have "number of tracks." Each item is massive, so here's a brief snippet.
<is:ItemMaster>
<is:ItemMasterHeader>
<oi:ItemID agencyRole="Product_Number">some_number</oi:ItemID>
<oi:ItemID agencyRole="Prefix_Number">some_number</oi:ItemID>
<oi:ItemID agencyRole="Stock_Number">some_number</oi:ItemID>
<oi:ManufacturerItemID>some_manufacturer_ID</oi:ManufacturerItemID>
<is:Classification type="Group"></is:Classification>
<is:Classification></is:Classification>
Each of these ItemMasters has around eight children, and those children have anywhere from one to twenty-four children of their own. Because the children are not uniform, this is giving me headaches.
Here's my first revision:
#!/bin/perl
use XML::Twig;
%Items=();
my $twig=XML::Twig->new(
    twig_handlers => {
        populate => sub {
            while (<>) {
                if (%Items !~ m/"<us:"|"<oa:"(.*)/) { $Items{$1} =1 }
                else { $Items{$2} =($Items{$1}+(/$1/)) }
            };   #If element is not in the hash, adds it
        },       #If element is in the hash, adds the number of matches
        div => sub { $_[0]->purge; },   # free memory
    },
);
$twig->parsefile( '500syncItemMaster.xml');   # build it
$twig->purge;                                 # clear end of document from memory
print %Items;                                 # output the twig
Now when I print I get nothing. I tried a test run and it seems like the handlers are not getting called at all.
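For comparison, here is a minimal sketch of the tally I am after. It relies on the keys in twig_handlers being matched against element names, so a key such as populate or div would only fire for a <populate> or <div> element, neither of which appears in this file. The inline sample document and its namespace URIs are made up for illustration, and using the special _all_ trigger is just one possible approach, not necessarily the best one.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical sample standing in for 500syncItemMaster.xml.
# The root element name and the namespace URIs are assumptions.
my $xml = <<'XML';
<is:Data xmlns:is="urn:example:is" xmlns:oi="urn:example:oi">
  <is:ItemMaster>
    <is:ItemMasterHeader>
      <oi:ItemID agencyRole="Product_Number">1</oi:ItemID>
      <oi:ItemID agencyRole="Stock_Number">2</oi:ItemID>
      <is:Classification type="Group"></is:Classification>
    </is:ItemMasterHeader>
  </is:ItemMaster>
  <is:ItemMaster>
    <is:ItemMasterHeader>
      <oi:ItemID agencyRole="Product_Number">3</oi:ItemID>
    </is:ItemMasterHeader>
  </is:ItemMaster>
</is:Data>
XML

my %Items;    # element name => number of occurrences

my $twig = XML::Twig->new(
    twig_handlers => {
        # '_all_' is called for every element once it is fully parsed
        _all_ => sub {
            my ( $t, $elt ) = @_;
            my $tag = $elt->tag;    # qualified name, e.g. "oi:ItemID"
            $Items{$tag}++;
            # free memory once a whole product has been counted
            $t->purge if $tag eq 'is:ItemMaster';
        },
    },
);

$twig->parse($xml);    # for the real file: $twig->parsefile('500syncItemMaster.xml')

# print one "name count" line per distinct element
printf "%-24s %d\n", $_, $Items{$_} for sort keys %Items;
```

Because the handler keys off whatever tag it is given, it does not matter that some products carry "number of pages" and others "number of tracks"; each distinct name simply gets its own hash entry.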