Hey gurus,

Long-time Perl user, first-time XML::LibXML::Reader user, and I've been banging my head against it for a number of days now.

The info on the module page isn't that great, but I've learned a lot through searching the monastery.

I have been able to get most of the fields out of the following data, but am stuck on one element's attributes.

Here is an example of the XML data:

<FIXML r="20030618" s="20040109" v="4.4" xr="FIA" xv="1">
  <Batch>
    <MktDataFull RptID="13793742" BizDt="2011-12-23">
      <Instrmt Sym="MID" MMY="20120317"/>
      <Full Typ="5" Px="5.303128"/>
      <Full Typ="D" Px="884.91"/>
    </MktDataFull>
    <MktDataFull RptID="14536119" BizDt="2011-12-23">
      <Instrmt Sym="MID" MMY="20120218"/>
      <Full Typ="5" Px="214.007661"/>
      <Full Typ="D" Px="884.91"/>
    </MktDataFull>
  </Batch>
</FIXML>

I have been able to get all of the data except each record's RptID and BizDt (the code seems to capture only the first record's values, not the subsequent ones) with the following code:

my $reader = new XML::LibXML::Reader(location => "$XMLfile")
    or die "cannot read $XMLfile\n";

while ( $reader->nextElement( 'MktDataFull' )) {
    my $RptID = $reader->getAttribute('RptID');
    my $BizDt = $reader->getAttribute('BizDt');
    $reader->read;

    while ( $reader->nextElement( 'Instrmt' )) {
        my $Sym = $reader->getAttribute('Sym');
        my $MMY = $reader->getAttribute('MMY');
        $reader->read;

        while (1) {
            if ($reader->localName eq 'Full') {
                $Typ = $reader->getAttribute('Typ');
            }
            $reader->nextSibling() > 0 or last;
        }

        $fileLine = $RptID . "," . $BizDt . "," . $Sym . "," . $MMY . "," . $MatDt . "," . $CFI_recType . "," . $Typ4Dt;
        print CSVOUT "$fileLine\n";
    }
    $reader->nextSibling() > 0 or last;
}

For some reason it's only reading the attributes (RptID and BizDt) of the first "MktDataFull" record, and not those of any of the others (there are half a million records per file).

What am I doing wrong? Any advice will be greatly appreciated.
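
One possible explanation, offered as a guess rather than a confirmed diagnosis: nextElement skips forward in document order, so the inner `while ( $reader->nextElement( 'Instrmt' ))` loop may keep running into the Instrmt elements of later records without ever returning to the outer loop, which would leave $RptID and $BizDt stuck at their first values. A minimal sketch of one way around that is to copy each MktDataFull subtree with copyCurrentNode(1) and read it with ordinary DOM calls. The helper name extract_rows, the inlined sample data, and the choice of output columns (the Px values for Typ 5 and Typ D) are assumptions made for illustration only.

```perl
use strict;
use warnings;
use XML::LibXML::Reader;

# Sample data from the post, inlined so the sketch is self-contained.
my $xml = <<'XML';
<FIXML r="20030618" s="20040109" v="4.4" xr="FIA" xv="1">
  <Batch>
    <MktDataFull RptID="13793742" BizDt="2011-12-23">
      <Instrmt Sym="MID" MMY="20120317"/>
      <Full Typ="5" Px="5.303128"/>
      <Full Typ="D" Px="884.91"/>
    </MktDataFull>
    <MktDataFull RptID="14536119" BizDt="2011-12-23">
      <Instrmt Sym="MID" MMY="20120218"/>
      <Full Typ="5" Px="214.007661"/>
      <Full Typ="D" Px="884.91"/>
    </MktDataFull>
  </Batch>
</FIXML>
XML

# Hypothetical helper: returns one CSV line per MktDataFull record.
sub extract_rows {
    my ($reader) = @_;
    my @rows;

    # Only one nextElement loop, so the reader never runs past a record
    # while the per-record variables still hold stale values.
    while ( $reader->nextElement('MktDataFull') ) {

        # Deep-copy just this record into a small DOM fragment.
        my $rec = $reader->copyCurrentNode(1);

        my ($instrmt) = $rec->getChildrenByTagName('Instrmt');
        next unless $instrmt;

        # Map each Full element's Typ to its Px (column choice is an assumption).
        my %px = map { $_->getAttribute('Typ') => $_->getAttribute('Px') }
                 $rec->getChildrenByTagName('Full');

        push @rows, join ',',
            $rec->getAttribute('RptID'),
            $rec->getAttribute('BizDt'),
            $instrmt->getAttribute('Sym'),
            $instrmt->getAttribute('MMY'),
            $px{5} // '',
            $px{D} // '';
    }
    return @rows;
}

my $reader = XML::LibXML::Reader->new( string => $xml )
    or die "cannot parse sample XML\n";

my @rows = extract_rows($reader);
print "$_\n" for @rows;
```

For the real file, the reader would be opened with location => $XMLfile as in the code above, and each row printed to CSVOUT instead of being collected. Since only one record's subtree is held at a time, memory use should stay roughly flat regardless of file size.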

On a side note, the file I'm reading has half a million records in it and is 150 MB in size. Is there another XML parser that would be easier and/or faster for this task?

Once again, any advice will be greatly appreciated, gurus.


In reply to Issue with looping through XML::LibXML::Reader by ozguy
