
I tried this with your problem area:

    my $summary;    # = $quake->get("summary");
    # print $summary;
    while ($summary = $quake->get("summary")) {
        $parser->parse_chunk($summary);
    }

and got this error message: junk after document element at line 1, column 601, byte 601 at C:/strawberry/perl/site/lib/XML/Rules.pm line 933. That line is in XML::Rules's sub _parse_or_filter_chunk, and for me it raises the question of whether you need to read chunks at all. The error message I got comes from an eval calling parse_more($string). A few lines up, near the beginning of the routine, is a line reading croak "This parser is already busy parsing a full document!"

So the question is: have you already read in the whole document, and if so, is there another method, say parse, that should be used instead of parse_chunk?

UPDATE

I tried it again, this time using parse:


    my $summary = $quake->get("summary");
    print $summary;
    # while ($summary = $quake->get("summary")) {
        $parser->parse($summary);
    # }
    # my $data = $parser->last_chunk();
    # my $dd = $data->get("dd");
    # print $dd, "\n";

with the result:

C:\Users\JKeys>perl \myperl\quake.pl
# This Quake file created by quake_parsing_9
# Matt Coblentz; Perl version unknown
# For more information, see the USGS website
# Last Updated: 1 17 5 2013, 4:34:55
#

junk after document element at line 1, column 601, byte 601 at C:/strawberry/perl/site/lib/XML/Rules.pm line 745.
<p class="quicksummary"><a href="http://earthquake.usgs.gov/earthquakes/eventpage/usc000hsdj#pager" title="PAGER estimated impact alert level" class="pager-green">PAGER - <strong class="roman">GREEN</strong></a> <a href="http://earthquake.usgs.gov/earthquakes/eventpage/usc000hsdj#shakemap" title="ShakeMap maximum estimated intensity" class="mmi-V">ShakeMap - <strong class="roman">V</strong></a> <a href="http://earthquake.usgs.gov/earthquakes/eventpage/usc000hsdj#dyfi" class="mmi-IV" title="Did You Feel It? maximum reported intensity (5 reports)">DYFI? - <strong class="roman">IV</strong></a></p><dl><dt>Time</dt><dd>2013-06-16 21:39:09 UTC</dd><dd>2013-06-16 23:39:09 +02:00 at epicenter</dd><dt>Location</dt><dd>34.491&deg;N 25.087&deg;E</dd><dt>Depth</dt><dd>37.85 km (23.52 mi)</dd></dl>

So I'm still getting the "junk" message, this time from the parse method. I don't know whether that's the feed, your code, or my tweaks. But it's sleepy time now.
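One possible reading of the error: by my count, byte 601 of the summary printed above is exactly where the closing </p> ends and the <dl> begins. An XML document may have only one top-level element, so expat treats everything after </p> as "junk after document element". The text also contains &deg;, which is an HTML entity and is not predefined in XML, so it would probably trip the parser next. Below is a minimal sketch of a workaround, not a tested fix for the original script. The helper name wrap_fragment and the <summary> wrapper element are made up for illustration, and the sample string is a shortened stand-in for the real feed text. The optional check uses XML::Parser, which as far as I know is the expat wrapper underneath XML::Rules, and is skipped if that module is not installed.

```perl
use strict;
use warnings;

# Shortened stand-in for the feed's summary text: two sibling top-level
# elements (<p> then <dl>) plus the HTML-only entity &deg;.
my $summary = '<p class="quicksummary">PAGER - <strong>GREEN</strong></p>'
            . '<dl><dt>Location</dt><dd>34.491&deg;N 25.087&deg;E</dd></dl>';

# Hypothetical helper (not part of XML::Rules): turn the fragment into a
# well-formed XML document.
sub wrap_fragment {
    my ($fragment) = @_;

    # &deg; is defined in HTML but not in XML; use the numeric reference.
    $fragment =~ s/&deg;/&#176;/g;

    # Enclose everything in one element so there is a single document element.
    return "<summary>$fragment</summary>";
}

my $wrapped = wrap_fragment($summary);
print "$wrapped\n";

# Optional well-formedness check, only if XML::Parser is available.
if (eval { require XML::Parser; 1 }) {
    for my $doc ($summary, $wrapped) {
        my $ok = eval { XML::Parser->new->parse($doc); 1 };
        if ($ok) {
            print "well-formed\n";
        }
        else {
            (my $err = $@) =~ s/\s+\z//;
            print "error: $err\n";
        }
    }
}
```

If this reading is right, passing the wrapped string to $parser->parse should get past the "junk" error, at the cost of an extra <summary> level that the rules would have to account for.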

In reply to Re: XML parsing with XML::Rules by jakeease
in thread XML parsing with XML::Rules by mcoblentz
