OK, I've been doing network and database programming in Perl for years now. I had a brief run-in with HTML::Parser at some point in my career, and it took me a _long_ time to grasp it. Now I need to write a parser that follows XLinks and pulls the content of the linked documents into the parent document, essentially treating the XLinks as XIncludes. It seems simple, and I'm 100% sure it is; I'm just not getting the XML parsing modules. This is the third different way I've tried this, and the third time I've gotten the exact same error message.

First I tried XML::SAX, writing filters along the lines of an xml.com article on following XIncludes. I get this error message:
syntax error at line 1, column 0, byte 0 at /usr/lib/perl5/vendor_perl/5.6.1/i386-linux/XML/Parser.pm line 185

So then I figure, "hey, I did it wrong, and I don't understand." I search around some more and find another xml.com article about filtering with XML::SAX::Machines. I rewrite my parser using XML::SAX::Machines and, alas, get the SAME error message.

So I spend all day debugging and get nowhere. I admit that I'm making things more complicated in attempting to understand the parser routines, and I thought I had a grasp of how they at least functioned to gather information out of an XML document. I reread everything and attempted another implementation using XML::Parser.

The code follows ...
#!/usr/bin/perl
use strict;
use warnings;
use XML::Parser;
use LWP::Simple;

our %XLINK;

my $parser = new XML::Parser(
    Handlers => {
        Start => \&handle_start,
        End   => \&handle_end
    }
);

$parser->parse('test.xml');

sub handle_start {
    my $expat   = shift;
    my $element = shift;
    my %attrs   = @_;

    foreach my $attr (keys %attrs) {
        my ($ns, $elm) = split /\:/, $attr, 2;
        next unless $ns =~ /xmlns/i;
        if ($attrs{$attr} eq 'http://www.w3.org/1999/xlink') {
            $XLINK{label}   = $elm;
            $XLINK{element} = $element;
            last;
        }
    }

    my %link = ();
    if ($XLINK{label}) {
        foreach my $attr (grep /$XLINK{label}/, keys %attrs) {
            my ($ns, $a) = split /\:/, $attr, 2;
            next unless $ns eq $XLINK{label};
            $link{lc $a} = $attrs{$a};
        }
        if (exists $link{href} && $link{type} eq 'simple') {
            print retrieve($link{href});
        }
    }
}

sub handle_end {
    my $expat   = shift;
    my $element = shift;
    %XLINK = () if $element eq $XLINK{element};
}

sub retrieve {
    my $url = shift;
    return get($url);
}

Here is test.xml:
<?xml version='1.0'?>
<test>
  <remote xmlns:xlink="http://www.w3.org/1999/xlink"
          xlink:type="simple"
          xlink:title="testing this thing"
          xlink:href="http://divisionbyzero.net/~brad/xml/test.xml">
    Testing this thing
  </remote>
  <local>
    <cat name="chunky">
      <kitten>funky</kitten>
      <kitten>monkey</kitten>
    </cat>
  </local>
</test>

I get the same error as before, and I was wondering if someone could correct my thinking on this simple example, so that it might shed some light on my dismal XML::Parser comprehension.

much obliged,

-brad..
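
A possible lead on the error, offered as a sketch rather than a confirmed diagnosis of the setup above: in XML::Parser, parse() expects the XML itself (a string or an open filehandle), while parsefile() takes a path. Handing a file name to parse() makes expat read the name as if it were the document, and that fails at the very first byte. The sample document, the temporary file, and the variable names below are illustrative assumptions, not part of the original script.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Parser;
use File::Temp qw(tempfile);

# Hypothetical sample document, standing in for test.xml.
my $xml = '<test><kitten>funky</kitten></test>';

# Record element names so we can see what the parser visited.
my @seen;
my $parser = XML::Parser->new(
    Handlers => {
        Start => sub { my ($expat, $element) = @_; push @seen, $element },
    },
);

# 1. parse() given the XML text itself.
$parser->parse($xml);
my @from_string = @seen;

# 2. parse() given a file name: the name is parsed as the document and dies.
@seen = ();
my $error = '';
eval { $parser->parse('test.xml'); 1 } or $error = $@;

# 3. parsefile() opens the named file and parses its contents.
my ($fh, $path) = tempfile(SUFFIX => '.xml', UNLINK => 1);
print {$fh} $xml;
close $fh or die "close failed: $!";
@seen = ();
$parser->parsefile($path);
my @from_file = @seen;

print "from string: @from_string\n";
print "from file:   @from_file\n";
print "parse('test.xml') failed with: $error\n";
```

If that is what is happening here, swapping parse('test.xml') for parsefile('test.xml') in the script above would be the first thing to try. The SAX-based attempts would need a separate look, since their code is not shown.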

In reply to XML::Parser question. by reyjrar
