gregor42 has asked for the wisdom of the Perl Monks concerning the following question:

Recently I wrote a piece of code to make human readable digest versions of XML logfiles that are generated from the Kana customer service email server.

Once I was done I started testing my code & voila, I was bitten! Parsing XML by hand, I had forgotten that attributes can come in any order, so my code broke when I passed in a logfile generated by Kana's email-a-log function, in which the attributes of a certain tag are given in a different order.

So... I set out to learn how to use XML::Parser & stop trying to reinvent the wheel... DUH!

It works! And (not so)incredibly it's almost 100% faster than the routine I wrote.

Now I have a different problem.

I find that when I call $p->parsefile() I'm getting the first line of the XML file printed to the screen (presumably to STDERR). For example:

<?xml:stylesheet type="text/xsl" href="log.xsl"?>

Now I assume that this is going to STDERR because I found that when I put in a die, parsefile would fail. However, when I comment it out, it continues to function properly...Like so:

my $p = new XML::Parser(Style => 'Stream')
    or die("Couldn't make new Parser\n$GOODBYE\n");
$p->parsefile($logfile, ErrorContext => 0);
    # or die("Couldn't invoke parsefile in XML::Parser...\n$GOODBYE\n");

Why is it doing this & how do I handle it more gracefully?


I went so far as to modify StartTag to try to handle it, but that didn't do anything either...

sub StartTag {
    my ($expat, $eltype) = @_;
    # In Stream style the attributes of the current tag arrive in %_.
    if ($eltype eq "log") {
        $version   = $_{version};
        $datestamp = $_{date};
    } elsif ($eltype eq "node") {
    } elsif ($eltype eq "component") {
    } elsif ($eltype eq "?xml:stylesheet") {
    } elsif ($eltype eq "m") {
        $p[$i]              = $_{p};
        $component_type[$i] = $_{c};
        $datetime[$i]       = $_{d};
        $u[$i]              = $_{u};
        $sequential_id[$i]  = $_{i};
        $node_id[$i]        = $_{n};
        $severity[$i]       = $_{s};
        $source[$i]         = $_{t};
    } else {
        die "invalid element: $eltype";
    }
}

So what am I doing wrong?



Wait! This isn't a Parachute, this is a Backpack!

Re: XML::Parser trouble
by mirod (Canon) on Apr 04, 2001 at 20:58 UTC

    Your problem is that <?xml:stylesheet type="text/xsl" href="log.xsl"?> is a processing instruction, not a start tag. You use the Stream style, which by default, if no handler is set, outputs the input as-is.

    You need to write a subroutine named PI, which will be called as per the XML::Parser doc: "The $_ variable will contain a copy of the PI and the target and data are sent as 2nd and 3rd parameters respectively." The target here is xml:stylesheet and the data is type="text/xsl" href="log.xsl".
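A minimal sketch of such a handler, for illustration only. The sample XML, the @pis and @tags arrays and the attribute values are made up for the example and are not taken from the original log files; the handler names (PI, StartTag, EndTag, Text) are the ones the Stream style looks up in the calling package.

```perl
use strict;
use warnings;
use XML::Parser;

# Hypothetical sample input, loosely modelled on the log file in the question.
my $xml = <<'XML';
<?xml:stylesheet type="text/xsl" href="log.xsl"?>
<log version="1.0" date="2001-04-04"><m p="x" s="3"/></log>
XML

my (@pis, @tags);

# Called for each processing instruction; $_ holds a copy of the whole PI.
sub PI {
    my ($expat, $target, $data) = @_;
    push @pis, { target => $target, data => $data, raw => $_ };
}

# Called for each start tag; the attributes are in %_.
sub StartTag {
    my ($expat, $eltype) = @_;
    push @tags, $eltype;
}

# Empty handlers so that end tags and text are not echoed either.
sub EndTag { }
sub Text   { }

my $parser = XML::Parser->new(Style => 'Stream');
$parser->parse($xml);

print "PI target: $pis[0]{target}\n";
print "PI data:   $pis[0]{data}\n";
print "Tags seen: @tags\n";
```

Recording the PI rather than discarding it is just a choice for the demo; an empty sub PI {} is enough if the instruction should simply be swallowed.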

      Excellent! Many thanks brother mirod..

      All that I needed to do was to add:

       sub PI{}

      which handles (ignores) the processing instruction.

      Many thanks again, and Boomshanka!



      Wait! This isn't a Parachute, this is a Backpack!
Re: XML::Parser trouble
by arturo (Vicar) on Apr 04, 2001 at 20:57 UTC

    If you want to see whether it's going to STDERR, one thing you can do on *nix is to redirect STDERR to a file with perl xmlparserscript.pl 2> errors.txt. This should be of general use.
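The same check can be sketched from inside Perl. This assumes that the Stream style's fallback for a missing handler is a plain print to the currently selected filehandle (normally STDOUT, not STDERR), which is how I understand the module to behave; the sample XML and the $captured buffer are made up for the example.

```perl
use strict;
use warnings;
use XML::Parser;

# Hypothetical sample input; note that no PI handler is defined below.
my $xml = <<'XML';
<?xml:stylesheet type="text/xsl" href="log.xsl"?>
<log version="1.0"><m p="x"/></log>
XML

# Silence everything except processing instructions.
sub StartTag { }
sub EndTag   { }
sub Text     { }

# Point the default output handle at an in-memory buffer for the parse.
my $captured = '';
open(my $buffer_fh, '>', \$captured) or die "Cannot open in-memory handle: $!";
my $previous_fh = select($buffer_fh);

XML::Parser->new(Style => 'Stream')->parse($xml);

# Restore the original default handle before printing the result.
select($previous_fh);
close($buffer_fh);

print "Echoed by the Stream style: $captured\n";
```

If the echoed text shows up in $captured, it was written to the default output handle, so a 2> redirect on the command line would not have caught it.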

    But if you're getting an error, chances are the XML is not well-formed. I'm not *certain* of this, but the processing instruction for stylesheets is not xml:stylesheet but xml-stylesheet.

    Update: I tried it with a little XML::Sablotron script (Sablotron is an XSLT processor which uses expat as a parser), and I got an error with <?xml:stylesheet ...> and not with <?xml-stylesheet ...>, so that's probably it.
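To find out whether a given document really fails to parse, one option is to wrap the parse in eval. The XML::Parser documentation says a parse error is raised with die, and that otherwise the return value is 1 or whatever the style's Final handler returns, so testing the result of parsefile with "or die" may fire even when nothing went wrong. A minimal sketch, with a hypothetical helper parse_error() and made-up sample strings:

```perl
use strict;
use warnings;
use XML::Parser;

# Empty Stream handlers so nothing is echoed during the check.
sub StartTag { }
sub EndTag   { }
sub Text     { }
sub PI       { }

my $parser = XML::Parser->new(Style => 'Stream');

# Hypothetical helper: returns '' on success, or the parser's error message.
sub parse_error {
    my ($xml) = @_;
    my $ok = eval { $parser->parse($xml); 1 };
    return $ok ? '' : ($@ || 'unknown parse error');
}

my $good = parse_error('<log><m p="x"/></log>');   # well-formed
my $bad  = parse_error('<log><m></log>');          # <m> is never closed

print "good document: ", ($good eq '' ? "parsed fine" : "failed: $good"), "\n";
print "bad document:  ", ($bad  eq '' ? "parsed fine" : "failed: $bad"),  "\n";
```

The trailing 1 inside the eval block makes the success test independent of whatever parse itself returns.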

    HTH!

    Philosophy can be made out of anything. Or less -- Jerry A. Fodor