bobf has asked for the wisdom of the Perl Monks concerning the following question:
I'm trying to parse an XML doc with XML::Simple* (I originally wrote XML::Twig here), which apparently calls XML::SAX::PurePerl under the hood. At some point in the 18 MB XML file the parser chokes and throws an exception:
Invalid quote token [Ln: 1, Col: 14]
The problem is, I don't know what line in the file caused the error. I found the method that generates that output (I think) in XML::SAX::PurePerl and tried to redefine it to print the input data. I added the following code to the bottom of my script:
    {
        package XML::SAX::PurePerl;

        sub quote {
            my ($self, $reader) = @_;

            my $data = $reader->data;

            # Original
            # $data =~ /^(['"])/ or $self->parser_error("Invalid quote token", $reader);

            # Modified
            $data =~ /^(['"])/ or do {
                $data =~ /^(.)/;
                my $quote_char = $1;
                warn "Invalid quote token found: -->$quote_char<--\n"
                   . "Source line follows:\n"
                   . $data;
                $self->parser_error("Invalid quote token", $reader);
            };
            $reader->move_along(1);
            return $1;
        }
    }
Now, however, instead of displaying the warning message when the system hits the mystery line, I get a different error:
Can't locate object method "new" via package "XML::SAX::PurePerl" at C:/Perl_5.8.8/site/lib/XML/SAX/ParserFactory.pm line 43.
I thought I was using package appropriately, but I obviously changed more than I expected. I'm a bit rusty (haven't had much time to code lately) and I think I'm missing a fundamental concept. Whack away with the clue-stick, please.
Thanks
*Updated after additional debugging, per Re^2: Redefining a method in XML::SAX::PurePerl.
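For readers following along, one common way to redefine a method in a package that is already loaded is to keep a reference to the original sub and install a wrapper through a glob assignment. The sketch below is only an illustration under stated assumptions: `My::Parser` and its simplified `quote` method are hypothetical stand-ins defined inline, not the real XML::SAX::PurePerl code, and `@logged` is just a convenient place to collect diagnostics. With a real module, the module would presumably need to be loaded first (for example with `require XML::SAX::PurePerl;`) so that its other methods, such as `new`, exist before one of them is replaced. The error quoted above is consistent with the module never having been loaded, though the post itself does not confirm that.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a parser class that would normally live in its
# own .pm file and be loaded with "require" before being patched.
{
    package My::Parser;

    sub new { my ($class) = @_; return bless {}, $class }

    # Returns the leading quote character, or dies if there is none.
    sub quote {
        my ($self, $data) = @_;
        $data =~ /^(['"])/ or die "Invalid quote token\n";
        return $1;
    }
}

# Wrap the existing method: keep a reference to the original, then install
# a replacement that records the offending input before delegating to it.
my @logged;    # collected diagnostics, so the example is easy to inspect
{
    no warnings 'redefine';
    my $orig = \&My::Parser::quote;
    *My::Parser::quote = sub {
        my ($self, $data) = @_;
        push @logged, $data unless $data =~ /^['"]/;
        return $self->$orig($data);
    };
}

my $parser = My::Parser->new;

# Valid input passes through the wrapper unchanged.
my $ok = $parser->quote(q{"value"});

# Invalid input is recorded by the wrapper, then the original still dies.
my $err = eval { $parser->quote('value'); 1 } ? '' : $@;

print "quote char: $ok\n";
print "logged: $logged[0]\n";
print "error: $err";
```

Because the wrapper delegates to the saved original rather than copying its body, the rest of the class keeps working as before; only the extra diagnostic is added.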
Replies are listed 'Best First'.

Re: Redefining a method in XML::SAX::PurePerl
by Anonymous Monk on Jul 31, 2009 at 18:21 UTC
by bobf (Monsignor) on Jul 31, 2009 at 20:09 UTC
by Anonymous Monk on Jul 31, 2009 at 23:58 UTC

Re: Redefining a method in XML::SAX::PurePerl
by mirod (Canon) on Aug 01, 2009 at 06:35 UTC
by bobf (Monsignor) on Aug 02, 2009 at 01:54 UTC
by mirod (Canon) on Aug 03, 2009 at 08:41 UTC