clinton has asked for the wisdom of the Perl Monks concerning the following question:
I receive XML files from my client with a list of orders. The XML (over which I have no control) may or may not contain some dodgy characters, which would cause XML parsing to fail.
However, I would like to process all the orders that I can, and report errors on the dodgy ones.
My understanding is that, if I use

    $parser = XML::LibXML->new();
    $doc = $parser->parse_file( $xmlfilename );

then the file will be parsed quickly, but will either succeed or fail in its entirety.
I was thinking about using this as the first method, for speed, and if it fails, resorting to something like:

    $parser = XML::LibXML->new();
    local $/ = '</order>';    # read one order per "line"
    open (FH, '<:utf8', $filename) or die $!;
    while (my $order = <FH>) {
        # Strip anything before <order>; skip chunks with no order
        # in them (e.g. the trailing </orders>).
        next unless $order =~ s/^.*?<order>/<order>/s;
        my $xml = <<XML;
    <?xml version="1.0" encoding="UTF-8"?>
    <orders>
    $order
    </orders>
    XML
        my $doc = eval { $parser->parse_string($xml) };
        if ($@) {
            warn ("error : $@");
            next;
        }
        process_orders($doc);
    }

Or should I be creating one master document and importing/adopting nodes? Or a different approach entirely?

Thanks
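For illustration, the two-stage idea described above (fast whole-file parse first, per-order fallback only on failure) could be sketched roughly as follows. This is only a sketch under assumptions: `parse_orders` is a hypothetical helper name, the file is assumed to be UTF-8 with plain `<order>` start tags (no attributes) directly under an `<orders>` root, and the file is read in raw mode so that bad bytes reach the parser unchanged.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical helper (not from the original post). Returns two array refs:
# the <order> elements that parsed, and error messages for those that did not.
sub parse_orders {
    my ($filename) = @_;
    my $parser = XML::LibXML->new();

    # Fast path: parse the whole file in one go.
    my $doc = eval { $parser->parse_file($filename) };
    return ( [ $doc->findnodes('/orders/order') ], [] ) if $doc;

    # Slow path: read one </order>-terminated chunk at a time.
    my ( @good, @bad );
    local $/ = '</order>';
    open( my $fh, '<:raw', $filename ) or die "Cannot open $filename: $!";
    while ( my $chunk = <$fh> ) {
        # Drop everything before the first <order>; skip chunks that
        # contain no order at all (for example the trailing </orders>).
        next unless $chunk =~ s/^.*?(?=<order>)//s;

        my $order_doc = eval { $parser->parse_string($chunk) };
        if ($order_doc) {
            push @good, $order_doc->documentElement;
        }
        else {
            push @bad, "order $.: $@";    # $. is the chunk number here
        }
    }
    close $fh;
    return ( \@good, \@bad );
}
```

A caller would then do something like `my ($good, $bad) = parse_orders($xmlfilename);`, process each element in `@$good`, and report `@$bad`. Each chunk is parsed as its own small document here rather than being wrapped in `<orders>`; either works, and this one avoids the heredoc.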
Replies are listed 'Best First'.

Re: Parsing dodgy XML
by mirod (Canon) on Sep 20, 2006 at 14:54 UTC
by clinton (Priest) on Sep 20, 2006 at 15:17 UTC

Re: Parsing dodgy XML
by shmem (Chancellor) on Sep 20, 2006 at 14:55 UTC

Re: Parsing dodgy XML
by merlyn (Sage) on Sep 20, 2006 at 15:26 UTC
by clinton (Priest) on Sep 20, 2006 at 15:38 UTC
by merlyn (Sage) on Sep 20, 2006 at 15:42 UTC
by Anonymous Monk on Sep 20, 2006 at 19:46 UTC
by bart (Canon) on Sep 21, 2006 at 06:26 UTC
by codeacrobat (Chaplain) on Sep 20, 2006 at 21:19 UTC

OT: Parsing dodgy XML
by astroboy (Chaplain) on Sep 21, 2006 at 09:45 UTC