Re: XML::Parser and multiple results
by leriksen (Curate) on Feb 19, 2004 at 01:28 UTC
|
Try wrapping them in a <results> tag, and either add id attributes or remove the numbers:
<results>
<output>
<program></program>
<version></version>
<information></information>
<data></data>
</output>
<output id="2">
...
</output>
</results>
You're only allowed one root element in XML, IIRC.
+++++++++++++++++
#!/usr/bin/perl
use warnings;use strict;use brain;
| [reply] [d/l] |
Re: XML::Parser and multiple results
by borisz (Canon) on Feb 19, 2004 at 01:08 UTC
|
Maybe I've got it wrong, but if you have
<output>
...
</output>
<output 2>
</output 2>
that is not XML. What you can try is to transform your ill-formed XML into well-formed XML and parse again.
<output>
...
</output>
<output>
</output>
This would be fine. Or, if you need the number:
<output>
...
</output>
<output id="2">
</output>
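As a rough sketch of the transformation suggested above, the numbered tags could be rewritten with two substitutions before parsing. The function name fix_numbered_tags is made up for illustration, and the code assumes the number is the only thing following the tag name:

```perl
use strict;
use warnings;

# Rewrite numbered tags such as <output 2> ... </output 2> into
# <output id="2"> ... </output>. Assumption: the number is the only
# thing after the tag name; tags with real attributes are left alone.
sub fix_numbered_tags {
    my ($text) = @_;
    $text =~ s{<(\w+)\s+(\d+)\s*>}{<$1 id="$2">}g;   # opening tags
    $text =~ s{</(\w+)\s+\d+\s*>}{</$1>}g;           # closing tags
    return $text;
}

my $broken = "<output>\n</output>\n<output 2>\n</output 2>\n";
print fix_numbered_tags($broken);
```

A regex pass like this is only reasonable when the input is regular and machine-generated; it is not a general XML repair tool.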
| [reply] [d/l] [select] |
Re: XML::Parser and multiple results
by kvale (Monsignor) on Feb 19, 2004 at 01:12 UTC
|
Names can contain letters, numbers, and other characters
Names must not start with a number or punctuation character
Names must not start with the letters xml (or XML or Xml ..)
Names cannot contain spaces
So it would be best to fix the XML first.
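The naming rules above can be approximated in a few lines of Perl. This is only an ASCII sketch: is_valid_xml_name is a hypothetical helper, and the real XML Name production also allows colons and many non-ASCII characters.

```perl
use strict;
use warnings;

# Approximate, ASCII-only check of the naming rules listed above.
# Assumption: a leading underscore is accepted, as the XML spec allows it.
sub is_valid_xml_name {
    my ($name) = @_;
    return 0 unless defined $name;
    return 0 if $name =~ /\Axml/i;    # must not start with xml, XML, Xml ...
    # First character: letter or underscore (no digit, no punctuation).
    # Remaining characters: letters, digits, dot, hyphen, underscore (no spaces).
    return $name =~ /\A[A-Za-z_][A-Za-z0-9._-]*\z/ ? 1 : 0;
}

for my $name ('output', 'output2', '2output', 'output 2', 'xmlData') {
    printf "%-12s %s\n", "'$name'", is_valid_xml_name($name) ? 'ok' : 'not allowed';
}
```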
| [reply] [d/l] [select] |
|
I was giving example elements and not real elements. The XML is correctly formed other than the fact that there are multiple results in the same file. Each Output on its own is a valid file. I don't really want to pollute my filesystem with 46k -> 100k individual files that are clunky to move around, so I'm trying to figure out if there's a way to parse one file instead of splitting each output into its own file. From what I can see from the responses, the only way would be to have perl split the big file into a hash or array and feed that into XML::Parser as individual "files"... Does this seem to be correct?
Ineff
| [reply] |
|
From what I can see from the responses, the only way would be to have perl split the big file into a hash or array and feed that into XML::Parser as individual "files"... Does this seem to be correct?
No, it is perfectly fine to have any number of tags there, as long as your output is valid. But XML::Parser stops, so I doubt it is valid.
| [reply] [d/l] |
Re: XML::Parser and multiple results
by mirod (Canon) on Feb 19, 2004 at 08:39 UTC
|
An XML document can have only 1 (one) root element. Hence your file is not well-formed XML.
Fortunately XML::Parser has a Stream_Delimiter option:
* Stream_Delimiter
This is an Expat option. It takes a string value. When this
string is found alone on a line while parsing from a stream,
then the parse is ended as if it saw an end of file. The
intended use is with a stream of xml documents in a MIME multipart
format. The string should not contain a trailing newline.
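A minimal sketch of how the option might be used, assuming XML::Parser is installed from CPAN and that a delimiter line has been placed between the documents. The delimiter string, the sample data and the handler logic are all made up for illustration; each parse call consumes one document, so it is repeated until the end of the file:

```perl
use strict;
use warnings;
use XML::Parser;

# Hypothetical delimiter line; it must never occur inside a document.
my $delim = '==END-OF-DOCUMENT==';

# Sample data: two complete documents separated by the delimiter line.
my $data = join "\n",
    '<output><program>alpha</program></output>',
    $delim,
    '<output><program>beta</program></output>',
    '';

my (@programs, $in_program);
my $parser = XML::Parser->new(
    Stream_Delimiter => $delim,
    Handlers         => {
        # Collect the text of every <program> element.
        Start => sub {
            my ($expat, $tag) = @_;
            if ($tag eq 'program') { $in_program = 1; push @programs, '' }
        },
        End  => sub { $in_program = 0 if $_[1] eq 'program' },
        Char => sub { $programs[-1] .= $_[1] if $in_program },
    },
);

# An in-memory filehandle stands in for the really big file.
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";
# Each call parses one document and stops at the delimiter (or at EOF).
$parser->parse($fh) until eof $fh;
close $fh;

print "$_\n" for @programs;
```

Because every document is handled by its own parse call, the handlers can store results as they go, and the whole file never needs to be held in memory at once.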
| [reply] |
|
I'm not really sure what this could do.. If I put a stream delimiter at the end of my really big file, will it parse all the way through it? If so, could I use that to load the entire file into a really big hash? Thanks. Ineff
| [reply] |
|
You don't have 1 XML document; as it is, you have a number of them in a single file. You could insert stream delimiters between those documents to get XML::Parser to understand that they should be treated as separate XML documents.
You could also wrap them all in a root tag, either in the main file, or by using an entity that includes it, see Re: XML log files.
Oh, and I found a FAQ about it in the XML::Twig FAQ :--)
You really have to understand that at this point you don't have XML. If it doesn't parse, then it is not XML. If you want to have XML, you have to get your data to be XML, or use the mildly hacky stream delimiter option.
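The root-tag approach mentioned above could look roughly like this when done in memory rather than in the main file. wrap_in_root and the <results> name are hypothetical, and the sketch assumes the individual documents carry at most an XML declaration (no DOCTYPE) that has to be dropped:

```perl
use strict;
use warnings;

# Wrap several concatenated documents in a single root element.
# Assumption: <results> (or whatever $root is) is not used by the data itself.
sub wrap_in_root {
    my ($text, $root) = @_;
    $root = 'results' unless defined $root;
    # An XML declaration may only appear at the very start of a document,
    # so remove any that the individual documents carried.
    $text =~ s/<\?xml\b.*?\?>\s*//gs;
    return "<$root>\n$text</$root>\n";
}

my $many = qq{<?xml version="1.0"?>\n<output>one</output>\n}
         . qq{<?xml version="1.0"?>\n<output>two</output>\n};

# The wrapped string has exactly one root and can be handed to a parser.
print wrap_in_root($many);
```

For a file with tens of thousands of documents this builds one very large string, so the stream delimiter route may be kinder to memory.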
| [reply] |