XML::Parser and multiple results

Ineffectual has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
Re: XML::Parser and multiple results by leriksen (Curate) on Feb 19, 2004 at 01:28 UTC
Try wrapping them in a <results> tag, and either add id attributes or remove the numbers: `<results> <output> <program></program> <version></version> <information></information> <data></data> </output> <output id="2"> ... </output> </results>` You're only allowed one root element in XML, IIRC.
+++++++++++++++++ #!/usr/bin/perl use warnings;use strict;use brain;
Re: XML::Parser and multiple results by borisz (Canon) on Feb 19, 2004 at 01:08 UTC
Maybe I've got it wrong, but if you have `<output> ... </output> <output 2> </output 2>` that is not XML. What you can try is to transform your ill-formed XML into valid XML and parse again. `<output> ... </output> <output> </output>` would be fine. Or, if you need the number, `<output> ... </output> <output id="2"> </output>` Boris
Re: XML::Parser and multiple results by kvale (Monsignor) on Feb 19, 2004 at 01:12 UTC
Hmm, I don't think `<output 2>` is a valid element in XML. Whitespace is used to separate the element name from its attributes, and `2` is not a valid attribute. XML elements must follow these naming rules:
`Names can contain letters, numbers, and other characters`
`Names must not start with a number or punctuation character`
`Names must not start with the letters xml (or XML or Xml ..)`
`Names cannot contain spaces`
So it would be best to fix the XML first. -Mark
Re: Re: XML::Parser and multiple results by Ineffectual (Scribe) on Feb 19, 2004 at 01:15 UTC
I was giving example elements and not real elements. The XML is correctly formed other than the fact that there are multiple results in the same file. Each Output on its own is a valid file. I don't really want to pollute my filesystem with 46k -> 100k individual files that are clunky to move around, so I'm trying to figure out if there's a way to parse one file instead of splitting each output into its own file. From what I can see from the responses, the only way would be to have Perl split the big file into a hash or array and feed each piece into XML::Parser as an individual "file"... Does this seem to be correct? Ineff
Re: Re: Re: XML::Parser and multiple results by runrig (Abbot) on Feb 19, 2004 at 01:21 UTC
If that's the case then maybe all you need to do is wrap your file in one root node to make it valid XML. See Re: Parsing with SAX an XML Document with no Root Node for an example.
Re: Re: Re: Re: XML::Parser and multiple results by Ineffectual (Scribe) on Feb 19, 2004 at 18:41 UTC
Re: Re: Re: XML::Parser and multiple results by chromatic (Archbishop) on Feb 19, 2004 at 01:21 UTC
Does this XML need to validate against a schema? If not, could you prepend an opening tag and append a closing tag, wrapping multiple nodes in one parent node?
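A minimal sketch of the wrap-in-one-root idea suggested in several replies, assuming the concatenated documents fit in memory. The sample data and the program names are made up for illustration, and the `<results>` root name is borrowed from leriksen's reply. One caveat worth stating: if each document carries its own `<?xml ...?>` declaration, those have to be removed first, because a declaration is only allowed at the very start of a document.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Parser;

# Hypothetical sample: two complete <output> documents back to back,
# each with its own XML declaration.
my $multi = <<'XML';
<?xml version="1.0"?>
<output><program>blast</program><version>1.0</version></output>
<?xml version="1.0"?>
<output><program>fasta</program><version>2.0</version></output>
XML

# Drop the per-document declarations, then wrap everything in one root.
$multi =~ s/<\?xml\b[^>]*\?>//g;
my $wrapped = "<results>\n$multi</results>\n";

# Collect the text of every <program> element as a simple demonstration.
my @programs;
my $in_program = 0;

my $parser = XML::Parser->new(
    Handlers => {
        Start => sub {
            my ($expat, $element) = @_;
            if ($element eq 'program') {
                $in_program = 1;
                push @programs, '';
            }
        },
        Char => sub {
            my ($expat, $text) = @_;
            $programs[-1] .= $text if $in_program;
        },
        End => sub {
            my ($expat, $element) = @_;
            $in_program = 0 if $element eq 'program';
        },
    },
);

$parser->parse($wrapped);    # dies with a parse error if not well-formed

print "$_\n" for @programs;
```

For a file holding tens of thousands of documents, reading everything into one string may be too heavy; in that case writing the opening and closing tags around the file on disk, then parsing it with `parsefile`, would be the same idea without the memory cost.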
Re: Re: Re: XML::Parser and multiple results by borisz (Canon) on Feb 19, 2004 at 01:26 UTC
"From what I can see from the responses, the only way would be to have perl split the big file into a hash or array and feed that into XML::Parser as individual "files"... Does this seem to be correct?"
No, it is perfectly valid to have any number of tags there, if your output is valid. But `XML::Parser` stops, so I doubt it is valid. Boris
Re: XML::Parser and multiple results by mirod (Canon) on Feb 19, 2004 at 08:39 UTC
An XML document can only have 1 (one) root. Hence your XML is not valid. Fortunately XML::Parser has a `Stream_Delimiter` option:
* Stream_Delimiter
This is an Expat option. It takes a string value. When this string is found alone on a line while parsing from a stream, then the parse is ended as if it saw an end of file. The intended use is with a stream of xml documents in a MIME multi-part format. The string should not contain a trailing newline.
Re: Re: XML::Parser and multiple results by Ineffectual (Scribe) on Feb 19, 2004 at 19:58 UTC
I'm not really sure what this could do.. If I put a stream delimiter at the end of my really big file, will it parse all the way through it? If so, could I use that to load the entire file into a really big hash? Thanks. Ineff
Re: Re: Re: XML::Parser and multiple results by mirod (Canon) on Feb 19, 2004 at 20:46 UTC
You don't have 1 XML document; as it is, you have a number of them in a single file. You could insert stream delimiters between those documents to get XML::Parser to understand that they should be treated as separate XML documents. You could also wrap them all in a root tag, either in the main file, or by using an entity that includes it, see Re: XML log files. Oh, and I found a FAQ about it in the XML::Twig FAQ :--) You really have to understand that at this point you don't have XML. If it doesn't parse, then it is not XML. If you want to have XML you have to get your data to be XML, or use the mildly hacky stream delimiter option.
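A rough sketch of the stream delimiter approach, assuming a delimiter line has already been inserted between the documents. The delimiter string and the sample documents below are hypothetical, and an in-memory filehandle stands in for the real file. Each call to `parse` consumes one document and stops at the delimiter, so a fresh parser is created for every document until the handle is exhausted.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Parser;

# Hypothetical delimiter: any string that never appears alone on a line
# inside the data will do.
my $delim = '--next-output--';

# Hypothetical sample stream: three complete documents with the delimiter
# alone on a line between them.
my $stream = join "\n$delim\n",
    '<output><program>alpha</program></output>',
    '<output><program>beta</program></output>',
    '<output><program>gamma</program></output>';
$stream .= "\n";

# An in-memory filehandle stands in for the real (large) file.
open my $fh, '<', \$stream or die "Cannot open in-memory stream: $!";

my @programs;    # one entry per document parsed
until (eof $fh) {
    my $text       = '';
    my $in_program = 0;

    # A parser object handles one document, so build a new one each time.
    my $parser = XML::Parser->new(
        Stream_Delimiter => $delim,
        Handlers         => {
            Start => sub { $in_program = 1 if $_[1] eq 'program' },
            End   => sub { $in_program = 0 if $_[1] eq 'program' },
            Char  => sub { $text .= $_[1] if $in_program },
        },
    );

    $parser->parse($fh);    # stops at the delimiter or at end of file
    push @programs, $text;
}
close $fh;

print scalar(@programs), " documents: @programs\n";
```

Because only one document is read per iteration, this avoids holding the whole file in memory, which is the main reason to prefer it over wrapping everything in a single root when the file is very large.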

