Re: xml parsing without using cpan modules
by edan (Curate) on Aug 10, 2004 at 11:53 UTC
|
XML can be parsed by writing an XML parser, of course. You'll need to read and understand the XML W3C Recommendation. As you'll probably find out quite soon, implementing this is not an easy task. That is why people like to use the well-tested and trusted modules that have done all this hard work already. Nothing is stopping you from re-inventing this rather large wheel, though.
| [reply] |
Re: xml parsing without using cpan modules
by gellyfish (Monsignor) on Aug 10, 2004 at 12:01 UTC
|
Sure it can; however, I think that if you take a look at this first, you will realize that it's one of those "If you have to ask then you shouldn't try it" situations. /J\
| [reply] |
Re: xml parsing without using cpan modules
by graff (Chancellor) on Aug 10, 2004 at 13:15 UTC
|
You could try downloading one or more of the relevant module tar files from CPAN, and look at the source code to see how they do it.
You'll find that some (probably most) of them use a separate library package called "expat", which is written in C and is not part of CPAN (it is a toolkit for XML that is used in many non-Perl applications, as well as in Perl modules). The fact that the Perl XML parser modules depend on this "outside" resource causes pain for some people when they try to install these modules. But once you find the expat package, installing it is simple, and then installing the perl XML parser modules is also simple.
At the risk of leading you down a dangerous path, I would also point out that "XML data" can cover a broad range in terms of ease vs. difficulty for parsing. As a markup language, it can express very complex data structures as well as very simple ones.
If you need to process a set of XML data that you know has very simple structure, uses only a very limited range of XML expressions, and is fully predictable in terms of how the tags are rendered, then you could write a script that uses just regular expressions or string comparisons to handle such data, and not use a module.
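As a rough illustration of that kind of script, here is a minimal sketch. It assumes a hypothetical input in which every record is a flat `<item>` element with no attributes, no nested elements of the same name, no comments and no CDATA sections. The tag names and the `extract_items` helper are invented for the example. It is not a general XML parser and will silently misread anything outside those assumptions.

```perl
use strict;
use warnings;

# Hypothetical sample data: flat, predictable, machine-generated.
my $xml = <<'XML';
<items>
  <item><name>apple</name><price>1.20</price></item>
  <item><name>pear</name><price>0.95</price></item>
</items>
XML

# Pull each <item>...</item> block out, then collect its child
# elements into a hash. Only valid for the simple shape assumed above.
sub extract_items {
    my ($text) = @_;
    my @items;
    while ($text =~ m{<item>(.*?)</item>}gs) {
        my $body = $1;
        my %fields;
        # Match <tag>text</tag> pairs whose text contains no markup.
        while ($body =~ m{<(\w+)>([^<]*)</\1>}g) {
            $fields{$1} = $2;
        }
        push @items, \%fields;
    }
    return @items;
}

my @items = extract_items($xml);
printf "%s costs %s\n", $_->{name}, $_->{price} for @items;
```

If the generator ever starts emitting attributes, entities, or different whitespace inside tags, a script like this needs rewriting, which is the trade-off against using a real parser.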
But when you become familiar with the use of the XML modules, you will find that they typically allow you to write perl scripts that are shorter and easier to maintain and adapt. And of course, if and when you are faced with some really complex XML data, the appropriate XML module will be a life-saver. | [reply] |
•Re: xml parsing without using cpan modules
by merlyn (Sage) on Aug 10, 2004 at 13:40 UTC
|
| [reply] |
Re: xml parsing without using cpan modules
by tomhukins (Curate) on Aug 10, 2004 at 14:14 UTC
|
| [reply] |
|
|
And more importantly, he knows the XML generators whose output he is dealing with, which means he doesn't have to account for all plausible cases — only those the generators he is dealing with take advantage of.
If you want to deal with XML in the general case, then you do have to parse, no way around it.
Makeshifts last the longest.
| [reply] |
|
I've done it as well, but I've been using Perl to parse HTML since 1994 and XML for four years. In those cases, I also have the file creator within spitting distance, "and he was a poor spitter, lacking both distance and control"(*), so I could literally beat any of them over the head if I wanted to. If I didn't have a lot of experience, I'd never do it, and if the file provider isn't within strangling distance, I go with CPAN modules.
In short, you can do it, but you probably shouldn't do it.
(*) - P.G. Wodehouse, Money in the Bank
--
tbone1, YAPS (Yet Another Perl Schlub)
And remember, if he succeeds, so what.
- Chick McGee
| [reply] |
Re: xml parsing without using cpan modules
by Scarborough (Hermit) on Aug 10, 2004 at 11:42 UTC
|
I have done this type of thing in the past for very simple data extraction, the method being:
1. Read the file line by line.
2. Use regex to find tags.
3. Use tags to be the keys and the values to be the values.
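The three steps above might look roughly like this. It is only a sketch: it assumes each line holds at most one complete `<tag>value</tag>` pair, it reads from an in-memory string standing in for a real file, and the tag names are made up for the example.

```perl
use strict;
use warnings;

# Stand-in for a real file; in practice open a filename instead.
my $data = "<config>\n<host>example.org</host>\n<port>8080</port>\n</config>\n";
open my $fh, '<', \$data or die "Cannot open in-memory data: $!";

my %value_of;
while (my $line = <$fh>) {                   # 1. read line by line
    if ($line =~ m{<(\w+)>([^<]*)</\1>}) {   # 2. regex finds a tag pair
        $value_of{$1} = $2;                  # 3. tag is the key, text the value
    }
}
close $fh;

print "$_ = $value_of{$_}\n" for sort keys %value_of;
```

Lines that carry only an opening or closing tag, such as `<config>`, simply fail to match and are skipped. Repeated tags overwrite earlier values, so this only suits data where each tag name appears once.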
But XML::Simple is much better as it is so much more powerful, doing things like integrity checks etc. To replicate all its features would be a sizeable piece of work.
| [reply] |
Re: xml parsing without using cpan modules
by gmpassos (Priest) on Aug 11, 2004 at 05:24 UTC
|
"xml parsing without using cpan modules"
Hmm, don't do that! XML syntax is not so simple!
If you want a pure Perl parser, you can take a look at XML::Smart::Parser, which works similarly to XML::Parser. Then plug your app into it and you're done, or just use XML::Smart ;-P.
Graciliano M. P.
"Creativity is the expression of the liberty".
| [reply] |