
I’d advise against XML::Parser at this point for two reasons: it’s a wrapper around the rather old (if trusty) expat library, and its API is rather hard to program for, because XML was still in a bit of flux back when expat was written.

For processing XML documents, you want to learn about XPath. A pithy description of what it is might be “a pattern match language for trees.” It lets you specify which portion of a document you’re interested in very concisely. Knowing XPath is the difference between XML being a chore or a charm.
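To give a feel for the “pattern match for trees” idea, here is a minimal sketch. The sample document and its element names are made up for illustration, and I am using XML::LibXML (recommended further down) purely as the vehicle for evaluating the expression.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical sample document, invented for this illustration.
my $xml = <<'XML';
<library>
  <book year="1999"><title>Old Book</title></book>
  <book year="2005"><title>New Book</title></book>
  <book year="2010"><title>Newer Book</title></book>
</library>
XML

my $doc = XML::LibXML->load_xml( string => $xml );

# One XPath expression says: "the title of every book, anywhere in the
# document, whose year attribute is greater than 2000".
my @titles = map { $_->textContent }
    $doc->findnodes('//book[@year > 2000]/title');

print "$_\n" for @titles;
```

The whole selection lives in the one expression; there are no hand-written loops over children and no state to track while walking the tree.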

XML::Twig does make things much easier, but when I last dealt with it, it did not offer real XPath support and worked pretty heavily on the Perl side of things. That means large documents are slow to process and can consume a lot of memory. The memory hunger can be controlled if you pay careful attention and your use case lends itself to processing the document chunk-wise, but that takes effort.
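For reference, chunk-wise processing in XML::Twig looks roughly like the sketch below. The document and the element names (orders, order, total) are hypothetical; with a genuinely large input you would presumably call parsefile on a file rather than parse on a string.

```perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical sample data, invented for this illustration.
my $xml = <<'XML';
<orders>
  <order id="1"><total>10</total></order>
  <order id="2"><total>25</total></order>
  <order id="3"><total>7</total></order>
</orders>
XML

my $sum = 0;

my $twig = XML::Twig->new(
    twig_handlers => {
        # Called each time a complete <order> element has been parsed.
        order => sub {
            my ( $t, $order ) = @_;
            $sum += $order->first_child_text('total');

            # Discard everything parsed so far; without this the whole
            # tree accumulates in memory.
            $t->purge;
        },
    },
);

$twig->parse($xml);

print "sum of totals: $sum\n";
```

The effort I mean is in deciding which elements get handlers and remembering to purge at the right moments; forget it once and the memory savings are gone.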

I’d instead suggest XML::LibXML. It’s a wrapper around the newer, more compliant libxml2 library, which offers the nicer sorts of APIs that were designed after XML was finished; its XPath support is excellent. And since its internal data structures all reside on the C side, it can handle much larger documents than the (more) pure-Perl modules without any effort on the programmer’s part. It’s also much faster than such modules for the same reason.
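A short sketch of what day-to-day XML::LibXML code can look like, assuming a made-up document of user records (the element and attribute names are mine, not from any real schema). It shows XPath evaluated both against the whole document and relative to a node you already hold.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical sample data, invented for this illustration.
my $xml = <<'XML';
<users>
  <user id="u1" active="yes"><name>Alice</name></user>
  <user id="u2" active="no"><name>Bob</name></user>
  <user id="u3" active="yes"><name>Carol</name></user>
</users>
XML

# For a file on disk, load_xml( location => $path ) works the same way.
my $doc = XML::LibXML->load_xml( string => $xml );

my %name_for;
for my $user ( $doc->findnodes('/users/user[@active="yes"]') ) {
    my $id = $user->getAttribute('id');

    # XPath evaluated relative to the current <user> node.
    $name_for{$id} = $user->findvalue('./name');
}

# XPath functions work too; findvalue returns the result as a string.
my $total = $doc->findvalue('count(//user)');

print "$_ => $name_for{$_}\n" for sort keys %name_for;
print "total users: $total\n";
```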

I use it for all of my XML needs these days and am an absolutely satisfied customer.

Makeshifts last the longest.

In reply to Re: Using the XML::Parser Module by Aristotle
in thread Using the XML::Parser Module by raina
