
I must say I did not quite like this book. It is not that it is awful, it's just that it rubbed me the wrong way I guess.

I have objective complaints: some of the examples I checked are not as robust as they should be, some (minor) facts are wrong, and, most importantly, some key points are not discussed at all. I also have some more subjective problems with this book, which overall made it a not-so-enjoyable read.

Overall, though, the book can be useful as a source of commented code examples, and it properly explains the various processing models available (tree vs event). I just think that it could really have used some more reviewing and editing.

Missed points

The book does not discuss encodings at all. This is a major problem, as in my experience most of the problems beginners have with XML come from misunderstanding encodings. If your data is US-ASCII or UTF-8 and always will be, then encodings won't be much of a problem. In real life this seems to be rarely the case, so I would expect a book about Perl and XML to give you an idea of why your parser dies mysteriously when fed a French name, and what to do about it.
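A minimal sketch of that failure, using only the core Encode module. The document string and the is_valid_utf8 helper are made up for the illustration; nothing here comes from the book. A document with no encoding declaration has to be read as UTF-8 (or UTF-16), so a name saved as ISO-8859-1 is not well-formed input as far as the parser is concerned:

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical document: "Francois" with a c-cedilla, saved as ISO-8859-1,
# so the cedilla is the single byte 0xE7. There is no XML declaration.
my $latin1_doc = "<name>Fran\xE7ois</name>";

# Illustrative helper: true if the byte string is well-formed UTF-8.
sub is_valid_utf8 {
    my ($bytes) = @_;    # work on a copy, decode() may modify its input
    return eval { decode('UTF-8', $bytes, Encode::FB_CROAK()); 1 } ? 1 : 0;
}

# Fix 1: keep the bytes as they are and tell the parser what they are.
my $declared_doc =
    qq{<?xml version="1.0" encoding="ISO-8859-1"?>\n} . $latin1_doc;

# Fix 2: transcode the data to UTF-8 before it reaches the parser.
my $utf8_doc = encode('UTF-8', decode('ISO-8859-1', $latin1_doc));

printf "as saved:   %s\n",
    is_valid_utf8($latin1_doc) ? 'valid UTF-8' : 'not valid UTF-8';
printf "transcoded: %s\n",
    is_valid_utf8($utf8_doc) ? 'valid UTF-8' : 'not valid UTF-8';
```

Which fix is appropriate depends on whether you control the data: declaring the encoding leaves the file untouched, while transcoding means everything downstream only ever sees UTF-8.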

A whole chapter is dedicated to modules that interface with XSLT processors, but there is no discussion of why you would want to use XSLT as opposed to Perl, or of how to choose when to use one or the other (or both in cooperation). This is the kind of high-level introduction that I would have hoped to find in this book.

Examples

Some of the examples lack robustness.

Why people insist on advocating the DOM as a valid tool for generic XML transformation is beyond my grasp!

The example proposed in the book gets it half right by testing in its main loop whether a child is really an element, but it then blindly assumes that the first child of an element node is the text of the element; a well-placed comment or processing instruction would break that assumption. Ironically davorg's own (excellent) Data Munging with Perl had the same problem ;--). Tony Darugar's excellent article Effective XML processing with DOM and XPath in Perl gives a detailed analysis of the kind of problem you run into when using the DOM on real projects.

I gave up testing the examples after a while but I believe that the XML::LibXML example can also be broken with differently formatted XML or extra comments.
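To make the first-child problem concrete, here is a small sketch of the pattern. It is not the book's code: the input document is invented, and it assumes XML::LibXML is installed and uses its DOM interface. A comment placed in front of the text is enough to make firstChild return the wrong node:

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical input: a comment sits in front of the text of <title>.
my $xml = '<book><title><!-- checked -->Perl and XML</title></book>';
my $doc = XML::LibXML->load_xml(string => $xml);
my ($title) = $doc->documentElement->getChildrenByTagName('title');

# Fragile: assumes the first child of the element is its text.
# Here the first child is the comment node.
my $fragile = $title->firstChild->nodeValue;

# Sturdier: concatenate only the text and CDATA children.
my $robust = join '',
    map  { $_->nodeValue }
    grep { $_->nodeType == XML_TEXT_NODE
        || $_->nodeType == XML_CDATA_SECTION_NODE }
    $title->childNodes;

print "fragile: [$fragile]\n";
print "robust:  [$robust]\n";
```

The second version still ignores text nested in child elements, which may or may not be what you want; the point is only that the node type has to be checked before the value is trusted.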

The whole of chapter 5, which advocates the use of XML::Writer over plain print statements, completely misses the real reason why you should use the module: it escapes XML special characters. Instead it focuses on a really contrived argument against print (and even states that you cannot have a multi-line print, which is false).
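A short sketch of both points, assuming XML::Writer is installed; the data and the element names are invented for the illustration. The print statement spans several lines without any trouble, but it copies the data verbatim, whereas XML::Writer escapes it:

```perl
use strict;
use warnings;
use XML::Writer;

# Hypothetical data containing XML special characters.
my $title = 'Tom & Jerry <live>';

# Plain print copes with several lines just fine, but the data goes out
# unescaped, so the result is not well-formed XML.
my $printed = '';
open my $print_fh, '>', \$printed or die "cannot open in-memory handle: $!";
print {$print_fh} "<show>\n",
                  "  <title>$title</title>\n",
                  "</show>\n";
close $print_fh;

# XML::Writer escapes the character data on the way out.
my $written = '';
open my $writer_fh, '>', \$written or die "cannot open in-memory handle: $!";
my $writer = XML::Writer->new(OUTPUT => $writer_fh);
$writer->startTag('show');
$writer->dataElement('title', $title);
$writer->endTag('show');
$writer->end;
close $writer_fh;

print "print:       $printed";
print "XML::Writer: $written\n";
```

The module also checks that tags are properly nested and closed, but the escaping alone is reason enough to prefer it as soon as the data is not under your control.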

Miscellaneous problems

The book gets a host of details wrong, which is not crippling but gets irritating after a while:

The book gives the impression that Perl is a good choice for processing XML because it is very good at processing text. In fact Perl's strength with XML depends mostly on modules... written in C (or based on C parsers).

XML::DOM was NOT written by TJ Mather, but by Enno Derksen. This is even mentioned in the Perl and XML FAQ.

use is not a pragma: in use strict; the pragma is strict.

XML::XPath is listed in the XSLT chapter, with no mention that it is NOT an XSLT processor.

The annex titled "Perl Essentials" promises to be a Perl 101 but only explains how to install Perl modules (it does a good job of explaining that, BTW).

Style problems

I am not a fan of giving an entire listing of an example and then repeating it in extenso, broken up into commented sections. I'd rather have the complete code available only for download on a web site, where it does not take up page space (the book's web site is not up, BTW).

As davorg mentioned, the use of both DTDs and W3C Schemas is unnecessary, especially as W3C Schemas support in Perl is in its infancy (and did not exist when the book was published, see XML::Schema).

Finally, I found the tone of the book a little too didactic for me: "I have shown" and "I will demonstrate" are repeated over and over again. The book also switches from 'I' to 'we' a couple of times.

In reply to Re: XML and Perl by mirod
in thread XML and Perl by davorg
