in reply to XML and Perl

I must say I did not quite like this book. It is not that it is awful; it just rubbed me the wrong way, I guess.

I have objective complaints: some of the examples I checked are not as robust as they should be, some (minor) facts are wrong, and, most importantly, some key points are not discussed at all. I also have some more subjective problems with this book, which overall made it a not-so-enjoyable read.

Overall, though, the book can be useful as a source of commented code examples, and it properly explains the various processing models available (tree vs. event). I just think that it could really have used more reviewing and editing.

Missed points

The book does not discuss encodings at all. This is a major problem, as in my experience most of the problems beginners have with XML come from misunderstanding encodings. If your data is either US-ASCII or UTF-8 and will stay that way, then encodings won't be too much of a problem. In real life this rarely seems to be the case, so I would expect a book about Perl and XML to give you an idea of why your parser dies mysteriously when fed a French name, and what to do about it.
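To make the "French name" failure concrete, here is a minimal sketch. It uses Python's standard library (built on expat, the same parser that sits underneath XML::Parser) purely as a stand-in. It is not code from the book, and the sample name and variable names are made up for illustration.

```python
import xml.etree.ElementTree as ET

# "François" encoded as ISO-8859-1: the c-cedilla becomes the single byte 0xE7.
latin1_name = "Fran\u00e7ois".encode("iso-8859-1")

# No encoding declaration: the parser has to assume UTF-8, and 0xE7 followed
# by "o" is not a valid UTF-8 sequence, so parsing fails.
undeclared = b"<name>" + latin1_name + b"</name>"
try:
    ET.fromstring(undeclared)
    undeclared_ok = True
except ET.ParseError:
    undeclared_ok = False

# Same bytes, but the document now declares its encoding, so the parser can
# decode them and hand back a proper Unicode string.
declared = b'<?xml version="1.0" encoding="ISO-8859-1"?>' + undeclared
parsed_name = ET.fromstring(declared).text
```

The point is only that the fix usually lives in the document (declare the encoding, or convert the data to UTF-8 before parsing), not in the parsing code.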

A whole chapter is dedicated to modules that interface with XSLT processors, but there is no discussion of why you would want to use XSLT as opposed to Perl, or of how to choose between the two (or use both in cooperation). This is the kind of high-level introduction that I would have hoped to find in this book.

Examples

Some of the examples lack robustness.

Why people insist on advocating the DOM as a valid tool for generic XML transformation is beyond my grasp!

The example proposed in the book gets it half right by testing in its main loop whether a child is really an element, but it then blindly assumes that the first child of an element node is the text of the element. A well-placed comment or processing instruction would break that assumption. Ironically davorg's own (excellent) Data Munging with Perl had the same problem ;--). Tony Darugar's excellent article Effective XML processing with DOM and XPath in Perl gives a detailed analysis of the kind of problem you run into when using the DOM on real projects.
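The first-child assumption is easy to reproduce in any DOM implementation. The sketch below uses Python's xml.dom.minidom as a stand-in for the Perl modules. The sample document and the element_text helper are hypothetical, not taken from the book.

```python
from xml.dom import Node
from xml.dom.minidom import parseString

# A comment placed before the text is perfectly legal XML.
doc = parseString("<book><title><!-- working title -->XML and Perl</title></book>")
title = doc.getElementsByTagName("title")[0]

# Naive assumption: the first child is the text node. Here it is the comment.
first_child_is_text = title.firstChild.nodeType == Node.TEXT_NODE


def element_text(element):
    """Concatenate direct text and CDATA children, skipping comments and PIs."""
    return "".join(
        child.data
        for child in element.childNodes
        if child.nodeType in (Node.TEXT_NODE, Node.CDATA_SECTION_NODE)
    )


text = element_text(title)
```

Note that this helper only looks at direct children. Text inside nested elements would need a recursive walk, which is exactly the kind of bookkeeping that makes raw DOM code tedious.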

I gave up testing the examples after a while but I believe that the XML::LibXML example can also be broken with differently formatted XML or extra comments.

Chapter 5, which advocates using XML::Writer over plain print statements, completely misses the real reason to use the module: it escapes XML special characters. Instead it focuses on a really contrived argument against print (and even states that you cannot have a multi-line print, which is false).
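The escaping argument can be shown in a few lines. This is a rough sketch using Python's standard library rather than XML::Writer itself. The escape function plays the role of the escaping the writer module does for you, and the sample string is invented.

```python
from xml.sax.saxutils import escape
import xml.etree.ElementTree as ET

company = "Barnes & Noble <online>"

# Plain string concatenation (the "print" approach): the raw "&" and "<"
# end up in the output, which is no longer well-formed XML.
naive_xml = "<publisher>" + company + "</publisher>"
try:
    ET.fromstring(naive_xml)
    naive_is_well_formed = True
except ET.ParseError:
    naive_is_well_formed = False

# Escaping the text before writing it out keeps the document well-formed,
# and the original string comes back unchanged when the result is parsed.
safe_xml = "<publisher>" + escape(company) + "</publisher>"
round_tripped = ET.fromstring(safe_xml).text
```

Nothing stops you from escaping by hand before every print, but a writer module makes it impossible to forget.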

Miscellaneous problems

The book gets a host of details wrong, which is not crippling but gets irritating after a while:

The book gives the impression that Perl is a good choice for processing XML because it is very good at processing text. In fact Perl's strength with XML depends mostly on modules... written in C (or based on C parsers).

XML::DOM was NOT written by TJ Mather, but by Enno Derksen. This is even mentioned in the Perl and XML FAQ.

use is not a pragma; strict (as in use strict;) is.

XML::XPath is listed in the XSLT chapter, with no mention that it is NOT an XSLT processor.

The appendix titled "Perl Essentials" promises to be a Perl 101 but only explains how to install Perl modules (it does a good job of explaining that, BTW).

Style problems

I am not a fan of giving an entire listing of an example and then repeating the example in extenso, broken up into commented sections. I'd rather have the complete code available only for download from a web site, where it does not take up page space (the book's web site is not up, BTW). As davorg mentioned, the use of both DTDs and W3C Schemas is unnecessary, especially as W3C Schemas support in Perl is in its infancy (and did not exist when the book was published, see XML::Schema). Finally, I found the tone of the book a little too didactic for me: "I have shown" and "I will demonstrate" are repeated over and over again. The book also goes from using 'I' to using 'we' a couple of times.

Replies are listed 'Best First'.
Re: Re: XML and Perl
by Matts (Deacon) on Jan 28, 2003 at 11:31 UTC
    Michel,

    Respectfully, we all know you have a bee in your bonnet about the DOM ;-). But regardless of the problems that processing XML with the DOM brings up, I'm not entirely sure that covering those problems in the presented code would be the best way to do it -- I can't stand books that present massively long examples -- I'd much rather be given something simple I can build on. Perhaps a box-out would be better though. I haven't read this book yet, but I will be sure to suggest that to Ilya as a change for the second edition.

    As far as encodings go, I seem to recall reading that the book covers this by simply stating that all XML parsers return their data in UTF-8, regardless of the input encoding, but then I've only flicked through it on the bookshelves so I can't be sure. I get the feeling that, because XML::Twig deals with the encodings issue by messing with the original_string (which is possibly scarier than the alternative of just leaving things in UTF-8), you think it's terribly important that this be covered in detail, when I think that in the majority of situations people need to come out of their encoding-specific shells and get used to the world of Unicode. I'd treat someone coding in perl4 style the same way.

    As for discussing why you would want to use XSLT, I'd rather keep this out of a technical book. This is a wishy-washy issue, and I'd rather just get into the code, thanks. I guess mileage varies on this - personally I prefer nutshell-style books that just get down and dirty without any discussion of the whys.

    Overall I think your response is rather damning of what is a much better book than "Perl and XML" from O'Reilly, and given the choice of the two I'd pick this book any day.

    All, respectfully, IMHO ;-)

      Overall I think your response is rather damning of what is a much better book than "Perl and XML" from O'Reilly, and given the choice of the two I'd pick this book any day.

      I agree. I wasn't impressed with O'Reilly's Perl and XML either. I thought its examples were far too simple, and it spent too much time on trivial issues while giving the important points far too brief a discussion.

      XML and Perl is definitely an improvement, whatever opinions people may have on the DOM :)