
Hear, hear.

The few times I have had to deal with XML, I have found that people tend to pay lip service to it and emit badly formed XML far more often than they get it right, with lone & characters in text being the worst offense. To use XML parsing tools, you first have to run a cleanup script over the received data so that the tools don't curl up and die.
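As a rough idea of what such a cleanup pass can look like, here is a minimal sketch that escapes lone & characters. The sub name escape_lone_ampersands is made up for illustration, and the pattern only approximates what an entity reference looks like: it ignores CDATA sections (where a bare & is legal), and it leaves undefined named entities alone.

```perl
use strict;
use warnings;

# Hypothetical helper: escape any '&' that does not already start
# something shaped like an entity reference (&amp; &#38; &#x26;).
sub escape_lone_ampersands {
    my ($text) = @_;
    $text =~ s/&(?!(?:[A-Za-z_][\w.-]*|#[0-9]+|#x[0-9A-Fa-f]+);)/&amp;/g;
    return $text;
}

print escape_lone_ampersands("<name>Smith & Sons &amp; Co</name>\n");
```

The negative lookahead leaves the existing &amp; untouched and rewrites only the bare one.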

Furthermore, the XML in question is usually emitted by an old program that used to produce plain old data and has since been modified to produce XML. By extension, the XML you get to deal with has a rigid structure, not at all free-form as the spec might make you think.

I would hazard a bet that the majority of XML in use exists to get one system to speak to another. I would guess that the number of cases where one system has to deal with incoming XML from multiple sources is quite small in comparison.

If you are in the position of getting data from one system to another, you usually have control over how and when the format is changed. When you have that much control over the environment, simple methods suffice.

For instance, to paraphrase some old code I have, you can get a lot of mileage out of Perl's wonderful ... operator (not to be confused with ..).

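#!/usr/bin/perl

use strict;
use warnings;

# Keep every line from an opening <emp> through its closing </emp>;
# the flip-flop stays true between the two matches.
my @stuff = grep { /<emp>/ ... /<\/emp>/ } <DATA>;

print @stuff;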

__DATA__
<profile>
<emp>
<name>Mahesh</name>
<age>24</age>
<address>New york</address>
<desig>Developer</desig>
</emp>
</profile>
<junk>
<morejunk />
</junk>
<profile>
<emp>
<name>Mahesh2</name>
<age>242</age>
<address>New york2</address>
<desig>Developer2</desig>
</emp>
</profile>
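For anyone wondering how ... differs from .., here is a small sketch on made-up input. With one tag per line, as in the data above, the two behave the same. They differ only when the start and end patterns can match on the same line: .. tests the right-hand pattern straight away, while ... waits until the next line.

```perl
use strict;
use warnings;

# Made-up input: the first record has both tags on a single line.
my @lines = (
    "<emp>one</emp>\n",
    "junk\n",
    "<emp>\n",
    "two\n",
    "</emp>\n",
    "more junk\n",
);

# Two dots: the end pattern is tested on the line that opened the range.
my @two_dot   = grep { /<emp>/ ..  /<\/emp>/ } @lines;

# Three dots: the end pattern is first tested on the following line.
my @three_dot = grep { /<emp>/ ... /<\/emp>/ } @lines;

print "two dots kept ",   scalar @two_dot,   " lines\n";
print "three dots kept ", scalar @three_dot, " lines\n";
```

Here the two-dot form keeps four lines, and the three-dot form keeps five because the range opened by the one-line record stays open across the "junk" line. Which one you want depends on how the sending program lays out its output.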

You might ask what happens when a new element is added. Well, surprise! You will be obliged to modify the script that parses the XML too, if you want to do anything with the new element.
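To make that concrete, here is a sketch of turning the captured lines into records. It assumes every field sits alone on its own line as <tag>value</tag>, as in the sample data above, and the sub name parse_emps is invented for the example. A new element would land in the hash without any change, but whatever code consumes the hash still has to be taught about it.

```perl
use strict;
use warnings;

# Hypothetical helper: build one hash per <emp> block, assuming one
# simple <tag>value</tag> element per line.
sub parse_emps {
    my @lines = @_;
    my (@records, $current);
    for my $line (@lines) {
        if ($line =~ /<emp>/) {
            $current = {};                      # start a new record
        }
        elsif ($line =~ m{</emp>}) {
            push @records, $current if $current;
            undef $current;                     # record is complete
        }
        elsif ($current and $line =~ m{<(\w+)>([^<]*)</\1>}) {
            $current->{$1} = $2;                # tag name => text
        }
    }
    return @records;
}

my @records = parse_emps(
    "<emp>\n", "<name>Mahesh</name>\n", "<age>24</age>\n", "</emp>\n",
);
print "$_->{name} is $_->{age}\n" for @records;
```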

Don't get me wrong, I am a big fan of XML, but I think it suffers from too much hype. People seem to be happy to use it even when simpler methods exist.

print@_{sort keys %_},$/if%_=split//,'= & *a?b:e\f/h^h!j+n,o@o;r$s-t%t#u'

In reply to Re:x3 Unable to get more than one line (all praise the ... operator) by grinder
in thread Unable to get more than one line by winefm
