No xml module please

DS has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
Re: No xml module please by Ovid (Cardinal) on Jul 18, 2002 at 15:54 UTC
No offense, but whenever I see someone say "I don't want to use a module for this", it suggests to me that they know a module is available but they just don't want to bother to learn it. I could go on about the issues with that, but most of us have heard them sooooo many times. Either you're willing to do things well or you want to scrape by. I don't mind someone "scraping by" (so long as they're not a co-worker), but to come out and tell someone that you are not willing to consider valid responses which are typically more likely to be correct... Virtually any trivial regex solution that you will be given is going to choke on embedded newlines (if any), or if the "desc" (doubtless human-entered) ever contains angle brackets: `<desc>Use only with "<code>" tags</desc>` Further, a good XML-module-based solution will more likely handle changes to the XML in the future. In other words, if someone reorders those tags or adds more tags in the future, a regex solution will break (regexes match text) and an XML solution will be more likely to work (because it will parse the text). This may be a rare case, admittedly, but if you learn how to do it correctly now, that's another tool under your belt (and another potential line on your resume). By spending an extra five minutes now, you know your code is more likely to work in the future, and you're lazy enough that you don't want to go back and fix it. Remember, laziness is a Perl virtue. Cheers, Ovid
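To make the angle-bracket point concrete, here is a minimal sketch using the description from the example above. The surrounding sample string and the variable names are assumptions for illustration, not the original poster's data.

```perl
use strict;
use warnings;

# Hypothetical input: a desc whose text contains a raw angle bracket.
my $string = '<message><line>68</line><desc>Use only with "<code>" tags</desc></message>';

# The character class [^<] stops at the first "<" inside the description,
# so only part of the text is captured.
my ($desc) = $string =~ /<desc>([^<]+)/;

print "captured: $desc\n";
```

The capture ends just before `<code>`, so the rest of the description is silently lost and no error is raised.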
Re: No xml module please by kvale (Monsignor) on Jul 18, 2002 at 15:48 UTC
Parsing XML is best done with the aid of one of the XML modules, such as XML::Simple. But if you are allergic to modules and your string always has the fixed format shown above, you can use quick and dirty shortcuts that will bite you when the format changes :) A hack such as this will do the trick: `my $string = '<message><file>D:\linkctltr.cxx</file><line>68</line><type>Note</type><codee>970</codee><desc>Use outside of a typedef</desc></message>'; $string =~ /<file>([^<]+)/; my $file = $1; $string =~ /<line>([^<]+)/; my $line = $1; $string =~ /<type>([^<]+)/; my $type = $1; $string =~ /<codee>([^<]+)/; my $codee = $1; $string =~ /<desc>([^<]+)/; my $desc = $1;` (The string is single-quoted so that the `\l` in the path is not treated as an escape.) -Mark
Re: Re: No xml module please by Ovid (Cardinal) on Jul 18, 2002 at 16:04 UTC
kvale: if you must introduce this person to the evil of regexes ;), at least watch out for a subtle bug that you've introduced. What happens if your first match succeeds and the rest fail? A failed match leaves `$1` untouched, so the subsequent variables will be set to the first one's value, thus hampering error checking. I would rewrite that as follows: `my ($file) = $string =~ /<file>([^<]+)/; my ($line) = $string =~ /<line>([^<]+)/; my ($type) = $string =~ /<type>([^<]+)/; my ($codee) = $string =~ /<codee>([^<]+)/; my ($desc) = $string =~ /<desc>([^<]+)/;` That should be a bit safer. Or better yet, go with a hash: `my %info; # a better name should be picked foreach ( qw/ file line type codee desc / ) { ($info{$_}) = $string =~ /<\Q$_\E>([^<]+)/; }` Cheers, Ovid
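The stale-capture problem described above can be reproduced with a short sketch. The input string here is a made-up example that deliberately lacks a `<line>` element, and the variable names are illustrative only.

```perl
use strict;
use warnings;

# Hypothetical input that has a <file> element but no <line> element.
my $string = '<message><file>a.cxx</file></message>';

# Pattern 1: copy $1 after each match without checking for success.
$string =~ /<file>([^<]+)/;
my $file = $1;
$string =~ /<line>([^<]+)/;   # fails, so $1 is left untouched
my $line = $1;                # silently picks up the file name

# Pattern 2: assign the match result in list context.
my ($safe_line) = $string =~ /<line>([^<]+)/;   # undef when the match fails

print "stale: $line\n";
print "safe:  ", ( defined $safe_line ? $safe_line : '(undef)' ), "\n";
```

With the list-context form, a missing element shows up as `undef`, which a caller can test for with `defined`.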
Re: Re: Re: No xml module please by kvale (Monsignor) on Jul 18, 2002 at 19:25 UTC
Good point. My code will work for the XML as given, but there is no good reason to make brittle code yet more brittle. Thanks for the improvements. -Mark
Re: No xml module please by aersoy (Scribe) on Jul 18, 2002 at 15:56 UTC
Hello, Why don't you just remove them with simple code like this: `s{^<message>|</message>$}{}g;` -- Alper Ersoy
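A minimal sketch of that substitution, assuming the goal is to strip only the outer `<message>` wrapper. The sample string and variable names are hypothetical. Note that the sketch uses an unescaped `|` for alternation, since `\|` in a Perl regex would match a literal pipe character.

```perl
use strict;
use warnings;

# Hypothetical input wrapped in a single <message> element.
my $string = '<message><line>68</line><type>Note</type></message>';

# Copy the string, then strip the opening tag at the start
# and the closing tag at the end in one pass.
( my $inner = $string ) =~ s{^<message>|</message>$}{}g;

print "$inner\n";
```

This leaves the inner elements in place, so it only helps when the wrapper itself is the problem.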