
fellow monks,

I've recently needed to work with XML. This is my first trek into it, so please bear with me. After reviewing some of the previous posts on this site, I decided to use the XML::Parser module. I started by simply trying to obtain and print the text of the "say" element.

The xml:

<?xml version="1.0" encoding="ISO-8859-1"?>
<monk value="PM">
 <say>JAPH</say>
  <vals>
   <val val1="F" val2="value f">
   </val>
   <val val1="FO" val2="value fo">
   </val>
   <val val1="FOO" val2="value foo">
   </val>
   <val val1="FOOB" val2="value foob">
   </val>
   <val val1="FOOBA" val2="value fooba">
   </val>
   <val val1="FOOBAR" val2="value foobar">
   </val>
  </vals>
</monk>

The code:

#!/usr/bin/perl -w

use strict;
use XML::Parser;

my $xp;
my $japh;

$xp = new XML::Parser(
    Handlers => {
        Start => \&start_handler,
        End   => \&end_handler,
        Char  => \&char_handler
    }
);

if ( $#ARGV < 0 ) {
    print "usage: blah <xml file>\n";
    exit;
}

$xp->parsefile( $ARGV[0] );

sub start_handler {
    my ( $xp, $elem ) = @_;
    if ( $elem eq 'say' ) {
        $japh = 1;
    }
}

sub end_handler {
    my ( $xp, $elem ) = @_;
    if ( $elem eq 'say' ) {
        $japh = 0;
    }
}

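sub char_handler {
    my ( $xp, $str ) = @_;
    if ($japh) {
        # Print the text directly; assigning it to $japh would clobber
        # the "inside <say>" flag (e.g. if the text were "0").
        print $str . "\n";
    }
}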

The overall goal, however, is to print the value of "say" only if some val1 attribute is equal to "FOO". After a few days of unsuccessful attempts at this, I fell back on regexes just to have a working solution. The following code snippet gives me exactly what I need:

if ( $string[0] =~ /xml/ ) {
    foreach $string (@string) {

        if ( $string =~ m/<say>/ ) {
            $say = $string;
        }

        if ( $string =~ m/(<val val1="$val1")/ ) {
            $say =~ s/\s<say>//;
            $say =~ s/<\/say>//;
            print $say . "\n";
            $found = 1;
        }

        if ( $string =~ m/<\/monk>/ ) {
            $say = "";
        }
    }
}

My questions are as follows: Since the regex approach gives me exactly what I need (in this particular case), should I be concerned about not using an XML parser? Is XML::Parser the right module for a case like this? Just to be proper, I would like to use an XML parser when working with data such as this. Any suggestions that point me in the right direction will be greatly appreciated.
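
For reference, here is a minimal sketch of what the stated goal might look like with XML::Parser handlers. It rests on a few assumptions: the function name say_if_val1 is made up for illustration, the XML is passed in as a string rather than read from a file, and the text of "say" is only reported once the whole document has been parsed, since the val elements come after it. The Start handler receives the element's attributes as name/value pairs after the element name, which is how val1 is inspected here.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Parser;

# Hypothetical helper (not from the original code): returns the text of
# <say> if any <val> element has val1 equal to $wanted, otherwise undef.
sub say_if_val1 {
    my ( $xml, $wanted ) = @_;

    my $in_say = 0;     # true while the parser is inside <say>...</say>
    my $say    = '';    # accumulated character data of <say>
    my $found  = 0;     # set once a matching val1 attribute is seen

    my $parser = XML::Parser->new(
        Handlers => {
            Start => sub {
                # Attributes arrive as a flat name/value list.
                my ( $expat, $elem, %attr ) = @_;
                $in_say = 1 if $elem eq 'say';
                $found  = 1
                    if $elem eq 'val'
                    && defined $attr{val1}
                    && $attr{val1} eq $wanted;
            },
            End => sub {
                my ( $expat, $elem ) = @_;
                $in_say = 0 if $elem eq 'say';
            },
            Char => sub {
                # Char may be called more than once per element, so append.
                my ( $expat, $str ) = @_;
                $say .= $str if $in_say;
            },
        },
    );

    $parser->parse($xml);
    return $found ? $say : undef;
}

# Shortened copy of the sample document, for illustration only.
my $xml = <<'XML';
<?xml version="1.0" encoding="ISO-8859-1"?>
<monk value="PM">
 <say>JAPH</say>
 <vals>
  <val val1="FO" val2="value fo"></val>
  <val val1="FOO" val2="value foo"></val>
 </vals>
</monk>
XML

my $say = say_if_val1( $xml, 'FOO' );
print "$say\n" if defined $say;
```

Unlike the regex version, this does not depend on how the XML is split across lines or on attribute order and spacing inside the val tag, which is the usual argument for going through a parser.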

cheers, -semio

Edit by tye to add READMORE tag

In reply to XML::parser question by semio
