fellow monks,

I've recently had a need to work with XML. This is my first trek into this, so please bear with me. After reviewing some of the previous posts on this site, I decided to use the XML::Parser module. I started by simply trying to obtain and print the value of the "say" tag.

The XML:

<?xml version="1.0" encoding="ISO-8859-1"?>
<monk value="PM">
  <say>JAPH</say>
  <vals>
    <val val1="F" val2="value f"> </val>
    <val val1="FO" val2="value fo"> </val>
    <val val1="FOO" val2="value foo"> </val>
    <val val1="FOOB" val2="value foob"> </val>
    <val val1="FOOBA" val2="value fooba"> </val>
    <val val1="FOOBAR" val2="value foobar"> </val>
  </vals>
</monk>

The code:

#!perlenv -w
use strict;
use XML::Parser;

my $xp;
my $japh;

$xp = new XML::Parser(
    Handlers => {
        Start => \&start_handler,
        End   => \&end_handler,
        Char  => \&char_handler
    }
);

if ( $#ARGV < 0 ) {
    print "usage: blah <xml file>";
    exit;
}

$xp->parsefile( $ARGV[0] );

sub start_handler {
    my ( $xp, $elem ) = @_;
    if ( $elem eq 'say' ) {
        $japh = 1;
    }
}

sub end_handler {
    my ( $xp, $elem ) = @_;
    if ( $elem eq 'say' ) {
        $japh = 0;
    }
}

sub char_handler {
    my ( $xp, $str ) = @_;
    if ($japh) {
        $japh = $str;
        print $japh . "\n";
    }
}

The overall goal, however, is to print the value of "say" only if a val1 attribute equal to "FOO" can be found. After a few days of unsuccessful attempts at this, I fell back on a regex just to have a working solution. The following code snippet gives me exactly what I need:

if ( $string[0] =~ /xml/ ) {
    foreach $string (@string) {
        if ( $string =~ m/<say>/ ) {
            $say = $string;
        }
        if ( $string =~ m/(<val val1="$val1")/ ) {
            $say =~ s/\s<say>//;
            $say =~ s/<\/say>//;
            print $say . "\n";
            $found = 1;
        }
        if ( $string =~ m/<\/monk>/ ) {
            $say = "";
        }
    }
}

My questions are as follows: Given that the regex gives me exactly what I need (in this particular case), should I be concerned about not using an XML parser? Is XML::Parser the right module to use in a case such as this? Just to be proper, I would like to use an XML parser when working with data such as this. Any suggestions that point me in the right direction will be greatly appreciated.

cheers, -semio

Edit by tye to add READMORE tag
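
For what it's worth, the goal described above (print the text of "say" only when some val has val1 equal to "FOO") can be sketched with the same three XML::Parser handlers. This is only a minimal sketch, not a definitive answer: the helper name say_for_val1 and its variables are hypothetical, the sample document is a shortened copy of the one in the post, and it assumes each monk element holds one say and its own vals. It relies on the Start handler receiving the element's attributes as name/value pairs after the element name, and it accumulates text in the Char handler because that handler may be called more than once per element.

```perl
use strict;
use warnings;
use XML::Parser;

# Hypothetical helper: return the text of <say> for every <monk>
# that contains a <val> whose val1 attribute equals $want.
sub say_for_val1 {
    my ( $xml, $want ) = @_;
    my ( $say, $in_say, $found ) = ( '', 0, 0 );
    my @results;

    my $xp = XML::Parser->new(
        Handlers => {
            Start => sub {
                # Attributes arrive as a flat name/value list.
                my ( $p, $elem, %attr ) = @_;
                if    ( $elem eq 'monk' ) { ( $say, $found ) = ( '', 0 ) }
                elsif ( $elem eq 'say' )  { $in_say = 1 }
                elsif ( $elem eq 'val' ) {
                    $found = 1
                        if defined $attr{val1} && $attr{val1} eq $want;
                }
            },
            Char => sub {
                # Append rather than assign: text can arrive in pieces.
                my ( $p, $str ) = @_;
                $say .= $str if $in_say;
            },
            End => sub {
                my ( $p, $elem ) = @_;
                if    ( $elem eq 'say' )  { $in_say = 0 }
                elsif ( $elem eq 'monk' ) { push @results, $say if $found }
            },
        },
    );

    $xp->parse($xml);    # use parsefile($path) instead to read a file
    return @results;
}

# Shortened copy of the sample document from the post.
my $xml = <<'XML';
<monk value="PM">
  <say>JAPH</say>
  <vals>
    <val val1="FO" val2="value fo"> </val>
    <val val1="FOO" val2="value foo"> </val>
  </vals>
</monk>
XML

print "$_\n" for say_for_val1( $xml, 'FOO' );
```

The decision to print is deferred until the closing monk tag because say appears before the val elements in the document, so the match is not known when the say text is read.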


In reply to XML::parser question by semio
