Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I have a question regarding the XML::Parser module. In my code I have registered a handler for start tags. I understand that the parameters to start_handler are the expat object, the name of the element, and a hash containing the attributes of this element. How do I get the value (the text) associated with this element? Please see my QUES in the example below.

e.g.

<Orders>
  <Order ID="0008" Date="11/14/1999">
    <Item>A Book</Item>
  </Order>
</Orders>
--
$parser->setHandlers(Start => \&start_handler );

sub start_handler()
{
    my ($p, $elmt ) = @_;
    if ( $elmt eq 'Item' )
    {
        # QUES - I want to get 'A Book' here.
        # I want to call a new procedure with param(elmt, A book )
    }
}

QUES 2: Is there any other way to register for a particular element (e.g. Item) so that I get a callback only when 'Item' is encountered?

So, in other words, can I define $parser->setHandlers(Entity => \&entity_handler) and get both the entity and its value in the entity_handler procedure? This does not seem to work.

thanks
jaya

Edit by dws to add <code> tags

Replies are listed 'Best First'.
Re: XML::Parser start handler
by grantm (Parson) on Sep 27, 2002 at 04:31 UTC

    If you're writing new code, then you should probably be using XML::SAX rather than XML::Parser. (If you're maintaining code that already works with XML::Parser then carry on).

    If you write to the SAX API, then you can use whichever parser happens to be installed (e.g. XML::LibXML for blinding speed, XML::SAX::Expat for old times' sake, or XML::SAX::PurePerl for maximum portability). You can also use or write SAX filters to make your code more modular.

    Kip Hampton has written a number of tutorials relating to SAX here, here and here (all at XML.com).
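
    As a rough illustration of the SAX approach, the sketch below collects the text of each Item element. It assumes XML::SAX is installed with at least one parser available; ItemHandler and the hash keys in_item, text and items are hypothetical names chosen for this example, while XML::SAX::Base, XML::SAX::ParserFactory->parser and parse_string are the module's own.

```perl
use strict;
use warnings;
use XML::SAX::ParserFactory;

{
    package ItemHandler;    # hypothetical handler class for this sketch
    use base 'XML::SAX::Base';

    # Raise a flag and reset the buffer when an <Item> starts.
    sub start_element {
        my ($self, $el) = @_;
        if ($el->{Name} eq 'Item') {
            $self->{in_item} = 1;
            $self->{text}    = '';
        }
    }

    # Text may arrive in several chunks, so append rather than assign.
    sub characters {
        my ($self, $chars) = @_;
        $self->{text} .= $chars->{Data} if $self->{in_item};
    }

    # Lower the flag and store the collected text when </Item> is reached.
    sub end_element {
        my ($self, $el) = @_;
        if ($el->{Name} eq 'Item') {
            $self->{in_item} = 0;
            push @{ $self->{items} }, $self->{text};
        }
    }
}

my $xml = <<'XML';
<Orders>
  <Order ID="0008" Date="11/14/1999"><Item>A Book</Item></Order>
</Orders>
XML

my $handler = ItemHandler->new;
my $parser  = XML::SAX::ParserFactory->parser(Handler => $handler);
$parser->parse_string($xml);

print "item: $_\n" for @{ $handler->{items} };
```

    Because the handler only talks to the SAX API, the same class should work unchanged with whichever SAX parser the factory picks.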

Re: XML::Parser start handler
by mirod (Canon) on Sep 27, 2002 at 02:44 UTC

    You should have a look at the various nodes about XML::Parser on the site (use the site search), starting with XML::Parser Tutorial and the review.

    Here is a somewhat clean (except that it uses global variables) way to use XML::Parser for your purpose:

    #!/usr/bin/perl -w
    use strict;
    use XML::Parser;

    my( $in_item, $item);

    my $parser= XML::Parser->new();
    $parser->setHandlers( Start => \&start_handler,
                          Char  => \&char_handler,
                          End   => \&end_handler,
                        );
    $parser->parse( \*DATA);

    sub start_handler     # raises a flag when getting to Item
    {
        my ($p, $elmt ) = @_;
        if ( $elmt eq 'Item' )
        {
            $in_item=1;
            $item='';     # reset the item text
        }
    }

    sub char_handler      # stores the item text in $item
    {
        my( $p, $text)= @_;
        if( $in_item) { $item .= $text; }
    }

    sub end_handler       # processes the item and lowers the flag
    {
        my ($p, $elmt ) = @_;
        if ( $elmt eq 'Item' )
        {
            $in_item=0;   # Toto, we are not in Item anymore
            # this is where you call the Item processing procedure
            print "item: $item\n";
        }
    }

    __DATA__
    <Orders>
      <Order ID="0008" Date="11/14/1999"><Item>A Book</Item></Order>
    </Orders>
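
    If the global variables are a concern, one possible variation keeps the state in a closure instead. This is only a sketch: make_item_parser and its callback signature are hypothetical names invented for the example, not part of XML::Parser. The Handlers option to new() and passing a string to parse() are the module's own.

```perl
use strict;
use warnings;
use XML::Parser;

# Hypothetical helper: returns a parser that calls $callback->($name, $text)
# each time an element named $wanted ends.
sub make_item_parser {
    my ($wanted, $callback) = @_;
    my ($in_item, $text) = (0, '');    # state private to this parser

    return XML::Parser->new(
        Handlers => {
            # raise the flag and reset the buffer on the start tag
            Start => sub {
                my ($p, $elmt) = @_;
                if ($elmt eq $wanted) { $in_item = 1; $text = ''; }
            },
            # collect character data only while inside the element
            Char => sub {
                my ($p, $str) = @_;
                $text .= $str if $in_item;
            },
            # lower the flag and hand the text to the caller on the end tag
            End => sub {
                my ($p, $elmt) = @_;
                if ($elmt eq $wanted) {
                    $in_item = 0;
                    $callback->($elmt, $text);
                }
            },
        },
    );
}

my $xml = <<'XML';
<Orders>
  <Order ID="0008" Date="11/14/1999"><Item>A Book</Item></Order>
</Orders>
XML

my @items;
my $parser = make_item_parser('Item', sub {
    my ($name, $text) = @_;
    push @items, $text;
    print "$name: $text\n";
});
$parser->parse($xml);
```

    The callback receives the element name and its text, which is the "new procedure with param(elmt, A book)" shape asked for in the question.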

    Now as for your second question, XML::Parser's Subs style would help, or you could just use XML::Twig:

    #!/usr/bin/perl -w
    use strict;
    use XML::Twig;

    my $twig= XML::Twig->new( twig_handlers =>
                                { Item => sub { print $_->text, "\n"; } }
                            );
    $twig->parse( \*DATA);

    __DATA__
    <Orders>
      <Order ID="0008" Date="11/14/1999"><Item>A Book</Item></Order>
    </Orders>
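
    For comparison, the Subs style mentioned above might look roughly like the sketch below. With Style => 'Subs', XML::Parser calls a sub named after each element on its start tag and a sub with an underscore appended on its end tag, looking both up in the package given by Pkg. The package name ItemSubs and the variables @items and $text are assumptions made for this example. Character data still needs an explicit Char handler, since the Subs style only covers start and end tags.

```perl
use strict;
use warnings;
use XML::Parser;

my @items;        # text of every Item seen so far
my $text = '';    # character data collected since the last <Item>

{
    package ItemSubs;    # hypothetical package holding the per-element subs

    sub Item  { $text = '' }            # called on <Item>: reset the buffer
    sub Item_ { push @items, $text }    # called on </Item>: store the text
}

my $xml = <<'XML';
<Orders>
  <Order ID="0008" Date="11/14/1999"><Item>A Book</Item></Order>
</Orders>
XML

my $parser = XML::Parser->new(
    Style    => 'Subs',
    Pkg      => 'ItemSubs',
    # Explicit handlers are used alongside the style's Start/End handlers.
    Handlers => { Char => sub { my ($p, $str) = @_; $text .= $str } },
);
$parser->parse($xml);

print "item: $_\n" for @items;
```

    Elements without a matching sub (Orders and Order here) are simply skipped, so only Item triggers a callback. No in-element flag is needed in this sketch because the buffer is reset at every Item start tag.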

    BTW, I was confused by the use of Entity in your question. Entity already has a meaning in XML: &lt; is an entity.

Re: XML::Parser start handler
by blm (Hermit) on Sep 27, 2002 at 01:27 UTC

    The way to get the data is to define another handler for Char. This is the line I would use:

    $parser->setHandlers(Start => \&start_handler, Char => \&handle_char);
    Somewhere else:
    sub handle_char { my ($expat, $string) = @_; ... }
    I implement XML parsing as a state machine where a flag gets raised when I go through the start handler for a particular element and gets lowered when I go through the end handler for that element. I then know that when I am in the data handler and the flag is raised, I am seeing the data for that element.

    I don't know that there is any other way to register for a particular element. I believe that an entity is a thing like &amp; or &lt; or &nbsp;

    --blm-- If you don't like this post can you please /msg me