in reply to XML Parsing

I would really recommend a module for this sort of thing. For example using XML::Simple:

use strict;
use warnings;
use XML::Simple;
use Data::Dumper;

# Slurp the whole __DATA__ section into one string
my $string = do { local $/; <DATA> };
my $ref = XMLin($string);
print Dumper $ref;

my $num_events = @{ $ref->{EVENT} };
print "There are $num_events events listed\n";

__DATA__
<ROOT>
<EVENT>
<NAME>test2</NAME>
<LOCATION>iwu</LOCATION>
<TIME>now</TIME>
<DATE>today</DATE>
<PRIORITY>interest</PRIORITY>
<ATTENDEES>a lot</ATTENDEES>
<DESCRIPTION> descrip</DESCRIPTION>
</EVENT>
<EVENT>
<NAME>test3</NAME>
<LOCATION>hi</LOCATION>
<TIME>joe</TIME>
<DATE>how</DATE>
<PRIORITY>interest</PRIORITY>
<ATTENDEES>are</ATTENDEES>
<DESCRIPTION> </DESCRIPTION>
</EVENT>
</ROOT>

As for the code you posted, you need to add the /s modifier, or the . will not match newline characters (so a match cannot cross over lines). If it were me, even for a quick hack, I would probably still use XML::Simple, but to change your regex so that it captures multiple matches you might try something like the following:

my @events;
while ($page_body =~ /<EVENT>(.*?)<\/EVENT>/sg) {
    push @events, $1;    # note: $1 might have zero length
}

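To see what /s changes, here is a minimal sketch. The sample string is made up for illustration, and $page_body just stands in for whatever your page text is; it only shows how many events each version of the regex finds.

```perl
use strict;
use warnings;

# Hypothetical sample input standing in for the real page text.
my $page_body = "<EVENT>\n<NAME>test2</NAME>\n</EVENT>\n<EVENT></EVENT>\n";

# Without /s, . stops at newlines, so the event that spans lines is missed.
my @without_s = $page_body =~ /<EVENT>(.*?)<\/EVENT>/g;

# With /s, . also matches newlines, so both events are captured.
my @with_s = $page_body =~ /<EVENT>(.*?)<\/EVENT>/sg;

printf "without /s: %d, with /s: %d\n", scalar @without_s, scalar @with_s;
```

In list context a /g match returns every $1 at once, which is a compact alternative to the while loop when you only need the captured text.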
Also note that it is pointless to have .*? at the very start of a regular expression: it causes a lot of needless backtracking and contributes nothing useful to the match, since a regex already looks for the pattern anywhere in the string (unless it is anchored).
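
A small sketch of that point, again on a made-up string: both patterns succeed and capture the same text, and the only difference is that the leading .*? drags the unwanted prefix into the overall match. It does not measure the backtracking cost, it only compares the results.

```perl
use strict;
use warnings;

my $text = "junk before <EVENT>a</EVENT>";    # hypothetical sample string

# Unanchored pattern: the engine already scans forward to where <EVENT> starts.
my ($plain_capture) = $text =~ /<EVENT>(.*?)<\/EVENT>/;
my $plain_start = $-[0];    # offset where the overall match began

# Same pattern with a leading .*?: same capture, but the match now starts at 0.
my ($lazy_capture) = $text =~ /.*?<EVENT>(.*?)<\/EVENT>/;
my $lazy_start = $-[0];

print "plain: captured '$plain_capture', match starts at $plain_start\n";
print "lazy:  captured '$lazy_capture', match starts at $lazy_start\n";
```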

-enlil

2Re: XML Parsing
by jeffa (Bishop) on Apr 24, 2004 at 14:12 UTC

    Just a little nitpick. Replace this:

    my $string = do { local $/; <DATA> };
    my $ref = XMLin($string);

    With this:

    my $ref = XMLin(\*DATA);

    jeffa

    L-LL-L--L-LL-L--L-LL-L--
    -R--R-RR-R--R-RR-R--R-RR
    B--B--B--B--B--B--B--B--
    H---H---H---H---H---H---
    (the triplet paradiddle with high-hat)
    
Re: Re: XML Parsing
by JoeJaz (Monk) on Apr 24, 2004 at 10:55 UTC
    Thanks a lot for your advice. I will study the XML::Simple module and see what it has to offer. The note about the .*? is good to know. I surely don't want my code needlessly using CPU cycles. The above code snippet is precisely what I was trying to do. Thanks again. Joe