Isanchez has asked for the wisdom of the Perl Monks concerning the following question:
Hi,
I have a problem with the following script, which needs to grab every airplane name that sits inside tags whose closing tag ends with a number reference. I know the number reference should have been in the opening tag, but can anyone please check whether there is a way to make this code work?
A million thanks, Monks.
$in = "captionOutTagged.xml";
open (IN, $in) or die "can't open the infile $in \n";
#####
while (not eof (IN)){
    $line = <IN>;
    chomp $line;
    #print "$line\n\n";
    # airplanes models: have a digit at the end of second tag
    if ( @terms = $line =~ /\<M\>(.*?)\<\/M\d+?\>/gix ) {
        # print "=**$1**=\n";
    }
    # no number then avionics general terms
    elsif (@avionics = $line =~ /\<M\>(.+?)<\/'M'\>/gix) {
        #print " LINE: $line\n";
        #print "$1\n";
    }
    ##########
    foreach $term (@terms){
        print "$term\n";
    }
    foreach $avionic (@avionics){
        print "$avionic\n";
    }
    #########
} # end
America's first <M>swept-wing</M>, <M>multiengine jet</M> <M>bomber</M> was the <M>B-47 Stratojet</M200>, and the first <M>swept-wing fighter</M> was the <M>F-86 Sabre Jet</M201>. Both used new swept-wing data found in Germany after <M>World War II</M> and sent back to the United States by American scientists. This photograph, from <D>1951</D>, was taken the first time the two flew together over <PL>Kansas</PL>.
Current output:
swept-wing</M>, <M>multiengine jet</M> <M>bomber</M> was the <M>B-47 Stratojet
swept-wing fighter</M> was the <M>F-86 Sabre Jet
Desired output:
B-47 Stratojet
F-86 Sabre Jet
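For readers following along, here is a minimal sketch of one way to get the desired output. It is an illustration under stated assumptions, not necessarily what the replies proposed. A non-greedy `.*?` still starts at the first `<M>` it finds and stretches until a numbered closing tag appears, so it swallows the unnumbered elements in between. Replacing it with the negated class `[^<]*` stops the capture from crossing any tag boundary. The helper names `extract_models` and `extract_general_terms` are hypothetical, and the sketch assumes that `<M>` elements never contain nested tags.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): returns the text of every
# <M>...</M123> element, i.e. elements whose closing tag carries a number.
sub extract_models {
    my ($line) = @_;
    # [^<]* cannot cross a '<', so a match can never swallow an inner </M>.
    # Assumption: <M> elements contain plain text only, no nested tags.
    return $line =~ m{<M>([^<]*)</M\d+>}g;
}

# Hypothetical helper: text of every <M>...</M> element without a number.
sub extract_general_terms {
    my ($line) = @_;
    return $line =~ m{<M>([^<]*)</M>}g;
}

# Shortened copy of the sample line from the question.
my $sample = q{America's first <M>swept-wing</M>, <M>multiengine jet</M> <M>bomber</M> was the <M>B-47 Stratojet</M200>, and the first <M>swept-wing fighter</M> was the <M>F-86 Sabre Jet</M201>.};

print "$_\n" for extract_models($sample);
```

Run on the sample line, this prints B-47 Stratojet and F-86 Sabre Jet on separate lines. Both helpers are called on every line independently, which avoids the `if`/`elsif` arrangement in which a line containing both kinds of element only ever reports one kind.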
Replies are listed 'Best First'.
Re: Regexp Problem with greedy
by Enlil (Parson) on Sep 10, 2003 at 23:54 UTC
Re: Regexp Problem with greedy
by asarih (Hermit) on Sep 11, 2003 at 00:08 UTC
by Isanchez (Acolyte) on Sep 11, 2003 at 00:12 UTC
Re: Regexp Problem with greedy
by pzbagel (Chaplain) on Sep 10, 2003 at 23:44 UTC
Re: Regexp Problem with greedy
by davido (Cardinal) on Sep 10, 2003 at 23:52 UTC