in reply to Re: Re: html page search/parse
in thread html page search/parse
Sorry, my bad. Try this.
#! perl -slw
use strict;

my $re = qr[
    <!--QBlastInfoBegin  # Match the start of comment
    \s+                  # 1 or more whitespace including newlines
    RID                  # 'RID' literal
    \s+                  # One or more whitespace
    =                    # '='
    \s+                  # more whitespace
    (                    # start capturing to $1
        [\d-]+           # 1 or more '0-9' or '-'
    )                    # end capture
    \s+                  # yet more whitespace
    RTOE                 # 'RTOE' literal
    \s+                  # And more whitespace
    =                    # '=' literal
    \s+                  # more
    (                    # start capture to $2
        \d+              # 1 or more digits
    )                    # end capture
    \s+                  # more whitespace
    QBlastInfoEnd        # the end token
    \s+                  # final whitespace (including newlines)
    -->                  # The end comment card
]x;                      # Ignore incidental spacing and comments in regex.

my $html = do{ local $/; <DATA> };   # Grab the data from <DATA> into a string

my( $RID, $RTOE ) = $html =~ $re;    # Execute the regex and assign the captures to variables.

print "RID:$RID RTOE:$RTOE";         # Print the results.

__DATA__
<!--QBlastInfoBegin
    RID = 1055976860-01972-17207
    RTOE = 7
QBlastInfoEnd
-->
Without the verbose commenting, the (now tested and working) regex looks like this:
my $re = qr[ <!--QBlastInfoBegin \s+ RID \s+ = \s+ ( [\d-]+ ) \s+ RTOE \s+ = \s+ ( \d+ ) \s+ QBlastInfoEnd \s+ --> ]x;
Each + means "match one or more of the preceding element", so every \s+ needs at least one whitespace character (spaces, tabs or newlines) at that point. See perlre and perlretut for more.
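If you want to see what the + buys you, here is a minimal sketch comparing it with * (zero or more). It is not part of the script above: $plus, $star and first_capture are names I made up for illustration, and the patterns are cut down to just the RID part.

```perl
use strict;
use warnings;

# Hypothetical patterns for illustration; only the quantifier differs.
my $plus = qr[ RID \s+ = \s+ ( [\d-]+ ) ]x;    # \s+ needs at least one whitespace character
my $star = qr[ RID \s* = \s* ( [\d-]+ ) ]x;    # \s* also accepts none at all

# Hypothetical helper: return the first capture, or undef if the pattern fails.
sub first_capture {
    my( $text, $pattern ) = @_;
    my( $capture ) = $text =~ $pattern;
    return $capture;
}

for my $text ( 'RID = 1055976860-01972-17207', 'RID=1055976860-01972-17207' ) {
    for my $case ( [ '\s+', $plus ], [ '\s*', $star ] ) {
        my( $label, $pattern ) = @$case;
        my $rid = first_capture( $text, $pattern );
        print "'$text' with $label: ", defined $rid ? "matched $rid" : 'no match', "\n";
    }
}
```

The string with spaces around the = matches both patterns; the one without spaces matches only the \s* version. If the real pages might ever omit that whitespace, swapping + for * on the \s elements would be the thing to try.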