Re: Re: html page search/parse

Re: Re: Re: html page search/parse by BrowserUk (Patriarch) on Jun 19, 2003 at 00:09 UTC
Sorry, my bad. Try this.

    #! perl -slw
    use strict;

    my $re = qr[
        <!--QBlastInfoBegin   # Match the start of comment
        \s+                   # 1 or more whitespace including newlines
        RID                   # 'RID' literal
        \s+                   # One or more whitespace
        =                     # '='
        \s+                   # more whitespace
        (                     # start capturing to $1
            [\d-]+            # 1 or more '0-9' or '-'
        )                     # end capture
        \s+                   # yet more whitespace
        RTOE                  # 'RTOE' literal
        \s+                   # And more whitespace
        =                     # '=' literal
        \s+                   # more
        (                     # start capture to $2
            \d+               # 1 or more digits
        )                     # end capture
        \s+                   # more whitespace
        QBlastInfoEnd         # the end token
        \s+                   # final whitespace (including newlines)
        -->                   # The end comment card
    ]x;                       # Ignore incidental spacing and comments in regex.

    my $html = do{ local $/; <DATA> };  # Grab the data from <DATA> into a string
    my( $RID, $RTOE ) = $html =~ $re;   # Execute the regex and assign the captures to variables.
    print "RID:$RID RTOE:$RTOE";        # Print the results.

    __DATA__
    <!--QBlastInfoBegin
        RID = 1055976860-01972-17207
        RTOE = 7
    QBlastInfoEnd
    -->

Without the verbose commenting, the (now tested and working) regex looks like this:

`my $re = qr[ <!--QBlastInfoBegin \s+ RID \s+ = \s+ ( [\d-]+ ) \s+ RTOE \s+ = \s+ ( \d+ ) \s+ QBlastInfoEnd \s+ --> ]x;`

The +'s mean "match 1 or more of the preceding element". See perlre and perlretut for more.

Examine what is said, not who speaks.
"Efficiency is intelligent laziness." -David Dunham
"When I'm working on a problem, I never think about beauty. I think only how to solve the problem. But when I have finished, if the solution is not beautiful, I know it is wrong." -Richard Buckminster Fuller
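If you want to use this on a page fetched at run time rather than on `<DATA>`, something like the following might do as a starting point. It is only a sketch: the sub name `extract_qblast_info` and the sample `$page` string are made up for illustration, and it assumes the comment block always has the layout shown above. It wraps the same regex in a sub and checks that the match actually succeeded before using the captures.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): returns (RID, RTOE)
# on success, or an empty list when the QBlastInfo comment is not found.
sub extract_qblast_info {
    my ($html) = @_;
    my $re = qr[
        <!--QBlastInfoBegin \s+
        RID  \s+ = \s+ ( [\d-]+ ) \s+
        RTOE \s+ = \s+ ( \d+ )    \s+
        QBlastInfoEnd \s+
        -->
    ]x;
    # In list context a successful match returns the two captures;
    # a failed match returns the empty list.
    return $html =~ $re;
}

# Made-up sample page, assumed to mirror the layout of the real comment.
my $page = "<html>\n<!--QBlastInfoBegin\n    RID = 1055976860-01972-17207\n"
         . "    RTOE = 7\nQBlastInfoEnd\n-->\n</html>\n";

if ( my ( $rid, $rtoe ) = extract_qblast_info($page) ) {
    print "RID:$rid RTOE:$rtoe\n";
}
else {
    warn "QBlastInfo comment not found\n";
}
```

Testing the list assignment in the `if` avoids "uninitialized value" warnings when the page does not contain the comment at all.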
Re: Re: Re: html page search/parse by The Mad Hatter (Priest) on Jun 18, 2003 at 23:45 UTC
He didn't allow for optional whitespace around the two equals signs. Try this version:

`my $re = qr[<!--QBlastInfoBegin \s+ RID \s* = \s* ([\d-]+) \s+ RTOE \s* = \s* (\d+) \s+ QBlastInfoEnd \s+ -->]x;`

As for the pluses, they are quantifiers: each one makes the preceding element match one or more times, which in this case means one or more whitespace characters. See `perldoc perlre` for more info.
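To see what the quantifier choice changes, here is a small sketch (the variable names and sample strings are mine, not from the thread) that runs a `\s+` pattern and a `\s*` pattern over the same inputs. `\s+` insists on at least one whitespace character, while `\s*` also accepts none.

```perl
use strict;
use warnings;

# Cut-down patterns for illustration only; they cover just the RID part.
my $strict  = qr[ RID \s+ = \s+ ([\d-]+) ]x;   # whitespace required around '='
my $lenient = qr[ RID \s* = \s* ([\d-]+) ]x;   # whitespace optional around '='

for my $line ( 'RID = 123-45', 'RID=123-45' ) {
    my $s = $line =~ $strict  ? 'match' : 'no match';
    my $l = $line =~ $lenient ? 'match' : 'no match';
    print "'$line': \\s+ gives $s, \\s* gives $l\n";
}
```

Both patterns accept `RID = 123-45`, but only the `\s*` version accepts `RID=123-45`, so it is the safer choice if you are not sure how the server formats the comment.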