in reply to Need RegExp help - doing an AND match
This works okay, though I'm not sure how it fares performance-wise compared with other methods.
```perl
#! perl -slw
use strict;

# Build one regex that matches only if every word passed in is present.
sub reAnd{
    my $re = '';
    # One anchored lookahead per word; quotemeta() defaults to $_.
    $re .= '(?=^.*\b' . quotemeta() . '\b)' for @_;
    return qr[$re];
}

my @words = qw[ an of and ];
my $re1 = reAnd( @words );
#print $re1;

my $re2 = reAnd( qw[ a great sweet mother by the wellfed voice beside him ] );
#print $re2;

while( <DATA> ) {
    # Note: a qr// object keeps its own flags, so the /i here does not
    # change how $re1 matches; the sample data matches in lower case anyway.
    m[$re1]i and print "1:$_";
    m[$re2] and print "2:$_";
}

__DATA__
Stephen, an elbow rested on the jagged granite, leaned his palm against
his brow and gazed at the fraying edge of his shiny black coat-sleeve.
Pain, that was not yet the pain of love, fretted his heart. Silently,
in a dream she had come to him after her death, her wasted body within
its loose brown graveclothes giving off an odour of wax and rosewood,
her breath, that had bent upon him, mute, reproachful, a faint odour of
wetted ashes. Across the threadbare cuffedge he saw the sea hailed as
a great sweet mother by the wellfed voice beside him. The ring of bay
and skyline held a dull green mass of liquid. A bowl of white china had
stood beside her deathbed holding the green sluggish bile which she had
torn up from her rotting liver by fits of loud groaning vomiting.
```
Prints
```
C:\test>624296.pl
1:its loose brown graveclothes giving off an odour of wax and rosewood,
2:a great sweet mother by the wellfed voice beside him. The ring of bay
```
The basic mechanism is to use a regex of the form (?=^.*\bword\b). That is, a positive lookahead assertion that reads: starting at the beginning of the line, skip as much of anything as needed to locate the word 'word', delimited by word/nonword transitions (\b).
As these are zero-length assertions, they do not advance the match point, so each additional one starts again from the beginning of the string. This gives the ability to match any number of words in any order. If they all match, the regex succeeds and the AND operation is achieved.
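To see the order independence concretely, here is a minimal sketch that rebuilds the same kind of pattern by hand. The helper name `re_and_demo` and the sample strings are made up for illustration; only the `(?=^.*\bword\b)` form comes from the post.

```perl
use strict;
use warnings;

# Hypothetical helper: one anchored lookahead per word, as described above.
sub re_and_demo {
    my $re = join '', map { '(?=^.*\b' . quotemeta($_) . '\b)' } @_;
    return qr/$re/;
}

my $re = re_and_demo( qw[ wax odour ] );

# Both words present, in either order: every lookahead succeeds.
my $forward  = 'an odour of wax'   =~ $re ? 1 : 0;
my $reversed = 'wax and an odour'  =~ $re ? 1 : 0;

# One word missing: its lookahead fails, so the whole regex fails.
my $missing  = 'an odour of ashes' =~ $re ? 1 : 0;

# \b stops a word matching inside a longer one ("waxed" is not "wax").
my $partial  = 'a waxed odour'     =~ $re ? 1 : 0;

print "forward=$forward reversed=$reversed missing=$missing partial=$partial\n";
# prints: forward=1 reversed=1 missing=0 partial=0
```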
By generating the regex in a sub, the 'horrors' of the 'bunch of regex' can be hidden from the squeamish.
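If you need case-independent matching, the /i has to go on the qr// inside the sub (return qr[$re]i). Adding /i where the generated regex is used has no effect, because a qr// object keeps the flags it was compiled with.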
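One caveat worth checking on your own Perl: a compiled qr// object carries its own flags, so an outer /i is not applied to it. The sketch below compares the two placements. The variable names and the sample line are illustrative, not part of the original script.

```perl
use strict;
use warnings;

# Same form the generator produces, for a single word.
my $pattern = '(?=^.*\bwax\b)';

my $plain  = qr/$pattern/;    # compiled without /i
my $nocase = qr/$pattern/i;   # /i baked in when the regex is compiled

my $line = 'An odour of WAX';

# The outer /i does not reach inside the already-compiled $plain.
my $outer_flag = $line =~ /$plain/i ? 1 : 0;

# The flag on the qr// itself does take effect.
my $inner_flag = $line =~ $nocase ? 1 : 0;

print "outer=$outer_flag inner=$inner_flag\n";
# prints: outer=0 inner=1
```

In terms of the sub, that would mean returning `qr[$re]i`, perhaps controlled by an extra argument.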
If you omit the ^, then the lookaheads will search from the current pos, and so you can append the AND operation to a longer regex. However, continuing to match after the AND has succeeded is more involved, because the lookaheads consume nothing and the match position is left where they started.
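A minimal sketch of the unanchored variant, assuming a made-up "label:" prefix as the longer regex. The helper `re_and_here` is hypothetical; it differs from the sub above only in dropping the ^.

```perl
use strict;
use warnings;

# Hypothetical variant: same lookaheads without the ^, so each one searches
# from the current match position rather than from the start of the line.
sub re_and_here {
    my $re = join '', map { '(?=.*\b' . quotemeta($_) . '\b)' } @_;
    return qr/$re/;
}

my $both = re_and_here( qw[ wax odour ] );

# Illustrative longer regex: a "label:" prefix, then the AND test on the rest.
my $labelled = qr/^(\w+):\s*$both/;

# Both words follow the label, so the AND succeeds.
my $after  = 'note: odour of wax' =~ $labelled ? 1 : 0;

# "wax" appears only in the label, before the position the lookaheads start at.
my $before = 'wax: an odour'      =~ $labelled ? 1 : 0;

print "after=$after before=$before\n";
# prints: after=1 before=0
```

After a successful match the position sits just past the label, since the assertions themselves are zero length; anything that should consume the matched words has to be written as ordinary pattern following the lookaheads.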