
Hmm, good point. Well, you can fix part of it with (?=.*\bfred\b (\w+)), but the pattern still matches when none of the targets is present, and that remains a problem. The way I suggested will always get three matches because it says find "zero or more of this". That means either you can't actually find out how many non-empty matches you got, or you can only match when all the targets are present. Separate matches in a loop, as suggested by several others, are the way to go. Here's my take, redux:

    my $string = "This is bLarney rubble and his friends joe rockhead and fred flintstone";
    my @elements;
    my $count = 0;
    for my $target (qw(fred barney joe)) {
        if ( $string =~ /(?=.*\b$target (\w+))/i ) {
            push @elements, $1;
            $count++;
        }
        else {
            push @elements, '';   # as a placeholder
        }
    }
    if ($count >= 2) {
        print join('_', @elements), "_inc\n";
    }
    else {
        print "Didn't find at least 2 elements in the string\n";
    }
    # prints flintstone__rockhead_inc ('barney' is not found, so its slot stays empty)
    # change 'joe' to 'moe' and you get > Didn't find at least 2 elements in the string
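For what it's worth, the same loop can be wrapped in a sub. This is only a sketch: the name capture_after is made up for illustration, and it makes two changes that are my assumptions rather than anything from the thread. It uses \Q...\E so a target containing regex metacharacters is taken literally, and it drops the lookahead, since each target gets its own match. The plain match takes the first occurrence of a target, where the greedy .* in the lookahead version takes the last.

```perl
use strict;
use warnings;

# Hypothetical helper (the name is illustrative, not from the thread).
# For each target word, capture the word that follows it. An empty
# string is kept as a placeholder when the target is absent.
# Returns the number of targets found, followed by the captured words.
sub capture_after {
    my ($string, @targets) = @_;
    my @elements;
    my $count = 0;
    for my $target (@targets) {
        # \Q...\E quotes metacharacters in the target; \b on both sides
        # keeps 'fred' from matching inside 'alfred' or 'freddy'.
        if ( $string =~ /\b\Q$target\E\b\s+(\w+)/i ) {
            push @elements, $1;
            $count++;
        }
        else {
            push @elements, '';
        }
    }
    return ($count, @elements);
}

my $string = "This is bLarney rubble and his friends joe rockhead and fred flintstone";
my ($count, @elements) = capture_after($string, qw(fred barney joe));
print $count >= 2
    ? join('_', @elements) . "_inc\n"
    : "Didn't find at least 2 elements in the string\n";
```

Returning the count along with the list keeps the "at least 2" decision in the caller, so the threshold can change without touching the matching code.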

There ought to be something useful in there. :-)

--marmot

Re^4: Need a Regular Expression that tests for words in different order and captures the values found.
by AnomalousMonk (Archbishop) on Jan 15, 2010 at 20:11 UTC

    I missed the '2 or 3' requirement on first reading of the OP. For the sake of maintainability if nothing else, a looping (loopy?) approach may, as you say, be the way to go.

    However, there is a simple way to deal with the undefined values produced by zero-quantified captures:

    >perl -wMstrict -le
    "my $bound = qr{ (?<! [\w-]) }xms;
     my $A = qr{ (?= .* $bound A \s+ (\w+)) }xms;
     my $B = qr{ (?= .* $bound B \s+ (\w+)) }xms;
     my $C = qr{ (?= .* $bound C \s+ (\w+)) }xms;
     my $extract = qr{ \A $A? $B? $C? }xms;
     print '-------------------';
     for my $line (@ARGV) {
       my $s = join '_', my @got = grep defined, $line =~ $extract;
       $s = 'no match' if @got < 2 or @got > 3;
       print qq{'$line': '$s'};
       }
    "
    "B Bee C Cee A Aye" "foo C Cee bar A Aye baz B Bee zzz" "A Aye B Bee"
    "C Cee foo B Bee" "xxx C Chuck yyyy A Able zzz" "A Aye A Aye B Bee"
    foo "A Aye" "A Aye pseudo-B Bee" "A Aye XYZB Bee" "A Aye A Aye A Aye"
    -------------------
    'B Bee C Cee A Aye': 'Aye_Bee_Cee'
    'foo C Cee bar A Aye baz B Bee zzz': 'Aye_Bee_Cee'
    'A Aye B Bee': 'Aye_Bee'
    'C Cee foo B Bee': 'Bee_Cee'
    'xxx C Chuck yyyy A Able zzz': 'Able_Chuck'
    'A Aye A Aye B Bee': 'Aye_Bee'
    'foo': 'no match'
    'A Aye': 'no match'
    'A Aye pseudo-B Bee': 'no match'
    'A Aye XYZB Bee': 'no match'
    'A Aye A Aye A Aye': 'no match'
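For anyone who would rather read that as a script than as a one-liner, here is a rough equivalent. It is a sketch under assumptions: the helper names lookahead_for and extract_values are invented for illustration, and the patterns are meant to be the same as the ones above, only built in a loop. In list context a successful match returns one value per capture group, a group inside a skipped optional lookahead comes back as undef, and grep defined throws those away.

```perl
use strict;
use warnings;

# Left boundary from the one-liner: the key must not be preceded
# by a word character or a hyphen.
my $bound = qr{ (?<! [\w-]) }xms;

# Hypothetical helper: lookahead that captures the word after $key,
# wherever $key appears in the string.
sub lookahead_for {
    my ($key) = @_;
    return qr{ (?= .* $bound $key \s+ (\w+)) }xms;
}

my ($A, $B, $C) = map { lookahead_for($_) } qw(A B C);
my $extract = qr{ \A $A? $B? $C? }xms;

# Hypothetical helper: return only the values that were actually found,
# in A, B, C order.
sub extract_values {
    my ($line) = @_;
    # Each optional lookahead contributes one capture; unmatched ones
    # are undef, so filter them out before counting or joining.
    return grep { defined } $line =~ $extract;
}

for my $line ('B Bee C Cee A Aye', 'C Cee foo B Bee', 'A Aye') {
    my @got = extract_values($line);
    my $s = (@got >= 2 && @got <= 3) ? join('_', @got) : 'no match';
    print "'$line': '$s'\n";
}
```

Because the undefined slots are dropped rather than replaced by placeholders, the result tells you how many targets matched but not which ones. If the positions matter, keep the raw list and test each slot with defined instead.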
      Nice. Had to ponder that a moment to follow it. :-)