in reply to Non-exclusive or in regexp?

My first thought was:

$text =~ s/(?:national security adviser |dr\. |doctor |condoleeza )+rice/condoleezarice/ig;

But that is not what you want: it accepts the prefixes in any order and any number of times. You want exact. So ... we need to get down to the tricky stuff ... ;-)

$text =~ s/(?!rice) (?:dr\.\ |doctor\ )? (?:condoleeza\ )? rice /condoleezarice/igx;

I just love trickery! :-)

print "Just another Perl ${\(trickster and hacker)},"
The Sidhekin proves Sidhe did it!

Re^2: Non-exclusive or in regexp?
by tachyon (Chancellor) on Sep 06, 2004 at 02:19 UTC

    This is a very neat trick. It took a few moments to work out how it works ++

    cheers

    tachyon

Re^2: Non-exclusive or in regexp?
by cormanaz (Deacon) on Sep 06, 2004 at 12:04 UTC
    Well it works but I'm clearly way out-monked here because I don't have a clue how! Can someone enlighten me?
      Well it works but I'm clearly way out-monked here because I don't have a clue how! Can someone enlighten me?

      The trick lies in the zero-width assertion, (?!rice): the match fails if none of the optionals that follow it match anything. (It asserts that "rice" does not come next, but if none of the optionals match, "rice" is exactly what comes next.)
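      To see the negative lookahead at work, here is a small sketch that wraps the substitution in a helper and runs it on a few sample strings. The helper name normalize_neg and the sample strings are made up for illustration; the regex itself is the one from the post above.

```perl
use strict;
use warnings;

# Hypothetical helper: applies the negative-lookahead substitution.
sub normalize_neg {
    my ($text) = @_;
    # (?!rice) rejects any position where "rice" comes next, so at least
    # one of the optional prefixes has to consume something before "rice".
    $text =~ s/(?!rice) (?:dr\.\ |doctor\ )? (?:condoleeza\ )? rice /condoleezarice/igx;
    return $text;
}

# Made-up sample inputs: two with a prefix, one with bare "rice".
for my $sample ('Dr. Rice spoke', 'Condoleeza Rice spoke', 'I ate rice') {
    printf "%-24s => %s\n", $sample, normalize_neg($sample);
}
```

      The first two samples get rewritten; the bare "rice" in the third is left alone, because the only position where "rice" could match is the one the lookahead rejects.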

      I guess my answer was rather on the minimalist side. So, here is a commented version. But I won't just comment the same version I made first -- what's the fun in that? This time I'll use a positive lookahead zero-width assertion, not a negative one. And I'll borrow the variables from tachyon's answer, including the job description that I somehow forgot the first time around.

      So, here goes:

      # Note, tachyon, we need the /i modifier on each of these as well:
      my $job_desc = qr/\b national\s+security\s+adviser \s+/ix;
      my $title    = qr/\b (?:dr\.|doctor) \s+/ix;
      my $f_name   = qr/\b condoleeza \s+/ix;
      my $surname  = qr/\b rice \b/ix;
      my $repl     = 'condoleezarice';

      $text =~ s/# Zero-width assertion: One of the options must follow:
                 (?= $job_desc | $title | $f_name )
                 $job_desc ?   # Option 1: job description
                 $title ?      # Option 2: title
                 $f_name ?     # Option 3: first name
                 $surname      # Surname -- not optional
                /$repl/gx;

      (If the surname could match any of the optionals, this would not work. Neither would the negative lookahead, though it would fail differently. But that is a problem I'll face no sooner than I need to.)
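      A quick sketch of both the normal case and that caveat, for the positive-lookahead version. The helper normalize_pos, the sample strings, and the surname "doctor" are all invented for the demonstration; the pattern pieces follow the ones in the post.

```perl
use strict;
use warnings;

my $job_desc = qr/\b national\s+security\s+adviser \s+/ix;
my $title    = qr/\b (?:dr\.|doctor) \s+/ix;
my $f_name   = qr/\b condoleeza \s+/ix;

# Hypothetical helper: the same substitution, with surname and
# replacement passed in so that different surnames can be tried.
sub normalize_pos {
    my ($text, $surname, $repl) = @_;
    $text =~ s/(?= $job_desc | $title | $f_name )   # one option must follow
               $job_desc ? $title ? $f_name ? $surname
              /$repl/gx;
    return $text;
}

# Normal case: "rice" cannot be mistaken for any of the optionals.
my $rice = qr/\b rice \b/ix;
print normalize_pos('National Security Adviser Dr. Rice said', $rice, 'condoleezarice'), "\n";
print normalize_pos('rice pudding', $rice, 'condoleezarice'), "\n";

# Caveat: a made-up surname that the title alternative also matches.
# The lookahead is satisfied by the surname itself, so a bare surname
# gets replaced even though no optional precedes it.
my $doctor = qr/\b doctor \b/ix;
print normalize_pos('the doctor is in', $doctor, 'X'), "\n";
```

      In the last call the lookahead sees "doctor " as a title, the optionals then backtrack to nothing, and the surname matches the very same word.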

      Update: Turns out there is an implied grouping with qr//, so I have removed the explicit grouping from my code.
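      For the curious, a minimal sketch of what that implied grouping buys you. The variable names and test strings are invented for illustration, and the exact stringified form of a qr// object differs between Perl versions, so the code only prints it and does not rely on it.

```perl
use strict;
use warnings;

# A compiled alternation with no explicit (?:...) around it.
my $alt = qr/dr\.|doctor/i;

# Interpolated into a larger pattern, the qr// object carries its own
# group, so ^ and \s+rice apply to both branches of the alternation.
my $full = qr/^$alt\s+rice$/i;

# The same alternation as a plain string has no such group: the pattern
# becomes  ^dr\.  OR  doctor\s+rice$  which is much looser.
my $plain = 'dr\.|doctor';
my $loose = qr/^$plain\s+rice$/i;

print "stringified qr: $alt\n";    # exact form varies by Perl version
for my $sample ('doctor rice', 'Dr. Rice', 'Dr. Who') {
    printf "%-12s grouped=%s plain=%s\n", $sample,
        ($sample =~ $full  ? 'yes' : 'no'),
        ($sample =~ $loose ? 'yes' : 'no');
}
```

      Only the plain-string variant accepts 'Dr. Who', since its first branch is just an anchored "dr.".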

      print "Just another Perl ${\(trickster and hacker)},"
      The Sidhekin proves Sidhe did it!