in reply to Pattern matching: Lazy vs. greedy

Your problem statement doesn't say which "the" and which "dog" you are interested in. For instance, given the input:
The black dog danced around the sleeping dog.

...and the endpoints of "the" and "dog", it seems you want the minimal coverage. Here is where you need some test cases to demonstrate what you will and won't accept.

The way the regex engine works, once it starts to match at a given position, say at the first "the", it will exhaust all options from that starting point before moving on to the next "the".
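To make that concrete, here is a small sketch (the sample sentence and variable names are only for illustration). Even with a lazy quantifier, the match begins at the leftmost "the" that can succeed, so the later, shorter candidate is never preferred:

```perl
use strict;
use warnings;

my $first  = "the";
my $last   = "dog";
my $string = "The brown bear leaped over the lazy dog.";

# Lazy .*? keeps the match as short as possible *from a given start*,
# but the start itself is still the leftmost position that can match.
my ($match) = $string =~ m/\b($first\b.*?\b$last)\b/i;

print "$match\n";
# The brown bear leaped over the lazy dog
```

So "lazy" shortens the right-hand side only; it does nothing about an earlier "the" on the left.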

One rule might be to accept only a match in which the endpoints are not repeated inside it. But the following doesn't work:

my $first  = "the";
my $last   = "dog";
my $string = "The black dog danced around the sleeping dog.";
# Attempt to forbid a second $first with a negative lookahead (does not work)
my @matches = $string =~ m/\b($first\b(?!.*?$first.*?)\b$last)\b/g;
There doesn't seem to be a good way to say "I don't want $first anywhere in this part", except to do another match. Combine this with Athanasius's solution:
my $first = "the";
my $last  = "dog";
my @strings = (
    "The black dog danced around the sleeping dog.",
    "The brown bear leaped over the lazy dog.",
);

for my $string (@strings) {
    # A capture inside a lookahead collects overlapping candidates.
    my @match = $string =~ m/(?=\b($first\b.*?\b$last)\b)/gi;
    for my $match (@match) {
        # Count whole-word occurrences of each endpoint in the candidate.
        my @firsts = $match =~ m/\b($first)\b/gi;
        my @lasts  = $match =~ m/\b($last)\b/gi;
        if ((@firsts == 1) and (@lasts == 1)) {
            print "$match\n";
        }
    }
}

# The black dog
# the sleeping dog
# the lazy dog
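If the second match feels heavy, one pattern-only alternative that may be worth testing is to guard each character of the middle section with a negative lookahead, so the middle cannot run across another whole-word $first. This is only a sketch on the same two sample strings, not part of Athanasius's solution; it assumes the endpoint words contain no regex metacharacters (otherwise wrap them in \Q...\E), and it returns non-overlapping matches only:

```perl
use strict;
use warnings;

my $first = "the";
my $last  = "dog";

my @strings = (
    "The black dog danced around the sleeping dog.",
    "The brown bear leaped over the lazy dog.",
);

my @found;
for my $string (@strings) {
    # (?:(?!\b$first\b).)*? consumes one character at a time, but only
    # while another whole-word $first does not begin at that character.
    push @found, $string =~ m/\b($first\b(?:(?!\b$first\b).)*?\b$last)\b/gi;
}

print "$_\n" for @found;
# The black dog
# the sleeping dog
# the lazy dog
```

Because the lazy middle stops at the first whole-word $last, each match also contains exactly one $last, so on these inputs it agrees with the two-pass filter above.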

-QM
--
Quantum Mechanics: The dreams stuff is made of

Re^2: Pattern matching: Lazy vs. greedy
by false_friend (Novice) on Mar 30, 2015 at 12:27 UTC
    Dear QM, Thank you for your suggestion. In the specific case I am working on here, I can’t categorically rule out repetitions of the first word, but I’ll keep your solution in mind.
      ... I can’t categorically rule out repetitions of the first word, ...

      Can you elaborate on the rules or goals you have in mind?

      I would guess something like "shortest matching string" or "string with the smallest number of words" (for some value of $words). It's not necessarily easy to come up with this, but you should be able to list positive and negative examples to help tune the solution.

      And most of us are just nerdy enough to want more specifics so we can solve it, or near enough. (Allowing the dreams of examples and counter-examples to be replaced once again by the more familiar nightmares of GitHub DDoSes or Linus rants.)

      -QM
      --
      Quantum Mechanics: The dreams stuff is made of