Problem:
You want to perform a pattern match while constraining the match to the first half of the string being tested.

The rules:
Because of a sick mental condition you decide not to use substr. Furthermore, you decide that it must be a pure regexp approach, though experimental (?{...}) features are permissible.

This insane rule prevents you from using the following two perfectly good possibilities:

# Rejected possibility one:
substr( $string, 0, length( $string ) / 2 ) =~ m/pattern/;

# Rejected possibility two:
# (round up so that .{$half} is always a valid integer quantifier)
my $half = int( ( length( $string ) + 1 ) / 2 );
$string =~ m/pattern.{$half}/s;

Yes, those will both work, but where's the fun in that?!
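For comparison, here is a minimal sketch of the two rejected approaches wrapped as subs, so their results can be checked against the regexp-only versions. The helper names first_half_substr and first_half_tail are made up for illustration, and the sketch assumes the midpoint is rounded down for odd-length strings.

```perl
use strict;
use warnings;

# Hypothetical helper: match only against the first half of the string.
sub first_half_substr {
    my ( $string, $re ) = @_;
    my $half = int( length($string) / 2 );    # midpoint, rounded down
    return substr( $string, 0, $half ) =~ $re ? 1 : 0;
}

# Hypothetical helper: require the match to be followed by at least
# as many characters as make up the second half of the string.
sub first_half_tail {
    my ( $string, $re ) = @_;
    my $tail = length($string) - int( length($string) / 2 );
    return $string =~ m/(?:$re).{$tail}/s ? 1 : 0;
}

for my $s (qw( 1230000000 0012300000 0001230000 )) {
    printf "%s  substr=%d  tail=%d\n", $s,
        first_half_substr( $s, qr/123/ ),
        first_half_tail( $s, qr/123/ );
}
```

Both helpers should agree with each other on every line of the test data used below, which makes them a handy reference when experimenting with the regexp-only versions.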

A solution:
This tender unsavory morsel will constrain matching to the first half of the string being tested. It's a regexp approach, in that the expression could stand alone without the aid of external calculations or flags. Here goes:

use strict;      # Because you should.
use warnings;    # Because you want to.
use re 'eval';   # Take your life in your own hands.

my $find = qr/123/;    # This is what we're searching for.

while ( <DATA> ) {
    chomp;
    if ( m/$find                        # Match the substring.
           (?{ (pos()<=length($_)*.5)   # Test to see if our match
                                        # occurred in the first half
                                        # of the string.
               ? ''                     # Yes: pass a subexpression
                                        # that can't fail.
               : '\w\b\w'               # No: pass a subexpression
                                        # that always fails.
           })
           (??{$^R})                    # Evaluate the passed
                                        # subexpression.
         /x
    ) {
        print "$_ matched.\n";
    }
    else {
        print "$_ didn't match.\n";
    }
}

__DATA__
1230000000
0123000000
0012300000
0001230000
0000123000
0000012300
0000001230
0000000123

The output will be...

1230000000 matched.
0123000000 matched.
0012300000 matched.
0001230000 didn't match.
0000123000 didn't match.
0000012300 didn't match.
0000001230 didn't match.
0000000123 didn't match.

Caveats:
This solution is pretty messy compared to the simple substr solution: it's mostly a solution to a nonexistent problem. But it's a proof of concept that came out of some personal tinkering... a Perl meditation. ;)

It would be a whole lot simpler if the (?{...}) construct could cause a match to succeed or fail based on the return value of the code it contains, but to my knowledge no such provision exists. At least $^R allows the return value of the code to be passed to (??{...}), a construct that can cause matches to succeed or fail.
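To see the $^R hand-off in isolation, here is a small sketch that uses the same two-step trick with a different test: it accepts a string of digits only if its length is even. The sub name even_digit_run is invented for this example, and (?!) is used as the always-failing subexpression.

```perl
use strict;
use warnings;

# Hypothetical helper: true if $string consists solely of an
# even number of digits.
sub even_digit_run {
    my ($string) = @_;
    return $string =~ m/
        \A (\d+) \z
        (?{ length($1) % 2 == 0 ? '' : '(?!)' })   # result lands in the R variable
        (??{ $^R })                                # compile and run that result as a pattern
    /x ? 1 : 0;
}

for my $s (qw( 1234 123 12a4 )) {
    print "$s: ", ( even_digit_run($s) ? "accepted" : "rejected" ), "\n";
}
```

The first code block only computes a string; it is the (??{...}) that follows which turns that string into something that can pass or fail.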

Thanks for listening to my meditation. I hope it prompts additional discussion on the interesting topic of Perl's experimental regexp features.

Update:
I figured out how to use the (?(?{...}) ex | ex ) construct, and it simplifies my original regexp by eliminating the need for (??{$^R}). The principle is roughly the same, though. Here it is:

use strict;
use warnings;
use re 'eval';

my $find = qr/123/;

while ( <DATA> ) {
    chomp;
    if ( m/$find(?(?{(pos()<=length($_)*.5)})|\w\b\w)/ ) {
        print "$_ matched.\n";
    }
    else {
        print "$_ didn't match.\n";
    }
}

__DATA__
1230000000
0123000000
0012300000
0001230000
0000123000
0000012300
0000001230
0000000123

Enjoy!


Dave

Replies are listed 'Best First'.
Re: Perl's regexp (?{...}) construct and constraining matches.
by diotalevi (Canon) on Aug 14, 2004 at 11:24 UTC

    Use (?=) for success and (?!) for failure. That is easier to understand and more direct than \w\b\w. With /x for the whitespace, you end up with (?(?{ PERL-CODE }) (?=) | (?!) ). You can write only the failure condition by negating the result of the Perl code: (?(?{ not PERL-CODE }) (?!) )

      Thanks diotalevi, I was struggling to come up with a guaranteed failure condition. \w\b\w works, but your solution is more elegant.

      Implementing your suggestions, the regexp now looks like:

      m/$find(?(?{not(pos()<=length($_)*.5)})(?!))/

      Starting to look a little saner.
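
      Spread out under /x and wrapped in a sub, the same idea might look like the sketch below. The sub name in_first_half is made up for this example, and the condition is written directly as "ended past the midpoint" instead of negating the original test.

```perl
use strict;
use warnings;
use re 'eval';    # required: $re is interpolated into a pattern with a code block

# Hypothetical helper: true if $re matches and the match ends
# within the first half of $string.
sub in_first_half {
    my ( $string, $re ) = @_;
    local $_ = $string;    # the code block reads pos() and length($_)
    return m/
        $re
        (?(?{ pos() > length($_) * 0.5 })   # did the match end past the midpoint?
            (?!)                            # yes: force failure
        )
    /x ? 1 : 0;
}

for my $s (qw( 0012300000 0001230000 )) {
    print "$s ", ( in_first_half( $s, qr/123/ ) ? "matched" : "didn't match" ), ".\n";
}
```

      With the test data from the original post, 0012300000 should match and 0001230000 should not, the same as before.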


      Dave

        Of course, as with everything else, sanity is relative… :-)

        Makeshifts last the longest.

        You know, would it kill you to call your mother? She's been waiting to tell you your regexes need more whitespace.