http://qs1969.pair.com?node_id=576470

chrism01 has asked for the wisdom of the Perl Monks concerning the following question:

Monks
Conceptually, I'd like to match any value of string $str1 that starts (exactly) with another string ($str2), both stored in variables, e.g.:
if( $str1 =~ /^$str2/ ) { print "match\n"; } else { print "nomatch\n"; }
Given:
$str2 = ".1.2.3.4";

I'd like it to match if:

$str1 =".1.2.3.4 lhkfjd";

but not if $str1 is any of

"x1.2.3.4 lhkfjd";
" 1.2.3.4 lhkfjd";
"..1.2.3.4 lhkfjd";
etc

Unfortunately, I can't seem to get the syntax quite right, although I have looked at various Perl regex tutorials, including the ones in perldoc.
A solution with an explanation would be deeply appreciated.
Cheers
Chris

Re: Regex Start Anchor with variables
by Samy_rio (Vicar) on Oct 05, 2006 at 05:52 UTC

    Hi chrism01, use the quotemeta escapes "\Q" and "\E".

    if( $str1 =~ /^\Q$str2\E/ ) { print "match\n"; } else { print "nomatch\n"; }

    Updated

    See the perldoc documentation for quotemeta (perldoc -f quotemeta).

    In $str2 = ".1.2.3.4"; the dots (.) are regex metacharacters: each one matches any character except a newline. To match them literally, the dots must be escaped, which is what "\Q" does.
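To make the difference concrete, here is a minimal sketch using the strings from the question. The sub names loose_match and strict_match are made up for this illustration only.

```perl
use strict;
use warnings;

my $str2 = ".1.2.3.4";

# Unescaped: every "." in $str2 acts as a wildcard (any character except newline).
sub loose_match  { my ($str1) = @_; return $str1 =~ /^$str2/     ? 1 : 0 }

# Escaped with \Q...\E: every "." in $str2 must be a literal dot.
sub strict_match { my ($str1) = @_; return $str1 =~ /^\Q$str2\E/ ? 1 : 0 }

for my $str1 (".1.2.3.4 lhkfjd", "x1.2.3.4 lhkfjd", " 1.2.3.4 lhkfjd", "..1.2.3.4 lhkfjd") {
    printf "%-20s loose=%d strict=%d\n", "'$str1'", loose_match($str1), strict_match($str1);
}
```

The unescaped pattern also accepts the "x1.2.3.4" and " 1.2.3.4" inputs, because the leading dot matches any character; the escaped pattern accepts only the first string.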

    Regards,
    Velusamy R.


    eval"print uc\"\\c$_\""for split'','j)@,/6%@0%2,`e@3!-9v2)/@|6%,53!-9@2~j';

Re: Regex Start Anchor with variables
by jwkrahn (Abbot) on Oct 05, 2006 at 05:55 UTC
    You probably don't need a regular expression for this:
    if ( 0 == index $str1, $str2 ) { print "match\n"; } else { print "nomatch\n"; }
    However, if you really want to use a regular expression then you need to escape the meta-characters in $str2:
    if( $str1 =~ /^\Q$str2/ ) { print "match\n"; } else { print "nomatch\n"; }
Re: Regex Start Anchor with variables
by ikegami (Patriarch) on Oct 05, 2006 at 06:13 UTC
    An alternative to \Q is quotemeta.
    my $re = quotemeta($str2); if( $str1 =~ /^$re/ ) { print "match\n"; } else { print "nomatch\n"; }

    index (already shown) is faster.

Re: Regex Start Anchor with variables
by japhy (Canon) on Oct 05, 2006 at 12:20 UTC
    Even faster than index() or a regex -- I would imagine -- would be substr() and eq together.
    if (substr($large, 0, length($small)) eq $small) { ... }
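As a rough sketch, the substr-and-eq test can be wrapped in a small helper. The name starts_with_substr is hypothetical, not something from the thread or from core Perl.

```perl
use strict;
use warnings;

# True if $large begins with $small, compared as plain strings (no regex involved).
sub starts_with_substr {
    my ($large, $small) = @_;
    # If $large is shorter than $small, substr just returns what is there,
    # so the eq comparison fails without any special-casing.
    return substr($large, 0, length $small) eq $small;
}

print starts_with_substr(".1.2.3.4 lhkfjd", ".1.2.3.4") ? "match\n" : "nomatch\n";
print starts_with_substr("x1.2.3.4 lhkfjd", ".1.2.3.4") ? "match\n" : "nomatch\n";
```

Because nothing is interpolated into a pattern, the dots in $small need no escaping here.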

    Jeff japhy Pinyan, P.L., P.M., P.O.D, X.S.: Perl, regex, and perl hacker
    How can we ever be the sold short or the cheated, we who for every service have long ago been overpaid? ~~ Meister Eckhart
      Even faster than index() or a regex -- I would imagine -- would be substr() and eq together.

      Why imagine when you can Benchmark?

      use Benchmark 'cmpthese';

      my $str1    = 'foobarbazqux';
      my $str2    = 'foobar';
      my $str2len = length $str2;

      cmpthese (3e6, {
          regex   => sub { $str1 =~ /\Q$str2/; },
          regex2  => sub { $str1 =~ /\Q$str2/o; },  ## probably useless test, see japhy's comment below
          index   => sub { index $str1, $str2; },
          substr  => sub { $str2 eq substr $str1, 0, length $str2; },
          substr2 => sub { $str2 eq substr $str1, 0, $str2len; },
      });

      __END__
                   Rate   regex  regex2  substr substr2   index
      regex   1063830/s      --    -13%    -25%    -43%    -72%
      regex2  1229508/s     16%      --    -14%    -34%    -68%
      substr  1421801/s     34%     16%      --    -23%    -63%
      substr2 1851852/s     74%     51%     30%      --    -51%
      index   3797468/s    257%    209%    167%    105%      --

      The index solution is clearly superior to the others. It's interesting to see the improvements of regex2 and substr2 over their unoptimized counterparts.

      Update: I forgot that the OP was matching at the beginning of the string. This is a better benchmark:

      use Benchmark 'cmpthese';

      my $str1    = 'foobarbazqux';
      my $str2    = 'foobar';
      my $str2len = length $str2;

      cmpthese (3e6, {
          regex   => sub { $str1 =~ /^\Q$str2/; },
          index   => sub { 0 == index $str1, $str2; },
          substr  => sub { $str2 eq substr $str1, 0, length $str2; },
          substr2 => sub { $str2 eq substr $str1, 0, $str2len; },
      });

      __END__
                   Rate   regex  substr substr2   index
      regex    671141/s      --    -59%    -65%    -71%
      substr  1630435/s    143%      --    -16%    -30%
      substr2 1935484/s    188%     19%      --    -17%
      index   2343750/s    249%     44%     21%      --

      --
      David Serrano

        I wouldn't bring /o into the equation. It makes a regex "sterile" which wouldn't be helpful if this were in a function where the strings are arguments.

        You haven't provided any data for failing cases. The further the smaller string is from the beginning of the bigger string, the slower index() will be.
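A small sketch of such a failing case follows. The helper names and the test string are assumptions made for illustration, and the rindex variant is an alternative idiom that nobody in the thread benchmarked. With a start position of 0, rindex can only report a match that begins at offset 0, so it never needs to look further into the string.

```perl
use strict;
use warnings;

# index() searches forward through the string, so a miss at offset 0
# does not stop it from looking at the rest.
sub starts_with_index  { my ($str, $prefix) = @_; return index($str, $prefix) == 0 }

# rindex() with position 0 only considers a match starting at offset 0.
sub starts_with_rindex { my ($str, $prefix) = @_; return rindex($str, $prefix, 0) == 0 }

# Failing case: the smaller string occurs only at the very end.
my $long = ("x" x 100_000) . "foobar";

print starts_with_index($long, "foobar")  ? "match\n" : "nomatch\n";
print starts_with_rindex($long, "foobar") ? "match\n" : "nomatch\n";
```

Both helpers give the same answer; any speed difference on inputs like $long would have to be measured with Benchmark rather than assumed.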


        Jeff japhy Pinyan, P.L., P.M., P.O.D, X.S.: Perl, regex, and perl hacker
        How can we ever be the sold short or the cheated, we who for every service have long ago been overpaid? ~~ Meister Eckhart
Re: Regex Start Anchor with variables
by chrism01 (Friar) on Oct 05, 2006 at 06:05 UTC
    Hey guys,
    Thx for that. Most examples use hardcoded regexes (for ease of reading I guess), so I couldn't figure it out.
    Looks like \Q was what I needed.
    Chris
      \Q or quotemeta is indeed what you needed.

      But your problem has nothing to do with hardcoded regexes vs. regexes in variables. As Samy_rio correctly pointed out, the problem is that '.' has a special meaning in a regex, which you don't want in your case. Therefore you need to escape it: either by putting a '\' in front of each one, which works fine for the hardcoded case but is a bit more complicated for variables, or by using quotemeta (or \Q), which works in both cases.
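To illustrate the two routes for a string held in a variable, here is a short sketch. The variable names $escaped and $by_hand are invented for this example.

```perl
use strict;
use warnings;

my $str2 = ".1.2.3.4";

# quotemeta applies the same escaping that \Q performs inside a regex.
my $escaped = quotemeta($str2);
print "$escaped\n";

# Escaping by hand means backslashing each dot yourself, and this
# version only handles dots, not other metacharacters.
(my $by_hand = $str2) =~ s/\./\\./g;
print "$by_hand\n";
```

For this particular string both approaches produce the same pattern text; quotemeta is the safer choice when the variable may contain other metacharacters.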

      -- Hofmator

      Code written by Hofmator and posted on PerlMonks is public domain. It is provided as is with no warranties, express or implied, of any kind. Posted code may not have been tested. Use of posted code is at your own risk.