http://qs1969.pair.com?node_id=576517


in reply to Regex Start Anchor with variables

Even faster than index() or a regex -- I would imagine -- would be substr() and eq together.
if (substr($large, 0, length($small)) eq $small) { ... }
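
For reuse, the same test can be wrapped in a small function. This is only a sketch: the name starts_with is hypothetical and does not come from the thread.

```perl
use strict;
use warnings;

# Hypothetical helper (name not from the thread): true if $large begins with $small.
sub starts_with {
    my ($large, $small) = @_;
    # Extract at most length($small) characters from the front of $large
    # and compare them literally; no pattern is compiled.
    return substr($large, 0, length $small) eq $small;
}

print starts_with('foobarbazqux', 'foobar') ? "prefix\n" : "not a prefix\n";
print starts_with('foobarbazqux', 'bar')    ? "prefix\n" : "not a prefix\n";
```

If $small is longer than $large, substr() simply returns all of $large, so the comparison fails without a warning.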

Jeff japhy Pinyan, P.L., P.M., P.O.D, X.S.: Perl, regex, and perl hacker
How can we ever be the sold short or the cheated, we who for every service have long ago been overpaid? ~~ Meister Eckhart

Re^2: Regex Start Anchor with variables
by Hue-Bond (Priest) on Oct 05, 2006 at 13:26 UTC
    Even faster than index() or a regex -- I would imagine -- would be substr() and eq together.

    Why imagine when you can Benchmark?

    use Benchmark 'cmpthese';

    my $str1 = 'foobarbazqux';
    my $str2 = 'foobar';
    my $str2len = length $str2;

    cmpthese (3e6, {
        regex   => sub { $str1 =~ /\Q$str2/; },
        regex2  => sub { $str1 =~ /\Q$str2/o; },  ## probably useless test, see japhy's comment below
        index   => sub { index $str1, $str2; },
        substr  => sub { $str2 eq substr $str1, 0, length $str2; },
        substr2 => sub { $str2 eq substr $str1, 0, $str2len; },
    });

    __END__
                 Rate   regex  regex2  substr substr2   index
    regex   1063830/s      --    -13%    -25%    -43%    -72%
    regex2  1229508/s     16%      --    -14%    -34%    -68%
    substr  1421801/s     34%     16%      --    -23%    -63%
    substr2 1851852/s     74%     51%     30%      --    -51%
    index   3797468/s    257%    209%    167%    105%      --

    The index solution is clearly superior to the others. It's interesting to see the improvements of regex2 and substr2 over their unoptimized counterparts.

    Update: I forgot that the OP was matching at the beginning of the string. This is a better benchmark:

    use Benchmark 'cmpthese';

    my $str1 = 'foobarbazqux';
    my $str2 = 'foobar';
    my $str2len = length $str2;

    cmpthese (3e6, {
        regex   => sub { $str1 =~ /^\Q$str2/; },
        index   => sub { 0 == index $str1, $str2; },
        substr  => sub { $str2 eq substr $str1, 0, length $str2; },
        substr2 => sub { $str2 eq substr $str1, 0, $str2len; },
    });

    __END__
                 Rate   regex  substr substr2   index
    regex    671141/s      --    -59%    -65%    -71%
    substr  1630435/s    143%      --    -16%    -30%
    substr2 1935484/s    188%     19%      --    -17%
    index   2343750/s    249%     44%     21%      --

    --
    David Serrano

      I wouldn't bring /o into the equation. It makes a regex "sterile" which wouldn't be helpful if this were in a function where the strings are arguments.
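
      To illustrate the "sterile" point, here is a minimal sketch. The function name starts_with_o is hypothetical. It assumes the documented /o behaviour: the interpolated variable is read only the first time the match is executed, and the compiled pattern is reused afterwards.

```perl
use strict;
use warnings;

# Hypothetical function (name not from the thread) that takes both strings
# as arguments but matches with /o.
sub starts_with_o {
    my ($large, $small) = @_;
    # /o interpolates $small only the first time this match runs;
    # every later call reuses that first pattern.
    return $large =~ /^\Q$small/o ? 1 : 0;
}

# The first call fixes the pattern to /^foo/.
print "first call:  ", starts_with_o('foobar', 'foo'), "\n";
# 'barfoo' does start with 'bar', but the match still tests against /^foo/.
print "second call: ", starts_with_o('barfoo', 'bar'), "\n";
```

      Dropping the /o makes the function look at its current arguments on every call.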

      You haven't provided any data for failing cases. The further the smaller string is from the beginning of the bigger string, the slower index() will be.
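
      A failing case could be benchmarked along these lines. This is a sketch only: the variable names, the 10,000-character haystack, and the one-CPU-second run time are assumptions, not data from the thread, and no results are claimed here.

```perl
use strict;
use warnings;
use Benchmark 'cmpthese';

# Assumed test data: the needle occurs only at the very end of a long
# haystack, so the "does it start with the needle?" test always fails.
my $needle   = 'foobar';
my $haystack = ('x' x 10_000) . $needle;
my $len      = length $needle;

my %tests = (
    regex  => sub { $haystack =~ /^\Q$needle/ },
    # index() searches forward until it finds the needle, even though
    # only position 0 is of interest.
    index  => sub { 0 == index($haystack, $needle) },
    substr => sub { $needle eq substr($haystack, 0, $len) },
);

# A negative count asks Benchmark to run each test for at least that many
# CPU seconds instead of a fixed number of iterations.
cmpthese(-1, \%tests);
```

      Changing the repeat count in $haystack shows how each approach scales with the distance to the first occurrence.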


      Jeff japhy Pinyan, P.L., P.M., P.O.D, X.S.: Perl, regex, and perl hacker
      How can we ever be the sold short or the cheated, we who for every service have long ago been overpaid? ~~ Meister Eckhart