narainhere: Hi monks, I would like to do a regex on a string, to see if it matches any of the strings in a array

After reading the solutions here, I was tempted to
post a second contribution, so here it is.

The topic is interesting because there are so many
ways to solve it. My question was then:
what are they worth in real code? Taking the OP
at his word ("to see if it matches any of the strings in a array"),
it's clear that the grep/map solutions do too much here:
they test every element, even after a match has been found.
But what about the regex-only solution?
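
To make the "do too much" point concrete, here is a minimal sketch that counts how many tests each approach performs. The counter variables are my own additions for illustration, and the short `$str` is a stand-in for the benchmark string.

```perl
use strict;
use warnings;
use List::Util qw(first);

my @arr = qw(cool guy here);
my $str = 'I am cool';

# Count how often each approach evaluates its test expression.
my ($grep_calls, $first_calls) = (0, 0);

# grep in scalar context returns the number of matching elements,
# so it has to look at every element of @arr.
my $grep_hits = grep { ++$grep_calls; index($str, $_) != -1 } @arr;

# first returns the first matching element and stops there.
my $first_hit = first { ++$first_calls; index($str, $_) != -1 } @arr;

print "grep:  $grep_calls tests, $grep_hits hit\n";
print "first: $first_calls test, hit: $first_hit\n";
```

This prints `grep:  3 tests, 1 hit` and `first: 1 test, hit: cool`. The gap only shows when a match sits early in the list. If nothing matches, both walk the whole array.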

Here's my first attempt at finding out:
    use strict;
    use warnings;
    use List::Util qw'first reduce';
    use Benchmark qw'cmpthese timethese';

    my @arr = qw' cool guy here ' x 1;
    my $str = '100 WORDS ' x 50 . 'I am cool';
    my @invocation = (0) x 5;   # one match counter per candidate

    my $results = timethese(0, {
        # one regex: all words joined into an alternation
        'word_altn' => sub {
            local $" = '|';
            if ( $str =~ /@arr/ ) {
                ++$invocation[0];
                # print "Matched\n"
            }
        },
        # grep with a block, literal substring test
        'block_grep' => sub {
            if ( grep { index($str, $_) != -1 } @arr ) {
                ++$invocation[1];
                # print "Matched\n"
            }
        },
        # grep with an expression, literal substring test
        'expr_grep' => sub {
            if ( grep index($str, $_) != -1, @arr ) {
                ++$invocation[2];
                # print "Matched\n"
            }
        },
        # first() stops at the first hit; no comma after the block
        'list_util_index' => sub {
            if ( first { index($str, $_) != -1 } @arr ) {
                ++$invocation[3];
                # print "Matched\n"
            }
        },
        'list_util_regex' => sub {
            if ( first { $str =~ /$_/ } @arr ) {
                ++$invocation[4];
                # print "Matched\n"
            }
        },
    }, 'none' );

    print "@invocation\n";
    # make sure every candidate actually matched at least once
    die unless 5 == grep $_>0, @invocation;
    cmpthese $results;
On my Linux box (Perl 5.8.8), this produces:
    42598 1157019 1349175 990717 1135313
                        Rate word_altn list_util_index list_util_regex block_grep expr_grep
    word_altn        10213/s        --            -96%            -97%       -97%      -97%
    list_util_index 274237/s     2585%              --             -8%       -16%      -20%
    list_util_regex 299705/s     2835%              9%              --        -8%      -13%
    block_grep      325087/s     3083%             19%              8%         --       -5%
    expr_grep       342754/s     3256%             25%             14%         5%        --
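
One thing to keep in mind when comparing these rows: word_altn interpolates the words as regex source, while the index-based candidates compare them literally. Below is a sketch of an alternation that treats the words literally and is compiled once. The names `$alternation` and `$any_word_re` and the extra word `c++` are illustrative assumptions, not part of the benchmark. Whether precompiling changes the timings would have to be measured.

```perl
use strict;
use warnings;

my @arr = qw(cool guy here c++);   # 'c++' added only to show the escaping
my $str = '100 WORDS ' x 50 . 'I am cool';

# Escape each word so regex metacharacters match literally,
# then compile the alternation a single time.
my $alternation = join '|', map { quotemeta($_) } @arr;
my $any_word_re = qr/$alternation/;

print "Matched\n" if $str =~ $any_word_re;
```

With the escaping in place, `c++` only matches the literal text "c++", which is what the `index` variants would do as well.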
With larger strings *or* larger comparison word arrays,
the List::Util-based solutions will tend to win
by a margin.
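
For anyone who wants to check this on their own machine, here is a sketch that turns the two hard-coded repetition counts from the benchmark into variables. `$word_repeats`, `$filler_repeats` and the `%candidates` hash are my own names, and the values are arbitrary. No results are claimed here, since the numbers depend on the Perl version and the data.

```perl
use strict;
use warnings;
use List::Util qw(first);
use Benchmark qw(cmpthese);

# Assumed knobs: the benchmark above hard-codes 1 and 50.
my $word_repeats   = 100;
my $filler_repeats = 500;

my @arr = (qw(cool guy here)) x $word_repeats;
my $str = '100 WORDS ' x $filler_repeats . 'I am cool';

# Each candidate returns a true value when any word is found in $str.
my %candidates = (
    word_altn => sub {
        local $" = '|';
        return $str =~ /@arr/ ? 1 : 0;
    },
    expr_grep => sub {
        return scalar grep index($str, $_) != -1, @arr;
    },
    list_util_index => sub {
        return defined(first { index($str, $_) != -1 } @arr) ? 1 : 0;
    },
);

# Run each candidate for at least one CPU second and print the chart.
cmpthese(-1, \%candidates);
```

Because 'cool' is the first word and does occur in the string, `first` can stop immediately here. Moving the matching word to the end of `@arr` is worth trying as a counter-check.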

Regards

mwa

In reply to Re: Mathching an array in regex by mwah
in thread Mathching an array in regex by narainhere
