in reply to Pattern match array

You probably want to use word boundary anchors so that you don't get false positives with fprintf when looking for int etc. Also you might have more than one of your @prims on one line so a global match could be in order.

use strict;
use warnings;

my @prims = qw{ int char long double static };
my $rxPrims = do {
    local $" = q{|};
    qr{\b(@prims)\b};
};

while ( <> ) {
    next unless my @found = m{$rxPrims}g;
    print qq{Found @found on line $.\n};
}
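To make the false-positive point concrete, here is a small sketch that runs the same alternation with and without the \b anchors. The sample line and the names $rxLoose, $rxAnchored, @loose and @anchored are made up for illustration and are not part of the original question.

```perl
use strict;
use warnings;

my @prims = qw{ int char long double static };

# Build the alternation once, then wrap it with and without \b anchors.
my $alternation = join q{|}, @prims;
my $rxLoose     = qr{($alternation)};
my $rxAnchored  = qr{\b($alternation)\b};

# Hypothetical line of C source, used only for this demonstration.
my $line = q{int n = fprintf(out, fmt, x);};

# A global match in list context returns every captured keyword.
my @loose    = $line =~ m{$rxLoose}g;
my @anchored = $line =~ m{$rxAnchored}g;

print qq{Unanchored: @loose\n};
print qq{Anchored:   @anchored\n};
```

On this sample the unanchored pattern reports int twice, because it also matches inside fprintf, while the anchored pattern reports it once.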

I hope this is useful.

Cheers,

JohnGG

Re^2: Pattern match array
by Narveson (Chaplain) on May 08, 2008 at 06:05 UTC
    You probably want to use word boundary anchors so that you don't get false positives with fprintf when looking for int etc.

    I agree. In other words, break the text into words before attempting to match entire words. So all the regex engine is doing is replicating an inner for-loop with an eq test.

    while ( <> ) {
        for my $word ( m{\b(\w+)\b}g ) {
            if (grep {$word eq $_} @prims) {
                print qq{Found $word on line $.\n};
            }
        }
    }

    Update: which, of course, vindicates thezip's hash-based solution in the first response to this question.

      Letting the regex alternation do the heavy lifting seems to be a bit faster. Tested with a 1235 line C program cat'ed together 20 times.

      use strict;
      use warnings;

      use Benchmark q{cmpthese};

      my @prims = qw{ int char long double static };

      my $inFile = q{xxx.c};
      open my $inFH, q{<}, $inFile
          or die qq{open: $inFile: $!\n};
      my $outFile = q{/dev/null};
      open my $outFH, q{>}, $outFile
          or die qq{open: $outFile: $!\n};

      cmpthese( -10, {
          JohnGG => sub {
              seek $inFH, 0, 0;
              my $rxPrims = do {
                  local $" = q{|};
                  qr{\b(@prims)\b};
              };
              while ( <$inFH> ) {
                  next unless my @found = m{$rxPrims}g;
                  print $outFH qq{Found @found on line $.\n};
              }
          },
          Narveson => sub {
              seek $inFH, 0, 0;
              while ( <$inFH> ) {
                  for my $word ( m{\b(\w+)\b}g ) {
                      if (grep {$word eq $_} @prims) {
                          print $outFH qq{Found $word on line $.\n};
                      }
                  }
              }
          },
      } );

      close $inFH
          or die qq{close: $inFile: $!\n};
      close $outFH
          or die qq{close: $outFile: $!\n};

      The benchmark output.

                 Rate Narveson   JohnGG
      Narveson 1.39/s       --     -63%
      JohnGG   3.78/s     173%       --

      I hope this is of interest.

      Cheers,

      JohnGG

        Thanks, this is of interest.

        I think if I had known benchmarks would be run, I would have hashed instead of grepping.

        my %sought = map {$_ => 1} @prims;

        and later

        if ($sought{$word})

        And then there's List::MoreUtils::any, which would at least quit on the first match instead of checking the rest of the list.
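        A rough sketch of both ideas side by side, in case it helps. Assumptions: the sample lines are invented, @byHash and @byAny are names made up here, and any is imported from List::Util (which exports it from version 1.33 onward) rather than from List::MoreUtils, only so that the snippet runs on a stock Perl.

        ```perl
        use strict;
        use warnings;
        use List::Util qw{ any };    # any() needs List::Util 1.33 or later

        my @prims  = qw{ int char long double static };
        my %sought = map { $_ => 1 } @prims;

        # Hypothetical sample input, standing in for the lines read from <>.
        my @lines = (
            q{static int count;},
            q{fprintf(stderr, fmt);},
            q{long double ratio;},
        );

        my ( @byHash, @byAny );
        my $lineNo = 0;
        for my $line ( @lines ) {
            $lineNo++;
            for my $word ( $line =~ m{\b(\w+)\b}g ) {
                # Hash lookup: a single probe per word, however long @prims is.
                push @byHash, qq{$word:$lineNo} if $sought{$word};

                # any() stops at the first keyword that matches.
                push @byAny, qq{$word:$lineNo} if any { $word eq $_ } @prims;
            }
        }

        print qq{Found $_\n} for @byHash;
        ```

        Both tests pick out the same words; the difference is only in how much work each one does per word.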