in reply to Alternative matches

If you have a lot of terms to search for, you can probably speed things up by creating a lowercase copy of the text and searching it case-sensitively.
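
A minimal sketch of the idea, assuming the search terms are already lowercase. The term list and sample text here are made up for illustration and are not from the original question:

```perl
use strict;
use warnings;

# Hypothetical sample data, for illustration only.
my @terms = qw(new old number start);
my $text  = 'A New record replaces the OLD one at the Start.';

# Lowercase the text once, then reuse the copy for every search.
my $lower = lc $text;

# Case-sensitive match of each (lowercase) term against the lowercase copy.
# quotemeta guards against regex metacharacters in a term.
my @found = grep { my $t = quotemeta($_); $lower =~ /$t/ } @terms;

print "found: @found\n";
```

The point is that the cost of `lc` is paid once per text, while each of the many term searches can then run without the `/i` flag.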

Re^2: Alternative matches
by TedPride (Priest) on Oct 06, 2004 at 09:49 UTC
    I tested this just to make sure, and searching a string of letters 25K long 1000 times, there was a savings of approx 2 seconds (21 vs 23) by making a lowercase copy. The same difference could also be seen searching 5K text 5000 times.
      Thanks, I shall remember that :)
      I tested this just to make sure . . . there was a savings of approx 2 seconds (21 vs 23)

      Usually we like to see Benchmark code to back up this type of assertion. Using PodMaster's setup (i.e. generating a random $string to test against), I get the following results, which seem to validate yours:

      $ perl match
             Rate orig   lc
      orig 1492/s   --  -7%
      lc   1608/s   8%   --

      $ perl match
             Rate orig   lc
      orig 1506/s   --  -7%
      lc   1620/s   8%   --

      $ perl match
             Rate orig   lc
      orig 1492/s   --  -8%
      lc   1627/s   9%   --

      And, in case you want to test this for yourself (or modify it to be more accurate to what you're doing), here's the code:

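      use strict;
      use Benchmark qw(cmpthese);

      # Alternation of lowercase terms to search for.
      my $pattern = '(new|old|number|start|simple|cross|heavy|die|exit)';
      my @words   = qw( new Old Number start Simple Cross heavy die Exit );

      # Random test string: 60 random numbers followed by 20 mixed-case
      # words, joined with a filler separator.
      my $string = join ' 0\4/f ',
          map( { rand $_ } 1 .. 60 ),
          map { $words[ rand @words ] } 1 .. 20;

      cmpthese( -2, {
          # Case-insensitive match against the original string.
          orig => sub { $string =~ /$pattern/i },
          # Case-sensitive match against a lowercased copy. The parentheses
          # matter: `lc $string =~ /$pattern/` parses as
          # lc( $string =~ /$pattern/ ), because =~ binds tighter than lc.
          lc   => sub { lc($string) =~ /$pattern/ },
      } );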
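
      The original suggestion was to make the lowercase copy once and reuse it, so it may be worth measuring that case separately. The following is only a sketch under that assumption: the `lc_each` and `lc_once` labels, the simplified test string, and the precompiled `qr//` patterns are my own choices, not part of the benchmark above, and no timings are claimed for it.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

my $pattern = '(new|old|number|start|simple|cross|heavy|die|exit)';
my @words   = qw( new Old Number start Simple Cross heavy die Exit );

# Simplified random test string (assumption: 200 mixed-case words).
my $string = join ' ', map { $words[ rand @words ] } 1 .. 200;

# Lowercase copy made once, outside the timed code.
my $lower = lc $string;

# Precompile both patterns so pattern compilation is not part of the timing.
my $re_i = qr/$pattern/i;
my $re   = qr/$pattern/;

cmpthese( -1, {
    # Case-insensitive match on the original string.
    orig    => sub { $string =~ $re_i },
    # Lowercase on every call, then match case-sensitively.
    # lc() needs parentheses here because =~ binds tighter than lc.
    lc_each => sub { lc($string) =~ $re },
    # Match case-sensitively against the copy made once.
    lc_once => sub { $lower =~ $re },
} );
```

      Comparing `lc_each` with `lc_once` shows how much of any difference comes from the copy itself rather than from dropping `/i`.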