in reply to Benchmarking "Are all these characters in this sentence?"

unpack()ing the letters, instead of split()ting them, may boost the performance slightly:
sub RMGir_index {
    my ($sentence, $wantedLetters) = @_;
    ## comment removed ##
    my $foundLetters = scalar(grep index($sentence, $_) >= $[, unpack "(a)*", $wantedLetters);
    return length($wantedLetters) == $foundLetters;
}
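For anyone who has not met the "(a)*" template before, here is a minimal sketch comparing the two ways of breaking a string into letters. The sample string and the variable names @via_split and @via_unpack are made up for illustration, and this only shows that the two lists match for plain ASCII input; it says nothing about speed.

```perl
use strict;
use warnings;

my $wantedLetters = "abc";    # hypothetical sample input

# split with an empty pattern returns one element per character
my @via_split = split //, $wantedLetters;

# "(a)*" repeats the single-character 'a' template until the string is used up
my @via_unpack = unpack "(a)*", $wantedLetters;

print "split:  @via_split\n";
print "unpack: @via_unpack\n";
```

Both lists can then be fed to grep exactly as in the sub above.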

Re^2: Benchmarking "Are all these characters in this sentence?" (/./gs)
by tye (Sage) on Aug 30, 2008 at 15:49 UTC

    Rather than change the tiny operation used to generate a list of letters, just avoid generating a list of letters at all:

    sub tye2 {
        my( $sentence, $wantedLetters )= @_;
        while( $wantedLetters =~ /(.)/gs ) {
            return 0 if -1 == index($sentence,$1);
        }
        return 1;
    }

    If you want to go for ugly code for the sake of micro-optimizations, then:

    sub tye1 {
        my( $sentence, $wantedLetters )= @_;
        -1 == index($sentence,$1) && return 0
            while( $wantedLetters =~ /(.)/gs );
        return 1;
    }

    Or get even uglier to the point of risking improper behavior in some cases:

    sub tye0 {
        -1 == index( $_[0], $1 ) && return 0
            while( $_[1] =~ /(.)/gs );
        return 1;
    }
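One case where tye0 can plausibly misbehave (the post does not spell out which cases are meant, so this is a guess): @_ aliases the caller's variables, so the /g match keeps its pos() on the caller's own scalar. If the sub returns early, that position is not reset, and the next call with the same variable resumes the scan part-way through. A minimal sketch, with made-up sample data in $sentence and $wanted:

```perl
use strict;
use warnings;

# tye0 as posted: it works directly on @_, which aliases the caller's variables
sub tye0 {
    -1 == index( $_[0], $1 ) && return 0
        while( $_[1] =~ /(.)/gs );
    return 1;
}

my $sentence = "ab";     # hypothetical sample data
my $wanted   = "xab";    # 'x' does not occur in $sentence

# First call stops at 'x' and returns early, leaving pos($wanted) set
my $first = tye0( $sentence, $wanted );

# Second call on the same variable resumes after 'x' and never re-checks it
my $second = tye0( $sentence, $wanted );

print "first: $first, second: $second\n";
```

The first call returns 0 as expected, but the second returns 1 even though 'x' is still missing. tye1 and tye2 avoid this because they copy the arguments into fresh lexicals on every call, so each scan starts from the beginning.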

    - tye        

      Very nice!

      Except for the "VeryLong" test case, one of those wins all the other benchmarks. For the short charset cases, tye0 wins, while for the others, tye1 and tye2 are tied :) and slightly better than tye0.

      And on the VeryLong case, repellent's unpack wins, but the second-place results look like a wash between several of the approaches, and yours are quite competitive.

      I've updated the parent node with your subs, thanks!


      Mike
Re^2: Benchmarking "Are all these characters in this sentence?"
by RMGir (Prior) on Aug 30, 2008 at 14:42 UTC
    repellent++ !!

    I never thought of that.

    Happily, JUST before I saw that you posted that, I posted a new version of the code that makes adding variants easy, so I added your variant, as well as variants using unpack + List::MoreUtils, including a new variant from tassilo.

    Your unpack approach wins!! Thanks for pointing that option out...

    The parent post is updated with the new code and the new results.


    Mike