in reply to Re^3: Simple Matching
in thread Simple Matching

Thanks again, Putting the eval into Method 2 below is (much) slower than any other match (as noted by BrowserUK). Brute force (Method 3) is significantly faster than either 1 or 2, but does not let me loop through an array to check match counts. I have tried "grep -c" within perl and it works, but is not as fast as brute force. Is there a faster way? After 15+ years of Perl I have not needed to worry about cutting edge optimizing (in Perl) ... ergo my ignorance. Thanks for help again.
my $flush = 0; my @suits = ( "D", "S", "C", "H" ); my $long_hand = ("2H 3D 4C 5S 5H)"; # Method 1: 43 microseconds # (My initial code ... seemed slow ... ergo benchmark) foreach ( @suits ) { @m = ( $long_hand =~ m/$_/g ); $flush = 1 if ($#m eq 4 ); } # Method 2: 126 microseconds (definitely bottleneck) foreach ( @suits ) { my $count = ( eval "\$long_hand =~ tr/$_//" ); $flush = 1 if ($count eq 5 ); } #Method 3: accounts for only 3(!) microseconds of code # Ugly but functional for very few patterns to check # BUT what if I had 10k patterns to check $flush = 1 if ( $long_hand =~ tr/D// eq 5); $flush = 1 if ( $long_hand =~ tr/S// eq 5); $flush = 1 if ( $long_hand =~ tr/C// eq 5); $flush = 1 if ( $long_hand =~ tr/H// eq 5);

Replies are listed 'Best First'.
Re^5: Simple Matching
by AnomalousMonk (Archbishop) on Sep 28, 2009 at 12:42 UTC
    How does this compare speed-wise:
    >perl -wMstrict -le "my @suits = qw(D S C H); my $suits = join '', @suits; my $flush = qr{ ([$suits]) (?: [^$suits]+ \1){4} }xms; print $_, $_ =~ $flush ? ' flush' : ' nope' for '2H 3D 4C 5S 5H', '1H 2H 3H 4H 5H'; " 2H 3D 4C 5S 5H nope 1H 2H 3H 4H 5H flush
    Update: Might be faster if possessive:
    >perl -wMstrict -le "my @suits = qw(D S C H); my $suits = join '', @suits; my $flush = qr{ ([$suits]) (?: (?> [^$suits]+) \1){4} }xms; print $_, $_ =~ $flush ? ' flush' : ' nope' for '2H 3D 4C 5S 5H', '1H 2H 3H 4H 5H'; " 2H 3D 4C 5S 5H nope 1H 2H 3H 4H 5H flush
    5.10 has possessive modifier:
    >perl -wMstrict -le "my @suits = qw(D S C H); my $suits = join '', @suits; my $flush = qr{ ([$suits]) (?: [^$suits]++ \1){4} }xms; print $_, $_ =~ $flush ? ' flush' : ' nope' for '2H 3D 4C 5S 5H', '1H 2H 3H 4H 5H'; " 2H 3D 4C 5S 5H nope 1H 2H 3H 4H 5H flush
    Bugfix: Changed  (?> [^$suits])+ to  (?> [^$suits]+) per ikegami Re^6: Simple Matching.
      (?>[^$suits]) is the same as [^$suits]. The + should be inside the (?>).

      Captures are slow, so it would be faster to build a regex without captures I also generalised some of the matching. Input validation is not a goal here.

      m{ ^ . (?: D (?: . .D ){4} | S (?: . .S ){4} | C (?: . .C ){4} | H (?: . .H ){4} ) \z }xs;

      Or maybe even

      m{ ^ . (?: D..D..D..D..D | S..S..S..S..S | C..C..C..C..C | H..H..H..H..H ) \z }xs;
      built it manually here, but it can be built dynamically.

      Update: That last regex pattern gives me an idea:

      my $masked = $_ & "\x00\xFF \x00\xFF \x00\xFF \x00\xFF \x00\xFF"; ( $masked eq "\x00D \x00D \x00D \x00D \x00D" || $masked eq "\x00S \x00S \x00S \x00S \x00S" || $masked eq "\x00C \x00C \x00C \x00C \x00C" || $masked eq "\x00H \x00H \x00H \x00H \x00H" )

      Of course, if the input format was a number, it would be even faster:

      sub build { return ( $_[0] << (16*4 + 2*5 ) ) + ( $_[1] << (16*4 + 2*4 ) ) + ( $_[2] << (16*3 + 2*4 ) ) + ( $_[3] << (16*3 + 2*3 ) ) + ( $_[4] << (16*2 + 2*3 ) ) + ( $_[5] << (16*2 + 2*2 ) ) + ( $_[6] << (16*1 + 2*2 ) ) + ( $_[7] << (16*1 + 2*1 ) ) + ( $_[8] << (16*0 + 2*1 ) ) + ( $_[9] << (16*0 + 2*0 ) ); } use constant { SUITS_MASK => build(0,3, 0,3, 0,3, 0,3, 0,3), ALL_D => build(0,0, 0,0, 0,0, 0,0, 0,0), ALL_S => build(0,1, 0,1, 0,1, 0,1, 0,1), ALL_C => build(0,2, 0,2, 0,2, 0,2, 0,2), ALL_H => build(0,3, 0,3, 0,3, 0,3, 0,3), }; my $suits = $_ & SUITS_MASK; ( $suits == ALL_D || $suits == ALL_S || $suits == ALL_C || $suits == ALL_H )
Re^5: Simple Matching
by BrowserUk (Patriarch) on Sep 29, 2009 at 00:54 UTC

    It is a simple fact of life that if you can hard code a lookup table, it will be faster than building it on-the-fly, or choosing one conditionally.

    If you need best possible speed--and that is a big if--then the choice to store 10,000 lookup tables is a speed -v- memory trade-off that you might choose to make.

    If youreally need absolute best speed, then C or assembler are your only choices. But the surprising thing about Perl--the redeeming feature for all it faults--is that it rarely make you make that choice. Beyond the realms of either highly algorithmically complex; or no-alternative, cpu intensive--Perl's quirks, irregularities and non-orthogonality's are tailored--by experience and practice, over orthodoxy and dogma--to produce best possible results in best possible time.....

    with the proviso--and it is a big one--that you are not afraid to consider perfectionism, passe; correctness. contrived; elitism, irrelevant; and practicality paramount. Beyond the rarefied atmospheres of academic and dogmatic perfection, code that works for the common case. today; is far preferable to either perfection tomorrow.

    Strive for perfection; but don't exclude a solution today, for perfection tomorrow. Tomorrow rarely (I'm pragmatic; "never" is not a part of my vocabulary :) ever comes.


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.