in reply to Question about speeding a regexp count

I need to count the number of occurrences of every possible 1, 2, and 3 letter combination in this sequence.

I'd use substr() and build up a hash as I made one pass through the string.

how I can use a variable in a tr// regexp?

You must use eval.
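
The reason is that tr/// builds its character table at compile time, so a variable inside it is never interpolated. A minimal sketch of the eval approach follows; count_chars is a hypothetical helper name, and the quotemeta call is my own precaution so that characters such as '/', '\' or '-' in the variable cannot change the meaning of the generated code.

```perl
use strict;
use warnings;

# Hypothetical helper: count the characters of $text that appear in $chars.
sub count_chars {
    my ($text, $chars) = @_;

    # Backslash-escape every non-word character so the list is taken literally.
    my $safe = quotemeta $chars;

    # Build the tr/// as source code and compile it at run time.
    # With an empty replacement list, tr/// only counts and leaves $text alone.
    my $count = eval "\$text =~ tr/$safe//";
    die $@ if $@;

    return $count;
}

print count_chars('GATTACA', 'AT'), "\n";    # prints 5
```

If the same character list is reused many times, it is probably worth building a sub with one eval and calling that, rather than running eval on every count.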

Update: Code for your first question:

my $length = length $string;
my %seen;
for my $i (0 .. $length - 1) {
    $seen{ substr($string, $i, 1) }++;
    $seen{ substr($string, $i, 2) }++ if $i < $length - 1;
    $seen{ substr($string, $i, 3) }++ if $i < $length - 2;
}
Assumes your data is in $string.
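
If longer combinations are ever needed, the same single pass can be driven by an inner loop over the lengths. This is only a sketch: count_combinations and $max are hypothetical names, not something from the original question.

```perl
use strict;
use warnings;

# Hypothetical generalization: $max is the longest combination to count.
sub count_combinations {
    my ($string, $max) = @_;
    my $length = length $string;
    my %seen;
    for my $i (0 .. $length - 1) {
        for my $n (1 .. $max) {
            # Stop once a substring of length $n would run past the end.
            last if $i + $n > $length;
            $seen{ substr($string, $i, $n) }++;
        }
    }
    return \%seen;
}

my $seen = count_combinations('ABAB', 3);
print "$_ => $seen->{$_}\n" for sort keys %$seen;
```

Changing the maximum length then means changing one argument rather than adding lines, at the cost of an extra loop per position.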

-sauoq
"My two cents aren't worth a dime.";

Re^2: Question about speeding a regexp count
by ikegami (Patriarch) on Oct 13, 2005 at 19:40 UTC

    This is probably faster:

    my %count;
    my $length = length $seq;
    for my $i (0 .. $length - 3) {
        $count{ substr($seq, $i, 1) }++;
        $count{ substr($seq, $i, 2) }++;
        $count{ substr($seq, $i, 3) }++;
    }
    $count{ substr($seq, $length - 1, 1) }++;
    $count{ substr($seq, $length - 2, 1) }++;
    $count{ substr($seq, $length - 2, 2) }++;

    Renamed %seen to the more appropriate %count. Renamed $string to $seq to match the OP.

      This is probably faster:

      Don't forget "less maintainable."

      How many lines do you have to add or change when someone comes around next week and asks you to look at substring lengths up to 10 characters long?

      -sauoq
      "My two cents aren't worth a dime.";
      
        True. Optimization vs Readability/Maintainability is always an issue. Only the OP is qualified to answer whether he's willing to sacrifice Readability/Maintainability for a 35% increase in speed. (See my benchmarks below.)