in reply to Re: How can I get correct result in counting 3-letter words?
in thread How can I get correct result in counting 3-letter words?
In my benchmarks, substr is about twice as fast as a /..../g regex for getting the next X characters:
cail:~/work/perl/monks$ cat 966488.pl #!/usr/bin/env perl use Modern::Perl; use Benchmark qw(:all); my $string = ''; for (1..1_000_000){ # make a million-char string $string .= qw(A C G T)[rand(4)]; } cmpthese( 100, { 'regex' => \®ex, 'substring' => \&substring, }); sub substring { my $str = $string; my %h; while(length($str) % 3){ # snip to 3-letter boundary substr($str,-1, 1, ''); } while($_ = substr($str,0,3,'')){ $h{$_}++; } } sub regex { my $str = $string; my %h; for ($str =~ /.../g){ $h{$_}++; } } cail:~/work/perl/monks$ perl 966488.pl Rate regex substring regex 5.78/s -- -49% substring 11.4/s 97% --
Of course, if you only want to match certain letters, then you're back to a regex. But in that case, I might still try stripping out all the stuff I don't want with tr//, followed by substr to break it into pieces.
Aaron B.
My Woefully Neglected Blog, where I occasionally mention Perl.
|
|---|
| Replies are listed 'Best First'. | |
|---|---|
|
Re^3: How can I get correct result in counting 3-letter words?
by ikegami (Patriarch) on Apr 23, 2012 at 01:45 UTC | |
by aaron_baugher (Curate) on Apr 23, 2012 at 02:25 UTC |