Ah well, in a fit of boredom I decided to benchmark the submissions myself. The results are unsurprising (at least to me).
Granted the obiwan errors could just stem from counting the positions starting from 1 rather than 0. But when I'm asked to find the position of some character in a string, I try to make the value something that I could give to substr() and get the right answer. substr() uses 0-based positions and so do I.
Anyway, once I got them all conforming to the same interface, I wrote a little routine that would output a "ruler" for a given string and then tested all of the routines against visual inspection of the ruler and each other.
Once I was happy with that, I then wrote the benchmarks to compare looking for several occurences of a single character in a single string and then another set of benchmarks looking for a specific occurence of a single character in multiple strings.
Here's the program:
#!/usr/bin/perl use warnings; use strict; use Benchmark qw/:all/; sub salva { my($string,$char,$n) = @_; my $pat = "[^$char]*$char" x $n; my $where = -1; $string =~ /^($pat)/ and $where = length($1) - 1; return $where; } { my %hash; sub mickeyn { my ($string,$c,$n) = @_; return -1 if $n < 1; return $hash{$string}{$c}[$n-1] || -1 if exists $hash{$string}; my @split = split //, $string; my @positions = map { [$split[$_], $_] } 0 .. $#split; undef @split; foreach (@positions){ push @{$hash{$string}{$_->[0]}}, $_->[1]; } undef @positions ; return $hash{$string}{$c}[$n-1] || -1; } } sub RoyJohnson { my ($str,$char,$n) = @_; $str =~ /(?:.*?$char){$n}/g; return pos($str)-1 if defined pos($str); return -1; } sub find_nth { my ($s,$c,$n) = @_; my $pos = -1; while ($n--) { $pos = index($s,$c,$pos+1); return -1 if $pos == -1; } return $pos; } sub rule { my $s = shift; my $l = int(length($s)/10)+1; my $r1 = '0123456789' x $l; my $r2 = ' ' x $l; $r1 = substr($r1,0,length($s)); substr($r2,$_*10,1) = $_ for 1..length($s)/10; return "$r2\n$r1\n"; } sub test { my ($str,$c) = @_; print rule($str), "$str\n"; print "($c,$_): ", join(" ", find_nth($str,$c,$_), salva($str,$c,$_), mickeyn($str,$c,$_), RoyJohnson($str,$c,$_) ),"\n" for -1..10; } my $str="Now is the time for all good men to come to the aid of their +country"; my $n = 5; my $c = "o"; # test($str,$c); exit; my $time = shift || -5; cmpthese($time, { duff => sub { find_nth($str,$c,$_) for -1..10; }, salva => sub { salva($str,$c,$_) for -1..10; }, mickeyn => sub { mickeyn($str,$c,$_) for -1..10; }, RoyJohnson => sub { RoyJohnson($str,$c,$_) for -1..10; }, }); my @strings = ( "Now is the time for all good men to come to the aid of their country +", "Four score and seven years ago our forefathers brought forth unto th +is country", "How to write an efficient Perl function to return the position of th +e nth occurance of a specified character within a string?", "The rain in spain lies mainly upon the plain", "a" x 200, "a" x 200 . "ooo", "o" x 200, ); cmpthese($time, { duff => sub { find_nth($_,$c,3) for @strings; }, salva => sub { salva($_,$c,3) for @strings; }, mickeyn => sub { mickeyn($_,$c,3) for @strings; }, RoyJohnson => sub { RoyJohnson($_,$c,3) for @strings; }, });
notes: mickeyn's needed to be modified the most and I took a few liberties there so it's not exactly faithful to his original implementation in some respects.
Here are some results on my system; the first set of benchmarks are for multple occurences in one string, the second set is for a single occurence in multiple strings:
Rate RoyJohnson salva duff mickeyn
RoyJohnson 373/s -- -86% -97% -98%
salva 2604/s 598% -- -78% -88%
duff 11887/s 3089% 357% -- -44%
mickeyn 21159/s 5576% 713% 78% --
Rate RoyJohnson salva mickeyn duff
RoyJohnson 15272/s -- -9% -42% -48%
salva 16847/s 10% -- -36% -42%
mickeyn 26298/s 72% 56% -- -10%
duff 29202/s 91% 73% 11% --
In reply to Re^2: The nth occurrence of a character
by duff
in thread The nth occurrence of a character
by Anonymous Monk
| For: | Use: | ||
| & | & | ||
| < | < | ||
| > | > | ||
| [ | [ | ||
| ] | ] |