Ah well, in a fit of boredom I decided to benchmark the submissions myself. The results are unsurprising (at least to me).

The first thing I did was put everyone's code in subroutines that all conformed to the same interface. I quickly discovered that no one tested their code :-). salva has a precedence problem, mickeyn has an off-by-one problem, and Roy Johnson has an off-by-N problem where N happens to be 1 in the case that you're looking for a single character.

Granted the obiwan errors could just stem from counting the positions starting from 1 rather than 0. But when I'm asked to find the position of some character in a string, I try to make the value something that I could give to substr() and get the right answer. substr() uses 0-based positions and so do I.

Anyway, once I got them all conforming to the same interface, I wrote a little routine that would output a "ruler" for a given string and then tested all of the routines against visual inspection of the ruler and each other.

Once I was happy with that, I then wrote the benchmarks to compare looking for several occurences of a single character in a single string and then another set of benchmarks looking for a specific occurence of a single character in multiple strings.

Here's the program:

#!/usr/bin/perl use warnings; use strict; use Benchmark qw/:all/; sub salva { my($string,$char,$n) = @_; my $pat = "[^$char]*$char" x $n; my $where = -1; $string =~ /^($pat)/ and $where = length($1) - 1; return $where; } { my %hash; sub mickeyn { my ($string,$c,$n) = @_; return -1 if $n < 1; return $hash{$string}{$c}[$n-1] || -1 if exists $hash{$string}; my @split = split //, $string; my @positions = map { [$split[$_], $_] } 0 .. $#split; undef @split; foreach (@positions){ push @{$hash{$string}{$_->[0]}}, $_->[1]; } undef @positions ; return $hash{$string}{$c}[$n-1] || -1; } } sub RoyJohnson { my ($str,$char,$n) = @_; $str =~ /(?:.*?$char){$n}/g; return pos($str)-1 if defined pos($str); return -1; } sub find_nth { my ($s,$c,$n) = @_; my $pos = -1; while ($n--) { $pos = index($s,$c,$pos+1); return -1 if $pos == -1; } return $pos; } sub rule { my $s = shift; my $l = int(length($s)/10)+1; my $r1 = '0123456789' x $l; my $r2 = ' ' x $l; $r1 = substr($r1,0,length($s)); substr($r2,$_*10,1) = $_ for 1..length($s)/10; return "$r2\n$r1\n"; } sub test { my ($str,$c) = @_; print rule($str), "$str\n"; print "($c,$_): ", join(" ", find_nth($str,$c,$_), salva($str,$c,$_), mickeyn($str,$c,$_), RoyJohnson($str,$c,$_) ),"\n" for -1..10; } my $str="Now is the time for all good men to come to the aid of their +country"; my $n = 5; my $c = "o"; # test($str,$c); exit; my $time = shift || -5; cmpthese($time, { duff => sub { find_nth($str,$c,$_) for -1..10; }, salva => sub { salva($str,$c,$_) for -1..10; }, mickeyn => sub { mickeyn($str,$c,$_) for -1..10; }, RoyJohnson => sub { RoyJohnson($str,$c,$_) for -1..10; }, }); my @strings = ( "Now is the time for all good men to come to the aid of their country +", "Four score and seven years ago our forefathers brought forth unto th +is country", "How to write an efficient Perl function to return the position of th +e nth occurance of a specified character within a string?", "The rain in spain lies mainly upon the plain", "a" x 200, "a" x 200 . "ooo", "o" x 200, ); cmpthese($time, { duff => sub { find_nth($_,$c,3) for @strings; }, salva => sub { salva($_,$c,3) for @strings; }, mickeyn => sub { mickeyn($_,$c,3) for @strings; }, RoyJohnson => sub { RoyJohnson($_,$c,3) for @strings; }, });

notes: mickeyn's needed to be modified the most and I took a few liberties there so it's not exactly faithful to his original implementation in some respects.

Here are some results on my system; the first set of benchmarks are for multple occurences in one string, the second set is for a single occurence in multiple strings:

              Rate RoyJohnson      salva       duff    mickeyn
RoyJohnson   373/s         --       -86%       -97%       -98%
salva       2604/s       598%         --       -78%       -88%
duff       11887/s      3089%       357%         --       -44%
mickeyn    21159/s      5576%       713%        78%         --
              Rate RoyJohnson      salva    mickeyn       duff
RoyJohnson 15272/s         --        -9%       -42%       -48%
salva      16847/s        10%         --       -36%       -42%
mickeyn    26298/s        72%        56%         --       -10%
duff       29202/s        91%        73%        11%         --

In reply to Re^2: The nth occurrence of a character by duff
in thread The nth occurrence of a character by Anonymous Monk

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post, it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.