Re^9: Speed Improvement (Is 311 times faster: "insignificant"?)

I was a bit surprised that I didn't see anybody benchmark the original code

I did. If you look back at my first post, you'll see nar_func listed in the benchmark results.

So I'll take a wild guess ...

Why guess? Do some work.

It should be fairly simple to show ...

Go for it.

on strings similar to those actually specified in the root node.

Perhaps those strings are ... oh, I don't know ... just short, non-typical examples. You know, in keeping with the typical requests for "some short examples to demonstrate the problem".

The bottom line is that the scope for optimisation is clearly shown in the OP's code:

sub nar_substitute {
    my @numeric_chars = ( 0 .. 9 );
    my $message = shift;
    my @numeric_matches = ($message =~ m/\{\\d\d+\}/g);
    foreach (@numeric_matches) {
        my $substitution = $_;
        my ($substitution_count) = ($substitution =~ m/(\d+)/);
        my $number = '';
        for (1..$substitution_count) {
            $number .= $numeric_chars[int rand @numeric_chars];
        }
        $message =~ s[\Q$substitution][$number]e;
    }
    return $message;
}

Two nested, Perl-level loops; three calls into the regex engine per substitution; and N calls to rand for each N-digit substitution.

All of which can be, and was, replaced by a single, C-level loop (/g); a single call to the regex engine per file; and a single call to rand per substitution.
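For anyone who wants to see the shape of that replacement, here is a minimal sketch. It is not the code from the benchmark; the name single_pass_substitute is made up for illustration, and it assumes the placeholders look like {\dN} (as the OP's regex implies) with N small enough that 10**N fits comfortably within the precision of rand. For long N the low-order digits would not be properly random and the number would need building in chunks.

```perl
use strict;
use warnings;

# Hypothetical name, for illustration only; not the benchmarked function.
sub single_pass_substitute {
    my $message = shift;

    # One s///ge: the regex engine scans the string once, and each
    # {\dN} placeholder costs a single rand call. sprintf's '*' takes
    # the field width from $1, so the result is zero-padded to N digits.
    # Assumption: N is small (roughly 14 or fewer digits).
    $message =~ s/\{\\d(\d+)\}/sprintf( '%0*d', $1, int( rand( 10 ** $1 ) ) )/ge;

    return $message;
}

print single_pass_substitute( 'id={\d4} pin={\d6}' ), "\n";
```

The digits differ on every run, but the output always has a 4-digit and a 6-digit field, and strings without placeholders come back unchanged.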

'Tain't rocket science.

With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'

Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.

"Science is about questioning the status quo. Questioning authority".

In the absence of evidence, opinion is indistinguishable from prejudice.
