in reply to Re^6: Speed Improvement (insignificant)
in thread Speed Improvement

I will be genuinely shocked if somebody produces something useful that uses this operation where the reported up-to 300% performance improvement actually leads to a noticeable change in total script run-time (where "noticeable" means "over 20%").

The lines being modified are obviously some kind of templating system. For what? We can only guess, but perhaps some kind of scientific Monte Carlo simulation runs... At some point, the templated random values have to be populated with actual numbers.

To that end, I generated 100 files of template text averaging 500k apiece.
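
A minimal sketch of one such generator — the {\dN} placeholder density, the digit-count range, and the templ*.txt naming here are assumptions, chosen to fit the benchmark below — might look like this:

#! perl -lw
## Sketch only: writes 100 files of roughly 500k apiece, each a mix of
## random 8-letter filler words and {\dN} placeholders with N in 1..9.
use strict;

for my $n ( 0 .. 99 ) {
    open my $out, '>', sprintf( 'templ%02d.txt', $n ) or die $!;
    my $target = 450_000 + int rand 100_000;    ## average ~500k per file
    my $size = 0;
    while( $size < $target ) {
        ## roughly one placeholder per five tokens; 10 tokens per line
        my $line = join ' ', map {
            rand() < 0.2
                ? sprintf( '{\d%d}', 1 + int rand 9 )               ## a placeholder
                : join( '', map { chr( 97 + int rand 26 ) } 1 .. 8 ) ## a filler word
        } 1 .. 10;
        print {$out} $line;
        $size += 1 + length $line;    ## +1 for the -l supplied newline
    }
    close $out;
}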

Not exotic, but 'good enough'.

I then used the following code to slurp the files in turn, do the substitutions and then write the modified data to new files in another directory. Once using Nar's OP code (tweaked to work) and once using my posted version:

#! perl -slw
use strict;
use Time::HiRes qw[ time ];

## Nar's OP code, tweaked to work.
sub nar_substitute {
    my @numeric_chars = ( 0 .. 9 );
    my $message = shift;
    my @numeric_matches = ( $message =~ m/\{\\d\d+\}/g );
    foreach (@numeric_matches) {
        my $substitution = $_;
        my( $substitution_count ) = ( $substitution =~ m/(\d+)/ );
        my $number = '';
        for ( 1 .. $substitution_count ) {
            $number .= $numeric_chars[ int rand @numeric_chars ];
        }
        $message =~ s[\Q$substitution][$number]e;
    }
    return $message;
}

## My posted version: one pass over the string; one rand per substitution.
sub buk_substitute {
    my $s = shift;
    $s =~ s[\{\\d(\d+)\}][ substr int( 1e10 + rand 1e10 ), 1, $1 ]ge;
    return $s;
}

our $O //= 0;    ## run with -O=1 to select buk_substitute

$|++;
my $start = time;

for my $fname ( glob 'templ*.txt' ) {
    printf "\rProcessing $fname";
    my $file = do{ local( @ARGV, $/ ) = $fname; <> };    ## slurp the file
    $file = $O ? buk_substitute( $file ) : nar_substitute( $file );
    open O, '>', "modified/$fname" or die $!;
    print O $file;
    close O;
}

printf "\n\nTook %.6f secs\n", time() - $start;
[16:39:37.79] C:\test\junk>..\junk63
Processing templ99.txt

Took 2064.037575 secs

[17:16:13.24] C:\test\junk>del modified\*
C:\test\junk\modified\*, Are you sure (Y/N)? y

[17:25:20.32] C:\test\junk>..\junk63 -O=1
Processing templ99.txt

Took 6.626883 secs

[17:25:35.03] C:\test\junk>

100 - ( 6.626883 / 2064.037575 * 100 ) = 99.7% saving or 311 times faster!

Worth the effort, I would say.

Maybe this needs doing once per run of the simulation. Maybe once a day; maybe hundreds of times. Maybe 100 files is overkill; maybe it requires thousands of files. Maybe 500k average is oversized; maybe they are in the GB range. I don't know...and neither do you.


With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

Re^8: Speed Improvement (Is 311 times faster: "insignificant"?)
by tye (Sage) on Dec 03, 2014 at 20:00 UTC

    Note that it is not the case that a 300% speed increase on a micro operation resulted in an over-all speed-up of 31100%. This demonstration is impressive but, to be clear, is also rather tangential to what I was actually talking about.

    This is a decisive refutation, but of a straw-man argument that I wasn't actually making. It resoundingly defends the idea that optimization can sometimes be quite significant. But I was not attacking that idea (and I never have). (Though, I certainly allow that BrowserUk might have sincerely believed I was doing so.)

    It might be interesting to reuse the code to test my actual prediction.

    I was a bit surprised that I didn't see anybody benchmark the original code (but perhaps I just missed it). Significant performance degradation on this type of stuff is usually due to regexes making entire copies of very long strings (especially if applying a large number of regexes per string) or due to pathological backtracking. The latter seems unlikely after a quick scan of the original code. So I'll take a wild guess (without having even glanced at the code hidden above) and predict that the disconnect has much, much more to do with the size of strings being used and rather little to do with the original implementation actually being even close to 31100% slower on strings similar to those actually specified in the root node.

    It should be fairly simple to show how the gulf between my "<20%" prediction and the above "31100%" demonstration is distributed over the various disconnects (a sketch of one such test follows the list below):

    • Inaccuracy of my prediction
    • Using the original implementation, not one of the Benchmark'ed ones
    • Using much, much longer strings
    • Applying many more regexes per string than in the Benchmarks
    • Pathological backtracking
    • Something else?
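
    One way to start teasing those apart — a sketch only, with made-up string sizes (the short string below merely stands in for the root node's examples, and the two substitution subs from the thread are repeated so the snippet stands alone) — would be to Benchmark both implementations at both scales:

    #! perl -slw
    ## Sketch: compare the original (nar) and optimised (buk) substitutions
    ## on a short string and on a ~100k string, to see how much of the
    ## overall speed-up depends simply on string length.
    use strict;
    use Benchmark qw[ cmpthese ];

    sub nar_substitute {
        my @numeric_chars = ( 0 .. 9 );
        my $message = shift;
        my @numeric_matches = ( $message =~ m/\{\\d\d+\}/g );
        foreach (@numeric_matches) {
            my $substitution = $_;
            my( $substitution_count ) = ( $substitution =~ m/(\d+)/ );
            my $number = '';
            for ( 1 .. $substitution_count ) {
                $number .= $numeric_chars[ int rand @numeric_chars ];
            }
            $message =~ s[\Q$substitution][$number]e;
        }
        return $message;
    }

    sub buk_substitute {
        my $s = shift;
        $s =~ s[\{\\d(\d+)\}][ substr int( 1e10 + rand 1e10 ), 1, $1 ]ge;
        return $s;
    }

    my $short = 'Order {\d6}: ship {\d2} units to account {\d8}.';
    my $long  = ( ( 'x' x 90 ) . ' {\d8} ' ) x 1_000;    ## ~100k, ~1000 placeholders

    for my $case ( [ short => $short ], [ long => $long ] ) {
        my( $label, $str ) = @$case;
        print "--- $label string: @{[ length $str ]} bytes ---";
        cmpthese( -3, {
            nar => sub { my $x = nar_substitute( $str ) },
            buk => sub { my $x = buk_substitute( $str ) },
        });
    }

    Growing or shrinking $long would then show directly whether the gap scales with string length (and placeholder count) or stays roughly constant.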

    Perhaps something informative, even useful, could result. It might even provide practical lessons for some on a few differences between micro optimizations and useful optimizations.

    - tye        

      I was a bit surprised that I didn't see anybody benchmark the original code

      I did. If you look back at my first post, you'll see nar_func listed in the benchmark results.

      So I'll take a wild guess ...

      Why guess. Do some work.

      It should be fairly simple to show ...

      Go for it.

      on strings similar to those actually specified in the root node.

      Perhaps those strings are ... oh, I don't know ... just short, non-typical examples. You know, in keeping with the typical requests for "some short examples to demonstrate the problem".

      The bottom line is that the scope for optimisation is clearly shown in the OP's code:

      sub nar_substitute {
          my @numeric_chars = ( 0 .. 9 );
          my $message = shift;
          my @numeric_matches = ( $message =~ m/\{\\d\d+\}/g );
          foreach (@numeric_matches) {
              my $substitution = $_;
              my( $substitution_count ) = ( $substitution =~ m/(\d+)/ );
              my $number = '';
              for ( 1 .. $substitution_count ) {
                  $number .= $numeric_chars[ int rand @numeric_chars ];
              }
              $message =~ s[\Q$substitution][$number]e;
          }
          return $message;
      }

      Two, nested, Perl-level loops; three calls into the regex engine per substitution; and N calls to rand for each N-digit substitution.

      All of which can be, and was, replaced by a single, C-level loop (/g); a single call to the regex engine per file; and a single call to rand per substitution.
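
      For anyone wondering how a single rand yields N digits, a tiny illustration (not from the thread; the digit counts are arbitrary, and the trick caps out at 10 digits per call):

      #! perl -lw
      ## 1e10 + rand 1e10 is an 11-digit integer whose leading digit is
      ## always 1; the 10 digits after it are (near enough) uniformly
      ## random, so substr can peel off the first $n of them.
      use strict;

      for my $n ( 3, 5, 8 ) {
          my $digits = substr int( 1e10 + rand 1e10 ), 1, $n;
          print "$n digits: $digits";
      }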

      'Tain't rocket science.


      With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      "Science is about questioning the status quo. Questioning authority".
      In the absence of evidence, opinion is indistinguishable from prejudice.