in reply to Re^2: Matching First Character of Strings Efficiently (benchmarking)
in thread Matching First Character of Strings Efficiently

Rate ... hb1 46.5/s ... ... tye2 181/s ...
But I still consider such nano-optimization games to be more a waste than of value.

Since when is a 4-to-1 speed-up "nano"? Granted, in the OP's case this particular 4:1 ratio might end up affecting only 10% of the overall run time, once "expensive_process" (whatever it is) plays its role. And I wouldn't really want to bust my butt running an exhaustive test of 12 alternatives -- I'd just compare the first (or current) idea against the next idea that looks like it could do better, if the current one seems to be taking longer than I'd like.
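
For what it's worth, pitting just two candidates against each other is cheap to set up with Benchmark's "cmpthese". A minimal sketch -- the sub names and test data here are my own stand-ins, not the code behind the table quoted above:

    use strict;
    use warnings;
    use Benchmark qw(cmpthese);

    # Stand-in data: 1000 strings, roughly 1/26 of which start with 'a'.
    my @strings = map { chr( 97 + int rand 26 ) . ( 'x' x 10 ) } 1 .. 1000;

    cmpthese( -3, {    # run each candidate for at least 3 CPU seconds
        regex  => sub { my $n = grep { /^a/ } @strings },
        substr => sub { my $n = grep { substr( $_, 0, 1 ) eq 'a' } @strings },
    } );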

But given that this sort of vetting task is likely to come up, if I had to do it over an input list of, say, 2 million items, I'd much prefer that it take 3 hours rather than 12.

update: Having seen the replies to my post from tye and Aristotle, I have to say (sheepishly) thanks for the wake-up -- I should have known better than to take this sort of benchmark tabulation at face value. In fact, I have used Benchmark myself on other problems, but I opted for the "timethese" method (not "cmpthese"), which has the advantage of showing the overall wall-clock time for each approach as well as the relative rates. In one attempt, a comparison of two approaches showed "5.44/s" for one and "94.34/s" for the other -- like wow, huh? -- but the wall-clock difference was only 2 sec vs. 9 sec... go figure. If an app really needs to be optimized, the first tool of choice should of course be profiling, to see which segment of the code is eating all the time, so you can rethink that part first.
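
For the record, the two styles combine nicely: "timethese" returns the raw timing objects (wall-clock included), and "cmpthese" will build its relative table from those same results. A sketch, with placeholder candidates:

    use strict;
    use warnings;
    use Benchmark qw(timethese cmpthese);

    # Placeholder subs -- substitute the real approaches being vetted.
    my $results = timethese( -5, {    # at least 5 CPU seconds each
        one => sub { my $x = substr( 'abc', 0, 1 ) eq 'a' },
        two => sub { my $x = 'abc' =~ /^a/ },
    } );

    # timethese() already printed wall-clock and CPU times above;
    # this adds the relative-rate table from the same runs.
    cmpthese($results);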


Re^4: Matching First Character of Strings Efficiently (4 times!)
by tye (Sage) on Mar 16, 2004 at 02:21 UTC
    Since when is a 4-to-1 speed-up "nano"?

    It is called micro- or nano-optimization because the operation being optimized is so small that it is likely to account for 0.1% or less of the total run time of most non-trivial scripts (not the 10% you guessed) -- not because Benchmark.pm measures a small percentage difference.

    Even if it accounted for 10% of the run time (extremely unlikely for an operation as small as comparing two characters), a 4-to-1 speed-up means only a 7.5% total speed-up: the 10% share shrinks to 2.5%. In reality, you are more likely to see something like a 0.075% speed-up.

    I'd much prefer that it take 3 hours rather than 12

    See, this is why this cult is so seductive. No matter how many times someone explains why this type of stuff is almost never very useful, people see "Four times faster!" and start drooling. Using your own (highly optimistic) numbers, a 12-hour run would take 11.1 hours (not 3). I mention this not because I think 0.9 hours is a short time (because the reality will probably be less than even a few minutes saved for a 12-hour run) but to show how "Four times faster!" takes over one's thoughts and one jumps to invalid conclusions.
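
    Checking that arithmetic explicitly -- a back-of-the-envelope sketch, with the 10% fraction and the 4x factor taken from the parent post:

        # If a fraction $f of the run time speeds up by a factor $s,
        # the total shrinks to (1 - $f) + $f / $s of the original.
        my ( $f, $s ) = ( 0.10, 4 );
        my $remaining = ( 1 - $f ) + $f / $s;    # 0.925
        printf "%.1f%% of the original time (%.1f%% saved)\n",
            100 * $remaining, 100 * ( 1 - $remaining );    # 92.5% / 7.5%
        printf "a 12-hour run becomes %.1f hours\n", 12 * $remaining;   # 11.1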

    Even if you had a script that did (well, tried to do) nothing but compare lots of leading characters of strings, you'd probably find that the loop and call overhead adds up to a lot more than the comparison time itself. Benchmark.pm does a lot of work to estimate that overhead and subtract it.
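
    That overhead is easy to see for yourself; a minimal sketch, timing a do-nothing sub against the comparison under discussion:

        use Benchmark qw(cmpthese);

        # A no-op sub measures mostly call/loop overhead; the gap
        # between it and a real one-character comparison is small.
        cmpthese( -3, {
            empty   => sub { 1 },
            compare => sub { substr( 'abcdef', 0, 1 ) eq 'a' },
        } );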

    I also think no one would try the slowest methods benchmarked here except in a rather desperate attempt to nano-optimize, so even the 4-to-1 speed-up isn't real. It is closer to a 4-to-1 slow-down suffered by a desperate optimization attempt.

    - tye        

Re^4: Matching First Character of Strings Efficiently (benchmarking)
by Aristotle (Chancellor) on Mar 16, 2004 at 03:21 UTC

    Very well, now look at the other results. The solution both I and kvale proposed is the most straightforward, arguably most maintainable, and least "creative" one. It comes in third, much faster than the slowest, very "clever" solution, and achieves over 80% of the speed of tye's winning entry. Do you see anything wrong with your picture yet?

    The reality is that "boring" solutions chosen with a clear understanding of the task at hand are usually plenty fast enough. In my experience, they tend to achieve close to 85% of the possible performance.
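
    (The code itself isn't reproduced on this page, so purely as an illustration -- not necessarily what kvale or I posted -- the "boring" shape of such a filter is roughly the following, with @strings, $char and expensive_process() standing in for the OP's actual data and work:)

        for my $string (@strings) {
            next unless substr( $string, 0, 1 ) eq $char;
            expensive_process($string);
        }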

    Given the points tye has already made about the real impact of micro-optimizations, there is very little worth in evaluating the obscure. Go with the maintainable, clear, concise solution, and worry only if performance actually proves insufficient.

    And even if you have gotten there, it is still more worthwhile to optimize your expensive_functions instead.
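
    Finding those expensive_functions is a profiler's job rather than Benchmark's. A minimal sketch -- Devel::DProf is just one choice of profiler, my assumption rather than anything named in this thread:

        # Profile first, optimize second. Any profiler will do;
        # Devel::DProf ships with Perl:
        #
        #   perl -d:DProf yourscript.pl   # writes tmon.out
        #   dprofpp                       # report where the time went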

    Makeshifts last the longest.