Computers can perform so many operations per second, so fast, that they are sometimes considered omnipotent. But an approach that is a little too direct or naive can turn our fastest machine into a cart ...

This is not the discovery of the century, I know, and every good programmer pays attention to the issue, but sometimes I stop to think about how easy it is to turn a well-performing program into a disaster.

Just yesterday, I had to check about 10,000 files in a directory against a database of about 20,000 records: the iterative one-query-per-file solution took an hour; loading the entire table into a hash and then looking up the information took a few seconds.
Not the meaning of life, I know, but it is something I like to meditate on.
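
To make that concrete, here is a minimal sketch of the two approaches, assuming a DBI connection and a table called records keyed by filename (all names and connection details here are hypothetical):

use strict;
use warnings;
use DBI;

my $dbh = DBI->connect( 'dbi:Pg:dbname=mydb;host=dbserver', 'user', 'pass',
                        { RaiseError => 1 } );

# Slow: one query per file -- 10,000 round trips to the database server
sub check_files_by_query {
    my @files = @_;
    my $sth = $dbh->prepare('SELECT info FROM records WHERE filename = ?');
    for my $file (@files) {
        $sth->execute($file);
        my ($info) = $sth->fetchrow_array;
        # ... check $file against $info ...
    }
}

# Fast: load the whole table into a hash once, then do in-memory lookups
sub check_files_by_hash {
    my @files = @_;
    my %records;
    my $sth = $dbh->prepare('SELECT filename, info FROM records');
    $sth->execute;
    while ( my ($filename, $info) = $sth->fetchrow_array ) {
        $records{$filename} = $info;
    }
    for my $file (@files) {
        my $info = $records{$file};    # a hash lookup instead of a query
        # ... check $file against $info ...
    }
}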

Replies are listed 'Best First'.
Re: quickness is not so obvious
by roboticus (Chancellor) on Jan 23, 2015 at 13:14 UTC

    DanBev:

    If I write some code and find that it runs much slower than I expected, I first ask myself "What the heck did I do wrong?" I'll then look at the code to see if I did something dumb.

    If the code and algorithm look OK, I'll normally then check my assumptions and write a simple program to "look at" the data, but not do any processing, just to see what the fastest possible time could be. (So in this case, I'd have it simply open and read the 10,000 files, but not really look at the contents or do any writing.) Perhaps getting the data really does take that long. (It may be trawling through a remote filesystem with significant network latency, perhaps some of the data is archived on a tape system and it takes time for the robot to swap tapes to make the data available, etc.)
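
    Something along these lines would do as a rough "just read it" baseline (a sketch only; the directory name is a placeholder and the files are assumed to be plain text):

    use strict;
    use warnings;
    use Time::HiRes qw( gettimeofday tv_interval );

    my $dir   = '/path/to/the/10000/files';   # placeholder directory
    my $start = [gettimeofday];

    opendir my $dh, $dir or die "Can't open $dir: $!";
    for my $file (grep { -f "$dir/$_" } readdir $dh) {
        open my $fh, '<', "$dir/$file" or die "Can't read $dir/$file: $!";
        1 while <$fh>;    # pull every line through, but do nothing with it
        close $fh;
    }
    closedir $dh;

    printf "Read everything in %.3f seconds\n", tv_interval( $start, [gettimeofday] );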

    So if there's little time difference in simply reading the data and doing the processing, I'll ask the operations team about improving performance. If the difference is *huge*, then it's time to profile your code and find out where it's spending all its time. Sometimes you'll find yourself doing something silly (oops! I'm opening and scanning the contents of file X inside of a loop where I'm processing file Y line-by-line). Sometimes you might find a regex that's performing poorly for some odd data. Or you may find that the algorithm you used is even less performant than expected, and you need to use/invent a better one.
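
    For the profiling step, Devel::NYTProf is one convenient option (assuming it is installed from CPAN); a typical run looks roughly like this:

    $ perl -d:NYTProf fibo.pl    # run the script under the profiler; writes nytprof.out
    $ nytprofhtml                # generate an HTML report from nytprof.out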

    As an example of the latter, I was surprised a few years ago when I found just how slow the recursive method for computing Fibonacci numbers really is:

    #!/usr/bin/env perl
    use strict;
    use warnings;
    use Time::HiRes qw( gettimeofday tv_interval );

    for my $i (1 .. 100) {
        my $start            = [gettimeofday];
        my $fibonacci_number = fibo($i);
        my $interval         = tv_interval( $start, [gettimeofday] );
        print "fib($i)=$fibonacci_number ($interval sec)\n";
    }

    sub fibo {
        my $num = shift;
        return 1 if $num < 3;
        return fibo($num-1) + fibo($num-2);
    }

    I knew that this method wasn't terribly efficient, but I was still amazed at how slow it actually was:

    $ perl fibo.pl
    fib(1)=1 (6e-06 sec)
    fib(2)=1 (4e-06 sec)
    fib(3)=2 (7e-06 sec)
    fib(4)=3 (6e-06 sec)
    fib(5)=5 (6e-06 sec)
    fib(6)=8 (1e-05 sec)
    fib(7)=13 (1.6e-05 sec)
    fib(8)=21 (2.4e-05 sec)
    fib(9)=34 (3.5e-05 sec)
    fib(10)=55 (5.6e-05 sec)
    fib(11)=89 (8.9e-05 sec)
    fib(12)=144 (0.000145 sec)
    fib(13)=233 (0.000291 sec)
    fib(14)=377 (0.000379 sec)
    fib(15)=610 (0.000609 sec)
    fib(16)=987 (0.000981 sec)
    fib(17)=1597 (0.001583 sec)
    fib(18)=2584 (0.002644 sec)
    fib(19)=4181 (0.00414 sec)
    fib(20)=6765 (0.005748 sec)
    fib(21)=10946 (0.004466 sec)
    fib(22)=17711 (0.007214 sec)
    fib(23)=28657 (0.011702 sec)
    fib(24)=46368 (0.018114 sec)
    fib(25)=75025 (0.027751 sec)
    fib(26)=121393 (0.045355 sec)
    fib(27)=196418 (0.072849 sec)
    fib(28)=317811 (0.117512 sec)
    fib(29)=514229 (0.193195 sec)
    fib(30)=832040 (0.307089 sec)
    fib(31)=1346269 (0.505429 sec)
    fib(32)=2178309 (0.796903 sec)
    fib(33)=3524578 (1.312573 sec)
    fib(34)=5702887 (2.093858 sec)
    fib(35)=9227465 (3.422112 sec)
    fib(36)=14930352 (5.594139 sec)
    fib(37)=24157817 (8.969398 sec)
    fib(38)=39088169 (14.46981 sec)
    fib(39)=63245986 (23.423682 sec)
    fib(40)=102334155 (38.201329 sec)
    fib(41)=165580141 (62.30559 sec)
    ^C

    Yeah, like I'm going to sit through it computing the first 100 numbers that way. That's when I looked up Memoize and started using it, which, by the way, is a very nice way to get a performance boost for some programs with little modification. I simply added the following two lines to the top of the program (just after the other 'use' statements):

    use Memoize;
    memoize('fibo');

    To get much better performance:

    $ perl fibo.pl
    fib(1)=1 (8e-06 sec)
    fib(2)=1 (5e-06 sec)
    fib(3)=2 (1.2e-05 sec)
    fib(4)=3 (7e-06 sec)
    fib(5)=5 (6e-06 sec)
    fib(6)=8 (6e-06 sec)
    fib(7)=13 (6e-06 sec)
    fib(8)=21 (6e-06 sec)
    fib(9)=34 (5e-06 sec)
    fib(10)=55 (6e-06 sec)
    . . . <snip> . . .
    fib(98)=1.35301852344707e+20 (6e-06 sec)
    fib(99)=2.18922995834555e+20 (6e-06 sec)
    fib(100)=3.54224848179262e+20 (6e-06 sec)

    Problem solved. Yeah, when you trawl through a deep binary tree *twice* for *each* *level*, it's gonna get slow ... *very* *very* *slow*. Lesson learned, moving on to the next thing....

    ...roboticus

    When your only tool is a hammer, all problems look like your thumb.

      As a side note:

      Fibonacci is one of those theoretical problems that need heavy computation and where the implementation matters.

      But in real life you can approximate the result to a certain accuracy.

      So why would I need more than 10 digits of fib(100), if I ever need fib(100) at all?
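
      For instance, a minimal sketch using Binet's closed-form approximation (plain Perl doubles only give you about 15 significant digits anyway):

      use strict;
      use warnings;

      # Binet's formula: fib(n) ~= phi**n / sqrt(5), rounded to the nearest integer
      sub fibo_approx {
          my $n   = shift;
          my $phi = ( 1 + sqrt 5 ) / 2;
          return int( $phi ** $n / sqrt(5) + 0.5 );
      }

      print fibo_approx(100), "\n";   # ~3.54224848179262e+20 (the last digit or two may wobble)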

      Cheers Rolf

      PS: Je suis Charlie!

      Thanks a lot for your reply.
      Without going into too much detail about algorithms or optimizers, my reflection was precisely about the subtle difference between doing things, doing things right, and doing things better.

      In my case, in detail, I was working from a list of files (I did not need to open them), and what took the time was obviously the queries to a database on another server.
      Cheers

      Hmm, my personal feeling is that if you use Memoize and see a dramatic performance boost with recursive functions like this, then you really need to rethink your algorithm as there is very likely a much better solution.

      (To be fair, I suspect this is not always true; however, it is absolutely worth investigating, especially if the trade-off in memory utilisation is particularly expensive.)

        > then you really need to rethink your algorithm as there is very likely a much better solution

        I doubt this. I've done some branch-and-bound implementations in the past which searched recursively through a graph, and it was crucial for speed to check whether a partial solution had already been calculated.

        Some graph search algorithms avoid memoizing by ordering the input nodes (thus avoiding duplications). In the aforementioned cases this wasn't possible at all!

        Cheers Rolf

        PS: Je suis Charlie!

        update

        For instance, see the Wikipedia article on Dynamic_programming:

        The dynamic programming approach seeks to solve each subproblem only once, thus reducing the number of computations: once the solution to a given subproblem has been computed, it is stored or "memo-ized": the next time the same solution is needed, it is simply looked up. This approach is especially useful when the number of repeating subproblems grows exponentially as a function of the size of the input.

        How do you justify spending even five minutes thinking about how to improve the algorithm when you can solve the issue in 30 seconds and two lines of code? Of course, after you've spent 30 seconds and found that the problem isn't solved, then it's time to think about smarter ways to approach the problem.

        I know where you are coming from. Tricks like Memoize feel like cheating. But if it gets the job done and keeps getting it done then it's probably the right solution.

        Perl is the programming world's equivalent of English
      You could also have cached the calculated data yourself. Something like this several-liner:
      $ time perl -e '
          print fibo(99);
          {
              my @cache;
              sub fibo {
                  my $num = shift;
                  unless (defined $cache[$num]) {
                      if ($num < 3) {
                          $cache[$num] = 1;
                      }
                      else {
                          $cache[$num] = fibo($num-1) + fibo($num-2);
                      }
                  }
                  return $cache[$num];
              }
          }
      '
      2.18922995834555e+20
      real    0m0.036s
      user    0m0.000s
      sys     0m0.031s

      Je suis Charlie.

        Laurent_R:

        Yeah, I know. I used the recursive method on purpose just 'cause it was easy and I was demonstrating recursion to a beginner. (If I recall correctly, I used the Fibonacci series because it was even uglier than the traditional factorial example of recursion.) If I was actually interested in the Fibonacci numbers (I don't recall actually ever needing them for anything), I'd've just created them in a loop, something like:

        my @vals = (1, 1);
        . . .
        sub fibo {
            my $num = (shift) - 1;
            while ($#vals < $num) {
                push @vals, $vals[-1] + $vals[-2];
            }
            return $vals[$num];
        }

        ...roboticus

        When your only tool is a hammer, all problems look like your thumb.

Re: quickness is not so obvious
by sundialsvc4 (Abbot) on Jan 22, 2015 at 15:54 UTC
    Don’t “diddle” code to make it faster.   Find a better algorithm.

    Kernighan & Plauger:   The Elements of Programming Style

    Proof-positive that great wisdom sometimes comes in tiny books.

      Sometimes, implementation details do make a difference. Even with today's advanced optimizing compilers. And in the embedded systems world, where US$ 0.50 processors rule, the best available compilers often do not have the most advanced optimizers.

      Example (in C):

      i = count;
      j = 0;
      do {
          // something with j
          j++;
      } while (--i);

      is often faster than the common:

      for (j = 0; j < count; j++) { }

      or even:

      for (i = count, j = 0; i; --i, j++) { }

      This is because it's easy for an optimizer to "see" it can safely "replace" while (--i) with a decrement-and-branch-if-not-zero instruction.

      (Also, if your code functions equally correctly with a decrementing "index", you can skip the extra variable and associated increment.)

        If you are worried about dumb compilers, why do you use post-increment operators where the code logic doesn't care whether it's a pre or post operator? Post-increment "requires" an intermediate variable to save the pre-incremented value so it can be returned after the increment.

        This preference for post-increment is pervasive, and I can't think why that should be. To me pre-increment is king because it puts the operator out front where I can see it, instead of hiding the operator behind a variable. I expect most modern compilers would optimise the code if the "return" value's not used in any case, but in an embedded context where the odd nanosecond may be important, defaulting to post-increment seems really odd if you don't trust the compiler.

        Perl is the programming world's equivalent of English

      Echoing what RonW said, I think that particular reasoning is from a pre-high-level language mindset where there aren’t all that many ways to write a piece of code and they will often collapse to a similar number of ops when compiled. There are so many ways to write a line or block or package of Perl that diddling is harder to define… but you should know it when you see it. Apologies to the ghost of Justice Stewart.

Re: quickness is not so obvious
by Laurent_R (Canon) on Jan 24, 2015 at 22:55 UTC
    I fully agree with your post. And I am quite often pis*ed off by posts here in the Monastery in which, facing a performance problem, some monk says: "Use a database". Wrong answer most of the time. A hash can be hundreds or even thousands of times faster than a DB.

    I am dealing almost daily with various databases. The first thing I do, quite often, is to load a number of tables (especially the reference data) into hashes.

    Je suis Charlie.