in reply to Optimizations and Efficiency

The first piece of advice I have is to read what Code Complete has to say on optimization.

The second piece of advice is to focus on structuring your code well rather than optimization. If you have organized your code well, then you are likely to wind up with more time to optimize, and your optimization effort can take advantage of the fact that you can profile your bottlenecks before you optimize. Good code pays off.

The third bit of advice is that Perl is an interpreted language. The more work you can shove down to perl's internals, the faster it will be. If you code Perl as though it were C, you will run significantly slower than if you code Perl like it is Perl. For instance, your loops would all be significantly faster if you wrote them as Perl-style foreach loops rather than C-style for loops. Your multiplication speed-ups are likely to wind up slower than the native multiplication, simply because repeatedly accessing a variable is not free in Perl. And without measuring, I would expect that for your task the array can be initialized more quickly with:

@ary = map init_val($_), 0..10_000;
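To illustrate the loop-style point, here is a minimal sketch. The init_val name comes from the line above; its body here is a made-up stand-in for whatever per-element computation you actually do.

```perl
use strict;
use warnings;

# Stand-in for the poster's init_val; any per-element computation works.
sub init_val { my ($i) = @_; return $i * 2 }

# C-style: the index bookkeeping runs as Perl opcodes on every iteration.
my @c_style;
for (my $i = 0; $i <= 10_000; $i++) {
    $c_style[$i] = init_val($i);
}

# Perl-style: the loop and list are handled by the interpreter internals,
# so there is less per-iteration overhead.
my @perl_style = map init_val($_), 0 .. 10_000;

print "same results: ", ("@c_style" eq "@perl_style" ? "yes" : "no"), "\n";
```

Both build the same array; only the amount of work done in Perl-level opcodes differs.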

Next, focus on algorithms. By learning to take advantage of things like native hashing to write efficient algorithms you can reliably find more scalable ways to write things. In C people tend to do a lot of scanning of arrays simply because complex data structures are so much work. Perl does not have that limit.
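A sketch of the hashing point, with made-up example data: testing membership by scanning an array costs a pass over the array on every lookup, while a hash built once answers each lookup in constant time.

```perl
use strict;
use warnings;

my @seen_ids = (17, 42, 99, 512, 1024);   # made-up example data

# The C habit: scan the whole array for every membership test.
sub in_list {
    my ($want, @list) = @_;
    for my $item (@list) {
        return 1 if $item == $want;
    }
    return 0;
}

# The Perl way: build a hash once, then each lookup is O(1).
my %is_seen = map { $_ => 1 } @seen_ids;

print in_list(42, @seen_ids) ? "scan: found\n" : "scan: missing\n";
print $is_seen{42}           ? "hash: found\n" : "hash: missing\n";
```

For one or two lookups it makes no difference; inside a loop over a large data set it changes the complexity class.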

In many problems you can parallelize. There is a basic principle called the Pareto principle. You can achieve 80% of the gain for 20% of the work. That applies in general. But it is really true for parallelizing. If you have a job that will take 10 hours and you can break it into 5 pieces that can run on 5 machines, you will save yourself a lot of time. Making it a finely grained, multi-threaded beast is more likely to cost performance than gain it. (Threading improves latency, not throughput.)
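The coarse-grained splitting described above can be sketched with plain fork. This assumes a Unix-ish system; the work here (summing a range) is a stand-in for a real job, split into four chunks with one worker process per chunk, each reporting its partial result through a temporary file.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir    = tempdir(CLEANUP => 1);
my @chunks = ([1, 2500], [2501, 5000], [5001, 7500], [7501, 10_000]);

for my $n (0 .. $#chunks) {
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    next if $pid;                       # parent keeps forking workers

    # Child: do its share of the (stand-in) work, record it, exit.
    my ($lo, $hi) = @{ $chunks[$n] };
    my $partial = 0;
    $partial += $_ for $lo .. $hi;
    open my $fh, '>', "$dir/part.$n" or die "write: $!";
    print {$fh} $partial;
    close $fh;
    exit 0;
}

wait() for @chunks;                     # reap all four children

my $total = 0;
for my $n (0 .. $#chunks) {
    open my $fh, '<', "$dir/part.$n" or die "read: $!";
    $total += <$fh>;
}
print "total: $total\n";
```

Four big pieces, four processes, no shared state to synchronize: that is the 80% of the gain for 20% of the work.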

And finally, if you have a bottleneck, try rewriting it using Inline::C. (With which you can also inline assembler, for true non-portability.) But only do this with identified bottlenecks, when you absolutely need to. Odds are that if you try to rewrite your program at the first opportunity you will lose on speed, simply because you do not have time to identify other optimizations, you do not have the freedom to play with algorithmic improvements, you will spend longer tracking down bugs, and you lose track of the point where your program is fast enough and you can move on.
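By way of illustration only, a hypothetical hot function rewritten with Inline::C. This assumes the Inline::C module and a working C compiler are available; sum_of_squares is a made-up stand-in for a real, profiled bottleneck.

```perl
use strict;
use warnings;

# Inline::C compiles this C on first run and binds it as a Perl sub.
use Inline C => <<'END_C';
double sum_of_squares(int n) {
    double total = 0.0;
    int i;
    for (i = 1; i <= n; i++)
        total += (double)i * i;
    return total;
}
END_C

print sum_of_squares(1000), "\n";
```

The compiled code is cached, so the C-compilation cost is paid once, not on every run.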

And the very last point. If speed is of the essence and you are not dealing with a problem where Perl's RE engine is buying you a lot, then Perl probably isn't the right language to use. Use the right tool for the job, and for high performance that isn't Perl.

Re (VSarkiss)2: Optimizations and Efficiency
by VSarkiss (Monsignor) on Jun 30, 2001 at 07:21 UTC

    tilly, I wish I had ++ points left today. You've made excellent points; I agree with you 100%. If I had to pick one statement from McConnell's book that you linked to, it would be the quote from Donald Knuth1:

    Premature optimization is the root of all evil
    I have just this to add: If your code is running slowly, don't guess where the slow spot is: measure. Far too many times, I've seen people confronted by a performance problem take off and try to optimize what they thought was the slow part, instead of trying to profile the program to validate their belief. Then, after reasonable measurements are made, it turns out their optimizations just made the code complicated and hard to maintain, without making a dent in its run time.

    I remember reading about a study, where programmers, asked to guess where they thought a slow program was spending its time, guessed wrong 80% of the time. (I can't seem to find a reference; perhaps someone here can compensate for my Alzheimer's?)

    Here's an exercise: suppose I want to add all the integers between n and m. Try guessing which of these will run faster, then use Benchmark to validate your guess. The results will probably surprise you.

    use constant N => 1;      # for example
    use constant M => 10_000; # for example

    my $sum;

    # Method 1: a simple for loop.
    $sum = 0;
    for (my $i = N; $i <= M; $i++) {
        $sum += $i;
    }

    # Method 2: an evil use of map by throwing away its result.
    $sum = 0;
    map { $sum += $_ } N..M;

    # Method 3: the APL way
    $sum = eval join('+', N..M);
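    To actually run the comparison, the core Benchmark module will do; here is a sketch wrapping the three methods in named subs (timings will vary by machine and perl version, which is the point of measuring).

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

use constant N => 1;      # for example
use constant M => 10_000; # for example

sub by_for {
    my $sum = 0;
    for (my $i = N; $i <= M; $i++) { $sum += $i }
    return $sum;
}

sub by_map {
    my $sum = 0;
    map { $sum += $_ } N .. M;
    return $sum;
}

sub by_eval {
    return eval join('+', N .. M);
}

# -1 means "run each candidate for at least one CPU second".
cmpthese(-1, {
    for_loop  => \&by_for,
    map_abuse => \&by_map,
    apl_eval  => \&by_eval,
});
```

    cmpthese prints a table of rates and relative percentages, so the guesswork ends where the table begins.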

    1Hmm, I just realized how strange that sounds; my pick from one author's book is a quote from someone else.

      Premature optimization is the root of all evil

      A favorite of mine, too, but is it even Knuth? I've seen it attributed to at least two other authors as well.

      Regardless, here's my big question:

      Do you even NEED to optimize it?

      So many programmers sit and twiddle their bits in either the wrong place, or in entirely the wrong code.

      • If the program's not slow, it doesn't need optimizing.
      • If the program's run only occasionally, it probably doesn't need optimizing.
      • If it's not impacting anything by being slow, it doesn't need optimizing.
      Bit-twiddling is fun, but for those of us who are professional programmers, there's no reason to waste our valuable time squeezing cycles from the program Just Because It's There.

      xoxo,
      Andy
      --
      I was dreaming when I wrote this, so sue me if I go too fast.

      For further fun try:

      # Method 3a: the vaguely Mathematica-ish way
      $sum = 0 + join('+', N..M);
      which generates different results, on my machine anyway.

      One ought also to compare push, pop, shift, and unshift, as well as while, for, foreach, do, and, shudder, a hand-crafted LABEL loop.

Re: Optimizations and Efficiency
by Abigail (Deacon) on Jul 03, 2001 at 03:02 UTC
    The third bit of advice is that Perl is an interpreted language.

    No, it is not. Nor does this statement have any relevance to the rest of the paragraph. When you call perl with a Perl program, perl will first compile the Perl code, then execute it. If it were interpreted, you wouldn't need BEGIN blocks, and syntax errors would go unnoticed until the program actually reached them.

    -- Abigail

      First a relevant quote from perlfaq1 (from 5.005_03 if it matters)
      Finally, because Perl is frequently (but not always, and certainly not by definition) an interpreted language, you can write your programs and test them without an intermediate compilation step, allowing you to experiment and test/debug quickly and easily. This ease of experimentation flattens the learning curve even more.
      and later
      Perl programs are (usually) neither strictly compiled nor strictly interpreted. They can be compiled to a byte-code form (something of a Perl virtual machine) or to completely different languages, like C or assembly language. You can't tell just by looking at it whether the source is destined for a pure interpreter, a parse-tree interpreter, a byte-code interpreter, or a native-code compiler, so it's hard to give a definitive answer here.
      And yes, I know you knew both of those things.

      My point in the quotes is that there are several different things that people can mean by interpreted.

      Your observation about BEGIN blocks and the like concerns the traditional kind of interpreted language that runs line by line. And no, Perl is not that kind of interpreted language, nor would I be so silly as to think so. I am well aware that Perl separates compiling and running code, and I am generally up on what kinds of things happen at each stage. (I am also aware of, and capable of taking advantage of, the fact that what is run time for one bit of code is compile time for another.)

      The second kind of interpreted that people could mean is that you execute a Perl program by running it through a Perl interpreter. You do not have to run Perl that way, but perl is the Perl interpreter (so labelled both in common usage and in the documentation), and most Perl programs are, in fact, executed by running them through perl. So, as the FAQ says, Perl is usually (but not by definition) an interpreted language: in development work you do not have a separate compile phase. This fact has significant implications for optimizing Perl; the optimization time is seen as part of Perl's startup, and so perl simply cannot afford to optimize aggressively. (Else things like loop unrolling would be as irrelevant to Perl as they are for, say, C with a decent compiler.)

      The third kind of interpreted is what I actually was thinking of though, that internally you execute Perl through a byte-code interpreter. Of course in theory you do not have to execute Perl this way, but there is so far no widely available alternative. (Some of the Perl 6 back ends are supposed to change that. But they don't exist yet.) And byte-code interpretation presents a constant overhead.
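      The op-tree that perl compiles to and then interprets can actually be inspected with the core B::Concise module. A small sketch (the sub and its body are arbitrary examples):

```perl
use strict;
use warnings;
use B::Concise ();

# The sub perl will compile; B::Concise dumps the resulting op tree.
my $add_one = sub { my ($x) = @_; return $x + 1 };

# Collect the dump in a string instead of printing straight to STDOUT.
B::Concise::walk_output(\ my $dump);
B::Concise::compile('-exec', $add_one)->();

print $dump;   # lists ops such as 'add' in execution order
```

      Every line of that dump is one op the runtime must dispatch, which is exactly the constant byte-code overhead being described.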

      So no, my paragraph did not strictly depend on whether or not Perl is interpreted. But the fact that Perl is usually executed through an interpreter, and the fact that the interpreter internally does byte-code interpretation, are both relevant to the point that it is usually best to push as much low-level detail and manipulation as you can down to perl.

        Well, yes, but you should have qualified that, as most people will think that "interpreted code" is something like what the shell does, and that Java is compiled, while with your definition Java would be interpreted as well.

        -- Abigail