in reply to Re^3: Optimisation (global versus singular)
in thread Optimisation isn't a dirty word.

I should be developing in small end-to-end increments so that I can continually look at whether my application is meeting performance requirements and optimise appropriately.

To a great extent, we are in agreement. Contrast this approach with the oft-quoted "Don't optimise until/unless you have to".

Another aspect is that it is not always possible to codify performance requirements in such a way that they can easily be verified during the inner, lower levels of the development process.

For example, when developing libraries and modules, the developers have no knowledge of the performance requirements of the (future) applications that will call their modules. It is all too easy for optimisation to receive no consideration at this stage, because the application for which the library is first written has very loose performance requirements, or none at all.

The problem then arises when the next application makes heavy, sustained use of the module and its tardy performance comes to light, but only once the calling application has been structured around the API of the inner module. At that stage, you have already committed enough resources to designing and writing the calling code (enough to do some level of performance testing) that replacing the library with a faster alternative is painful.

It is bad enough if you have to re-write and/or optimise the library code whilst retaining a compatible interface--at least you will mostly be able to stick with the design and code already developed, which allowed you to discover the problem in the first place.

But the real problem arises when the performance bottleneck is the API itself. As an example, using hash-based objects to create trees & graphs seems natural and obvious in Perl, but when you try to use those objects to build and manipulate broad, deep trees or big graphs, they are very heavy on memory consumption and greatly lacking in performance. Given that many of the classical, useful algorithms for which trees and graphs are perfectly suited fall into the NP-complete class, the poor performance distinctly limits their (re-)usability.
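As a concrete illustration, here is a minimal sketch (the node constructors are hypothetical, written for this example rather than taken from any module) of why the choice of representation matters: a hash-based node pays for a full hash per node, whereas an array-based node with named indices stores the same information far more compactly:

    #!/usr/bin/perl
    use strict;
    use warnings;

    # Hash-based node: self-documenting, but every node carries a full hash.
    sub new_hash_node {
        my( $value ) = @_;
        return { value => $value, left => undef, right => undef };
    }

    # Array-based node: the same information in a leaner structure, at the
    # cost of needing named indices to stay readable.
    use constant { VALUE => 0, LEFT => 1, RIGHT => 2 };

    sub new_array_node {
        my( $value ) = @_;
        return [ $value, undef, undef ];
    }

    my $hnode = new_hash_node( 42 );
    my $anode = new_array_node( 42 );

    print 'hash node value:  ', $hnode->{ value }, "\n";
    print 'array node value: ', $anode->[ VALUE ], "\n";

Multiply the per-node saving by the millions of nodes in a big graph and the difference in footprint, and hence performance, becomes substantial.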

In the absence of actual performance requirements, the best you can do when developing a library intended for large-scale re-use is to optimise it to the best of your ability, commensurate with the goals of correctness and maintainability:

  • Use the fastest algorithm you can find.
  • Use the fastest coding techniques available.

    Just occasionally, there are sufficient performance gains to be achieved by trading clarity and simplicity for an obscure idiom or complex syntax that the trade is worth it.

    To be able to do this, having an awareness of which algorithms and techniques yield the best performance in your chosen language is extremely useful. It doesn't mean that you optimise every last line of code you write, or that you never deliberately choose a less-than-optimal solution in order to favour a more important goal for a given piece of code.

    It just means that it is worth taking a quick glance at some of the often seemingly pointless benchmarks that get posted here, and even running one or two of your own if you find yourself unable to choose between two ways of doing something for any other good reason (a sketch follows this list). And re-checking your assumptions now and again.

    It's also worth knowing the costs of over-modularisation and over-engineering.
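    For instance, a quick throwaway comparison using the core Benchmark module might look like the following sketch (the two approaches compared, a linear scan versus a hash lookup for membership testing, are illustrative only):

        #!/usr/bin/perl
        use strict;
        use warnings;
        use Benchmark qw( cmpthese );

        # Illustrative data: test membership of one value in a 1000-element set.
        my @list   = ( 1 .. 1_000 );
        my %set    = map { $_ => 1 } @list;
        my $target = 500;

        cmpthese( -3, {    # run each sub for ~3 CPU seconds
            linear_grep => sub { my $found = grep { $_ == $target } @list },
            hash_lookup => sub { my $found = exists $set{ $target }      },
        } );

    Five minutes with something like this will often settle a question that could otherwise be argued about indefinitely.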


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.