in reply to Re: Re^3: Multilevel flexibillity
in thread Multilevel flexibillity

This is an attempt to formalize the argument of Abigail-II - it obviously has a flaw, but it can be a starting point for further analysis. First we need to define the complexity of a design - for that I would take the Kolmogorov complexity (in Perl), i.e. the character count of the shortest Perl program complying with it. As the definition of a design I would take an additional set of rules that the program has to comply with.
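
Written out, the definition I have in mind is roughly this (the notation is only my own sketch):

    K(D) = \min \{\, |p| : p \text{ is a Perl program satisfying every rule of } D \,\}

where D is the set of rules making up the design and |p| is the character count of the program p.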

Now it is clear that a problem without any additional requirements on the solution program is of no greater complexity than one with some additional requirements. This of course holds when the added requirement is a plugin architecture.

The problem is whether Kolmogorov complexity really is the complexity perceived by humans.

Re: Re: Re: Re^3: Multilevel flexibillity
by tilly (Archbishop) on Jun 26, 2003 at 17:37 UTC
    Considering the difficulty that humans have in telling whether they have the shortest solution (see your average golf game for evidence, or articles that can be found here for the theory), it is certain that the Kolmogorov complexity is not the complexity perceived by humans. Underscoring that is the fact that things which bring you closer to that ideal solution often make the code harder to understand, not easier.

    Furthermore, you are attempting to specify the complexity of the design used to satisfy the requirements in terms of the requirements given. But haven't you ever seen two pieces of code, designed to do the same thing, of vastly different complexity?

    My understanding of the issue is rather different. Mine is shaped by an excellent essay by Peter Naur (the N in BNF) called Programming as Theory Building, which I thought was available online (I read it in Agile Software Development). It isn't, so allow me to summarize it and then return to my understanding.

    Peter's belief is that the programmer, in the act of programming, creates a theory about how the program is to work, and the resulting code is a realization of that theory. Furthermore he submits (and provides good evidence for) that much of the way that other programmers do and do not successfully interact with the code may be understood in terms of how successful they are at grasping the theory of the code and working within it. For instance a programmer without the theory will not be able to predict the correct behaviour. The programmer with the theory will find that obvious, and will also have no trouble producing and interpreting the relevant piece of documentation. The mark of failure, of course, is when the maintenance programmer does not grasp the theory, has no idea how things are to work, and fairly shortly manages to get it to do various new things, yes, but leaves the design as a piece of rubble. Therefore one of the most important software activities has to be the creation and communication of these theories. How it is done in any specific case need not matter; that it is done is critical.

    So a program is a realization of its design, which functions in accord with some theory, and the theory needs to be possessed by developers (and to some extent users) to understand what the program is supposed to do, and how to maintain it. How does this shed light on the problem of a plug-in architecture?

    It is simple. A program with an internal plug-in architecture is a program whose theory embodies a generalization. Adding the generalization takes work, yes. But with the right generalization, many things that your theory must account for become far easier to say. (The wrong generalization, on the other hand...?) If you have enough special cases that are simplified, the generalization pays for itself, and being able to find and then work with such generalizations is key to being able to work efficiently. Just as a self-extracting zip can be shorter than the original document: there is overhead to including the decompression routine, but it saves you so much that you win overall.
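
    To make the generalization concrete, here is a rough Perl sketch of such an internal plug-in architecture (the My::Plugins namespace and the can_handle/handle methods are conventions I am making up for the sketch, not anything from the code under discussion): one dispatch loop stands in for what would otherwise be a chain of hard-coded special cases.

        # Rough sketch of an internal plug-in architecture.  The convention
        # is made up: each My/Plugins/*.pm file declares a matching
        # My::Plugins::* package providing can_handle() and handle().
        package My::App;
        use strict;
        use warnings;
        use File::Basename qw(basename);

        my @plugins;
        for my $file (glob "My/Plugins/*.pm") {
            # load the file directly; skip plugins that fail to compile
            eval { require "./$file"; 1 } or next;
            push @plugins, "My::Plugins::" . basename($file, ".pm");
        }

        sub dispatch {
            my ($input) = @_;
            for my $plugin (@plugins) {
                return $plugin->handle($input) if $plugin->can_handle($input);
            }
            die "no plugin claims '$input'\n";
        }

        1;

    Each new special case then costs one small module that honours the convention, instead of yet another branch in the main program - that is the sense in which the generalization pays for itself.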

    Of course I am describing what can happen, if things turn out right. Generalizations are not always good. To the contrary, when used by overenthusiastic people, they often become conceptual overhead with little return. (You are in a maze of twisty APIs, all alike...)

      Thanks for the great theory about understanding programs - I think I'll start using it in practice today. This is something we all know but never formulate accurately.

      But we still don't have any definition of the human-perceived complexity of software. I see one candidate - the number of axioms in the program's theory. But again, when we add requirements we cannot hope to make the number of axioms smaller.

      Actually, for me the only way to measure the complexity of a design is to take the greatest lower bound of the complexities of the programs complying with that design. Adding requirements to the design can only shrink the set of programs complying with it - thus it can only make the complexity bigger, never smaller, no matter what measure of program complexity we use.
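
      In sketchy notation: write P(D) for the set of programs complying with a design D, and c(p) for whatever measure of program complexity we pick. Then for any extra requirement r

          P(D \cup \{r\}) \subseteq P(D)
          \quad\Longrightarrow\quad
          \min_{p \in P(D \cup \{r\})} c(p) \;\ge\; \min_{p \in P(D)} c(p),

      because a minimum taken over a subset can never be smaller than the minimum over the full set.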

      Update: A quote justifying Kolmogorov complexity from the Abstract of On the intelligibility of the universe and the notions of simplicity, complexity and irreducibility by Gregory Chaitin (found via the link you provided):

      (...) we defend the thesis that comprehension is compression, i.e., explaining many facts using few theoretical assumptions, and that a theory may be viewed as a computer program for calculating observations. This provides motivation for defining the complexity of something to be the size of the simplest theory for it, in other words, the size of the smallest program for calculating it.

      I recommend the whole article (there is some commentary on Wolfram's work too).

        In the same article he lists implicit assumptions that algorithmic information theory (AIT) makes, including the following whopper.
        Also, in AIT we completely ignore the time taken by a computation, concentrating only on the size of the program. And the computation run-times may be monstrously large, quite impracticably so, in fact, totally astronomical in size. But trying to take time into account destroys AIT, an elegant, simple theory of complexity, and one which imparts much intuitive understanding. So I think that it is a mistake to try to take time into account when thinking about this kind of complexity.
        And this is where we part company.

        I fully agree with him that comprehension and compression are very tightly interlinked. I have said similar things before (e.g. at Re (tilly) 3: Maintainable code is the best code -- principal components). But if it ruins the elegance of his theory to take obvious factors like time into account, well, reality is messy. Because obviously, in the real world, time does matter.

        To take a well-known example, consider chess. There is an absolute theory of chess, and it is fairly simple. You have a list of pieces, you know how they move, you have a board, you know its initial set-up. It is not that hard to write a program to do a brute-force search and discover the perfect way to play. Such programs have been written.
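
        For what it's worth, here is a toy Perl sketch of the kind of brute-force search I mean. To keep it actually runnable it plays one-pile Nim (take 1 or 2 stones, taking the last stone wins) rather than chess; the idea is identical, but for chess the run-time of such a search is, of course, astronomical.

            # Exhaustive game-tree search (negamax) for one-pile Nim.
            use strict;
            use warnings;

            sub legal_moves { my ($stones) = @_; return grep { $_ <= $stones } (1, 2) }
            sub apply_move  { my ($stones, $take) = @_; return $stones - $take }

            sub negamax {
                my ($stones) = @_;
                # no stones left: the opponent took the last one, so the
                # player to move has already lost
                return -1 unless $stones;
                my $best;
                for my $take (legal_moves($stones)) {
                    my $score = -negamax(apply_move($stones, $take));
                    $best = $score if !defined $best || $score > $best;
                }
                return $best;    # +1 = win, -1 = loss for the player to move
            }

            sub best_move {
                my ($stones) = @_;
                my ($best_score, $best_move);
                for my $take (legal_moves($stones)) {
                    my $score = -negamax(apply_move($stones, $take));
                    ($best_score, $best_move) = ($score, $take)
                        if !defined $best_score || $score > $best_score;
                }
                return $best_move;
            }

            printf "with %d stones, take %d (score %d)\n", 7, best_move(7), negamax(7);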

        The actual theory of chess is much more complex though. There is far less certainty of the form "this is the perfect move". There are lots of semi-vague concepts like "material advantage", "pawn position" and "light square weakness".

        Why? Because time matters. People can only think through a certain amount. We therefore substitute our ability to work with complex concepts for time we don't have. Computers operate faster and more simply. Good chess computers sit somewhere between human practice and the theoretical simplicity. Like humans, they have to stop their analysis somewhere in real life and therefore rely on heuristics. However, they can analyze farther and find heuristics harder to come by, so they do better with a simpler theory and more brute force.

        As Chaitin notes, including time ruins the simplicity of AIT. It introduces all sorts of hard-to-quantify issues that vary depending on the observer. But I am about to make it worse. I don't believe that programs have any intrinsic attribute that you can really call "simplicity". More precisely, I still agree with my description in the above node:

        And so it is with programming. Well-written source-code is a textual representation of a model that is good for thinking about the problem. It will therefore be fairly efficient in its representation (although the text will be inefficient in ways that reduce the amount of information a human needs to understand the code). Being efficient, functions will convey a similar amount of surprise, and the total surprise per block is likely to be fairly large.

        In short, there will be a good compression in the following sense. A fixed amount of human effort spent by a good programmer in trying to master the code should result in a relatively large portion of the system being understood. This is far from a compact textual representation. For instance, the human eye finds overall shapes easy to follow; therefore it is good to have huge amounts of text spent on allowing instant pattern recognition of the overall code structure. (What portion of your source-code is taken up with spaces whose purpose is to keep a consistent indentation/brace style?)

        In other words, I judge the complexity, maintainability, etc. of code in terms of interacting with it. Because in the end that is what I or someone else has to actually do. Which is horribly subjective. It even varies depending on the observer. (Code using unfamiliar techniques may be very non-simple to you, even though it is short and simple to someone who knows those techniques.)

        Which is about as far away from AIT as you can get. :-)