The way this post reads to me is that MI is expensive in dynamic languages, and traits are one of many attempts to address that – that if MI were cheap, we wouldn’t need traits.
But what about the complexity issue? Programming has always evolved in the direction of greater abstraction; the complexity of software systems we build today is orders of magnitude greater than that of the artifacts created by any other engineering discipline. (And unlike other disciplines, you can do repetitive work once only, and then factor it away – so the complexity keeps growing. This is why software will never be industrialised like other engineering disciplines (not that they are industrialised even remotely to the extent the software types always seem to think they are, and anyway, I digress).) Even the simplicity of throwaway scripts is deceptive: you have an OS beneath, and they run inside an interpreter which takes care of memory management and many other menial tasks; and neither the OS (to a large extent) nor the interpreter are written in assembly, so you also need a compiler. The amount of work that has gone into making Perl oneliners simple is quite imposing.
Anyway, I’m rambling. The point is that complexity management is by far the most important aspect of designing programming systems (i.e. meta-programming), and to me it seems like your post does not go into this at all. You admit that MI becomes unworkable for the programmer in large hierarchies; I believe that’s a much more salient point than its performance.
Makeshifts last the longest.
The way this post reads to me is that MI is expensive in dynamic languages, and traits are one of many attempts to address that – that if MI were cheap, we wouldn’t need traits.
Then I failed dismally in my attempt to convey what I was trying to say :(
Yes, MI becomes rapidly unworkable in extended hierarchies.
Yes, Traits and its kin are an attempt to reduce that complexity.
Yes, MI is expensive in dynamic languages.
But definitely NO to "if MI were cheap, we wouldn’t need traits".
I did say (twice) that I am convinced that the basic issue that they are trying to address needs tackling.
What I hoped to point out is that there are many subtly different descriptions of solutions being proffered currently, but that they are concentrating on their differences--which are minutiae relative to the problems of performance and footprint that they bring with them.
In compiled languages, where most of the groundwork for Traits and the others is being done, the complexities of method resolution and search order are compile time only costs. By runtime, the method call has been resolved either to a vtable entry, or to the actual entrypoint address.
Perl (and all dynamic languages) already has a performance problem with method lookup. One of the major benefits hoped to come from the Parrot project is a reduction in the overheads associated with the mechanics of subroutine invocation--stack setup, scope maintenance, closure propagation, etc. If that effort succeeds, then it could reduce the costs of calling a sub to the point where static (compile-time) MI would be tolerable from a performance point of view, though the inherent brain-melting problem of MI would persist.
It would also make Traits (and similar solutions) a practical solution to that MI complexity--but only if the method resolution can be fully resolved at compile time.
The fear I have is that Perl 6, and other dynamic languages that are trying to adopt trait-like behaviours, are also adding
- Pre & post method entrypoints.
- Dynamic inheritance resolution.
- Dynamic vtables at the class (and maybe instance) level.
- Full introspection.
- Runtime macro capabilities.
Each of these things adds runtime cost to the process of invoking a method.
The code reads:
$obj->method();
The interpreter has to do
- Is method a macro? If so, expand it.
- What is the class of $obj?
That's at least two levels of indirection. One to look up the class name. One to look up the hash storing the method names associated with that classname.
- Can classX do method?
If not, then look up the list of classes/traits/roles that it might inherit this method from.
Also look up classX's search pattern (depth/breadth/etc.).
For each superthing, look up the address of its vtable and check if it can do the method.
If not, then look up the list of its superthings and their search patterns, and vtables, and see if they can supply the method.
Keep going until you find it, or exhaust all possibilities and raise a non-existent method error.
What happens if you find two (or more) superthingies that could provide that method? Now you have to go through a conflict arbitration process.
- Assuming that after all that, we isolated a resolution for the method, we now have to go through lookups for PRE() & POST() methods, and about half a dozen other SPECIAL subs that can be associated with a method or the class it is a part of.
And that lot, even with the inherent loops I've tried to indicate, is far from a complete picture of the processes involved if all the features mooted for P6 come to fruition. All of that was just to find the method to invoke.
Now you have to sort out all the actual-to-formal parameter mappings, with all the
- slurpy/non-slurpy.
- named/un-named.
- required/optional.
- read-only/read-write.
- by-reference/by value.
- defaulted/no-default.
- type constrained.
- range constrained.
- does constrained.
- is constrained.
- will constrained.
possibilities and combinations thereof.
And many of those will themselves require class hierarchy and method resolutions.
Yes, I agree that there is a complexity problem with MI that must be addressed, but I also see huge performance problems arising out of the solutions being proposed, compounded by all the other performance-sapping runtime costs being added through the desire for even greater introspection and dynamism.
Combined, these mean that the single biggest issue I have with the current crop of dynamic language implementations--performance, at which Perl is currently the best of the bunch--is going to get worse in the next generation, not better.
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
In addition to what stvn said, I'd just like to add that the design of the type system and particular the signature system depends on certain guiding principles. One of these
stvn alluded to, which is that it doesn't cost much to wedge in extra options at a point where you're already obligated to make an n-way choice, such as at the dispatcher selection boundary, or at the wrapper/wrappee boundary in &foo.
One principle he kinda glosses over is that we tend to build features into the signature/type system when it replaces code you'd have to write yourself, and probably do a poorer job at. Signature types are not there primarily so that your routine can wring its hands over the shoddy data it is receiving. You can use them for that, of course, but the signature types are there mostly so that the MMD dispatcher can decide whether to call your routine in the first place. That's just an extension of the idea that you shouldn't generally check the type of your invocant because the dispatcher wouldn't call you in the first place unless the type were consistent. By putting the type declaratively into the signature, we can give the information to the MMD dispatcher without committing to a type check where the computer can figure out that it would be redundant.
And the whole MMD dispatch system is there to replace the nested switch statements or cascaded dispatches you'd have to do to solve the problem if you rolled your own solution.
And then it would still be subtly wrong in those cases where you're tricked into imposing a hierarchy on invocants that should be treated equally. The whole Perl 5 overloading scheme is a case study in that sort of error...
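To make concrete what the MMD dispatcher is replacing, here is a toy multiple-dispatch registry sketched in Python. The names `multi` and `dispatch` are invented for this illustration; this is the hand-rolled nested-type-switch idea, not Perl 6's actual dispatcher.

```python
# Hypothetical sketch of multiple dispatch: a registry of (types, function)
# pairs per name, searched in registration order. This is the kind of
# cascaded isinstance() switching that a built-in MMD system replaces.
_registry = {}

def multi(name, *types):
    """Register an implementation of `name` for the given argument types."""
    def deco(fn):
        _registry.setdefault(name, []).append((types, fn))
        return fn
    return deco

def dispatch(name, *args):
    # Pick the first registered variant whose declared types match the
    # actual arguments -- the nested switch you'd otherwise write by hand.
    for types, fn in _registry.get(name, []):
        if len(types) == len(args) and all(
                isinstance(a, t) for a, t in zip(args, types)):
            return fn(*args)
    raise TypeError(f"no applicable method for {name}")

@multi("add", int, int)
def _(a, b): return a + b

@multi("add", str, str)
def _(a, b): return a + " " + b
```

Declaring the types in the signature lets the dispatcher make this choice once, instead of every routine re-checking its own arguments.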
Likewise the rest of the signature binding power is provided to
declaratively replace all the boilerplate procedural code that people have to write in Perl 5 to unpack @_.
Even if the declarations just generate the same boilerplate
code and we get no performance boost, we've at least saved the user from having to watch the faulty rivets pop out of their boilerplate.
Not to mention having to stare at all that boilerplate in the first place...
Anyway, those are some of the principles that have guided us. We may have certainly screwed them up in spots, and doubtless we'll find some bottlenecks in the design that we didn't anticipate because we're just too stupid. But as you may have noticed, we're trying really hard to design a language where we can compensate for our own stupidities as we figure them out over the long term. If there's anything that can't be fixed in the design, that's probably it.
In compiled languages, where most of the groundwork for Traits and the others is being done, the complexities of method resolution and search order are compile time only costs. By runtime, the method call has been resolved either to a vtable entry, or to the actual entrypoint address.
You are mistaken here on two points, actually. First, most of the work on traits is being done using Smalltalk, which is in fact a compiled language, but also a very dynamic one. IIRC it does not use any type of vtable or compile-time method resolution, but treats all object message sends as dynamic operations. Second, not all compiled OO languages perform method resolution at compile time; that is a (C++/Java)ism really, and does not apply universally.
I also want to point out that Class::Trait does all of its work at compile time. This means that there is no method lookup penalty for using traits. In fact, since the alternative to traits is usually some kind of MI, traits are actually faster: the methods of a trait are aliased in the symbol table of the consuming class, so no method lookup (beyond the local symbol table lookup) needs to be performed.
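For illustration, here is roughly what that flattening looks like, sketched in Python. The `apply_trait` helper and the trait names are invented for this example; Class::Trait does the equivalent by aliasing into the Perl symbol table at compile time.

```python
# Illustrative sketch of trait flattening: trait methods are copied
# (aliased) into the consuming class at composition time, so dispatch
# afterwards is ordinary single-class method lookup -- no extra hop.
def apply_trait(cls, trait_methods):
    for name, fn in trait_methods.items():
        if name in vars(cls):
            continue  # the class's own method wins; no runtime arbitration
        setattr(cls, name, fn)  # alias into the class's "symbol table"
    return cls

# A trait is just a bag of methods with no place in the inheritance tree.
TComparable = {
    "greater_than": lambda self, other: self.compare(other) > 0,
}

class Number:
    def __init__(self, n): self.n = n
    def compare(self, other): return self.n - other.n

apply_trait(Number, TComparable)
```

After composition, `greater_than` lives directly in `Number`, exactly as if it had been written there by hand.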
... snipping a bunch of stuff about Parrot and method performance ...
It would also make Traits (and similar solutions) a practical solution to that MI complexity--but only if the method resolution can be fully resolved at compile-time
Well, again, Traits (the core concept, not just Class::Trait) do not really have any method lookup penalty. The whole idea is that you don't have another level of inheritance, so you don't have all the pain and suffering which goes along with it. I suggest you read the other papers on traits, not just the canonical one Ovid linked to; they provide a much more detailed explanation of the topic.
The fear I have is that Perl 6, and other dynamic languages that are trying to adopt trait-like behaviours, are also adding
- Pre & post method entrypoints.
- Dynamic inheritance resolution.
- Dynamic vtables at the class (and maybe instance) level.
- Full introspection.
- Runtime macro capabilities.
Each of these things adds runtime cost to the process of invoking a method.
What you have just described is essentially CLOS (the Common LISP Object System), which is not slow at all. In fact, in some cases CLOS is comparable to C++ in speed, and I would not doubt that in many cases it is faster than Java (and let's not even talk about programmer productivity or compare LOC, because CLOS will win hands down). Java/C++ both suck rocks at this kind of stuff for one reason, and one reason alone: they were not designed to work this way. If you want these types of features in your object system, you need to plan for them from the very start, otherwise you end up with... well, Java.
It is also important to note that a dynamic language can be compiled, and that the concepts are not mutually exclusive. LISP has been proving this fact for over 40 years now.
The code reads:
$obj->method();
The interpreter has to do
- Is method a macro? If so, expand it.
- What is the class of $obj?
That's at least two levels of indirection. One to look up the class name. One to look up the hash storing the method names associated with that classname.
To start with, macros are expanded at compile time, in pretty much all languages I know of. Sure it might be the second phase of a compile, but it is still before runtime.
Next, an $obj should hold its class information directly in its instance. In Perl 5 it is attached to the reference with bless; other languages do it their own way, but in general, an "instance type" will have a direct relation to the class from whence it came. So my point is that while it might be a level of indirection, it is very slight, and certainly should not involve any serious amount of "lookup" to find it.
As for the method name lookup, you are correct, but some kind of lookup like this happens for just about every sub call too (unless of course you inline all your subroutines, which would be just plain silly). We suffer a namespace lookup penalty because it allows us to use namespaces, which have been essential to well-structured programming for 20+ years now. Basically, what I am getting at is that you should not add this to your list of "why OO is slow", since it is not really OO that brings this to the table; it is namespaces as a whole.
Can classX do method?
If not, then look up the list of classes/traits/roles that it might inherit this method from.
Whoops... you are assuming traits/roles are inherited again. They are not; they are flattened, so they have no method lookup penalty.
Also look up classX's search pattern (depth/breadth/etc.).
Ouch! This is a bad bad bad idea, it would surely mean the end of all life as we know it ;)
But seriously, take a look at C3; it (IMO) alleviates the need for this type of "feature".
For each superthing, look up the address of its vtable and check if it can do the method.
There you go with those vtable things again; that's just plain yucky talk. Seriously, method dispatching can be as simple as this:
sub dispatch {
    my ($obj, $method_name, @args) = @_;
    my $class = $obj->class;
    foreach my $candidate ($class->get_MRO()) {
        return $candidate->get_method($method_name)->($obj, @args)
            if $candidate->has_method($method_name);
    }
    die "Could not find '$method_name' for $obj";
}
Even with Traits/Roles, it really can be that simple (remember, they flatten, not inherit). Sure, you can add more "stuff" onto your object model which complicates the dispatching, but still the core of it doesn't need to be much more than what you see above.
... snip a bunch of other stuff ...
What happens if you find two (or more) superthingies that could provide that method? Now you have to go through a conflict arbitration process.
This is a non-issue; let me explain why. To start with, in pure OO (no traits/roles), there is no conflict arbitration: if you find it in the current class, that is it, done, end of story. If you add traits/roles it actually doesn't change anything, since they are "flattened". By the time the dispatcher gets to them, they are just like any other methods in the class, so normal OO dispatch applies.
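A toy Python sketch of why conflicts are a composition-time issue rather than a dispatch-time one. The `compose_traits` helper and the trait names are invented; real trait systems additionally let the consuming class resolve a conflict explicitly by aliasing or excluding one of the methods.

```python
# Sketch: flattening two traits that supply the same method fails when
# the class is composed, not when a method is dispatched. By dispatch
# time there is only one flat method table, so no arbitration is needed.
def compose_traits(*traits):
    flattened = {}
    for trait in traits:
        for name, fn in trait.items():
            if name in flattened and flattened[name] is not fn:
                raise TypeError(f"trait conflict on '{name}': "
                                "the class must resolve it explicitly")
            flattened[name] = fn
    return flattened

TWalker  = {"move": lambda self: "walk"}
TSwimmer = {"move": lambda self: "swim"}
```

So the cost of conflict detection is paid once, when the class is built.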
Assuming that after all that, we isolated a resolution for the method, we now have to go through lookups for PRE() & POST() methods, and about half a dozen other SPECIAL subs that can be associated with a method or the class it is a part of.
A good implementation of this would have combined the PRE, POST and SPECIAL subs together with the method already (probably at compile time)--I know this is how CLOS works. The cost you speak of is really an implementation detail, and (if properly implemented) is directly proportional to the gain you get by using this feature. Always remember that nothing is free, but some things are well worth their price.
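A minimal sketch of that combination idea in Python (the `combine` helper is invented for illustration; CLOS's actual method combination is far richer, with `:before`, `:after` and `:around` methods):

```python
# Sketch: fold the before/after hooks into the method once, at definition
# time, so each call pays one plain function invocation instead of
# repeated PRE/POST lookups at every dispatch.
def combine(pre, primary, post):
    def combined(*args, **kwargs):
        pre(*args, **kwargs)               # the "before" hook
        result = primary(*args, **kwargs)  # the primary method
        post(*args, **kwargs)              # the "after" hook
        return result
    return combined

calls = []
method = combine(
    lambda x: calls.append(("pre", x)),
    lambda x: x * 2,
    lambda x: calls.append(("post", x)),
)
```

Only classes that actually declare hooks pay for the wrapper; everyone else keeps the bare method.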
And that lot, even with the inherent loops I've tried to indicate, is far from a complete picture of the processes involved if all the features mooted for P6 come to fruition. All of that was just to find the method to invoke.
Now you have to sort out all the actual-to-formal parameter mappings, with all the
slurpy/non-slurpy.
named/un-named.
required/optional.
read-only/read-write.
by-reference/by value.
defaulted/no-default.
type constrained.
range constrained.
does constrained.
is constrained.
will constrained.
possibilities and combinations thereof.
And many of those will themselves require class hierarchy and method resolutions.
A good number of these can and will be resolved at compile time by the type inferencer (remember, Perl 6 will be a compiled dynamic language, just as Perl 5 is today). And of course a properly written implementation means that you will only pay for the features you actually use, so things like type constraints (subtyping) will not affect you unless you actually use them (and again, this will likely be something done at compile time anyway).
Keep in mind that many of the features you describe here, which you insist will slow things down, are features found in a number of functional languages, many of which are really not that slow (and are comparable to C speed in some cases). Compiler technology and type checking have come a long way since the days of Turbo Pascal, and it is now possible to compile very high-level and dynamic code in, say, Standard ML or Haskell to very, very tight native code. My point: it is not just hardware technology which is advancing.
Yes, I agree that there is a complexity problem with MI that must be addressed, but I also see huge performance problems arising out of the solutions being proposed, compounded by all the other performance-sapping runtime costs being added through the desire for even greater introspection and dynamism.
Well, I think you are mistaken about these "performance problems" in many cases, but even so, if traits make for a cleaner, more maintainable class hierarchy, that is a "performance problem" I can live with. Remember, for many programs, your greatest bottleneck will be I/O (database, file, network, whatever). IMO, only if you are writing performance-critical things like video games or nuclear missile guidance systems do you really need to care about these "performance problems", and if that is what you are writing, then why the f*** are you writing it in Perl ;)
Combined, these mean that the single biggest issue I have with the current crop of dynamic language implementations--performance, at which Perl is currently the best of the bunch--is going to get worse in the next generation, not better.
To start with, Perl is not the fastest, nor is it the most dynamic. If you want dynamic, let's talk LISP, which not only has what Perl has, but much of what Perl 6 will have and then some (it certainly has all the features you have described above). LISP is not slow; in fact it is very fast. Why? Well, because it is compiled correctly. If we continue to use old and outdated compiler theory/technology, then all the cool new whiz-bang stuff we want to add onto our language will just slow it down. On the other hand, if we bring our compiler theory/technology up to date with our language design/theory, then it is likely we won't suffer those penalties.
Remember, just because Java/C++/C#/etc. can't do it right, doesn't mean it can't be done.
I only want to address a few of your points here where I think you actually are mistaken in your assumptions.
3. Method resolution gets very messy, very quickly.
Should the inheritance tree be searched breadth-first, or depth-first, or depth-within-breadth, or breadth-within-depth, or depth-within-breadth-within-depth, etc.?
... snipping a bunch of other stuff ....
You should take a look at the C3 method resolution order (see the SEE ALSO section of Class::C3 for some links). It preserves the local precedence ordering of classes, so that method lookup becomes much more consistent and predictable from any given point in the class hierarchy, as opposed to other (more common) method resolution orders, which can change depending upon where you are looking at them from. In short, it makes MI method resolution order Just Work.
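Incidentally, Python has used C3 linearization for its MRO since version 2.3, so the local-precedence-ordering property is easy to demonstrate there with a classic diamond hierarchy:

```python
# A diamond hierarchy under C3 linearization (Python's built-in MRO).
class A:
    def who(self): return "A"
class B(A):
    def who(self): return "B"
class C(A):
    def who(self): return "C"
class D(B, C):
    pass

# C3 gives D -> B -> C -> A: both B and C precede their common parent A,
# and B precedes C because D listed B first in its superclass list.
linearization = [k.__name__ for k in D.__mro__]
```

A naive depth-first search would reach A before C; C3 never lets a parent jump ahead of a child, which is what makes the order predictable from any point in the hierarchy.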
I think your (sort of) proposal for classes to control their own dispatching is a really bad idea for all the reasons you pointed out. IMO it would make it almost impossible for a programmer to understand the path his method resolution would take.
4. If any level of dynamism in the class hierarchy is allowed--introspection, never mind runtime modification--then the overheads are huge.
The cost of maintaining the data required for runtime introspection of a wide and deep MI tree are daunting enough.
... snipping a bunch of other stuff ....
You are making assumptions here about how the introspection mechanism is being implemented. IMO there is absolutely no need to maintain any separate data for runtime introspection. Sure, that is how C++ and Java do it, but they were never meant to be "dynamic" languages, and any "dynamic" features they have are cobbled on (very poorly too, IMO of course).
You also mention vtables a lot in your discussions, but vtables are not the only way to implement objects/classes. In fact they are a very poor way to do it (again IMO). If you have a cached method resolution order (MRO), especially one like C3 provides, then method lookup is not nearly as painful as Perl 5's depth-first-left-to-right mess. And since your MRO will only change if your class hierarchy changes, the cost of cache maintenance can be managed quite easily.
In fact, the Perl6-MetaModel prototype I created did just this. The MRO would be cleared out and re-generated if and only if you changed a class's superclass list, which meant that adding methods to a class dynamically (a far more likely usage) would have no penalty. I also memoized the method lookup (since we have a predictable MRO and we can pretty much assume that method lookup is a side-effect-free operation), which is really the only other place we needed to deal with cache issues (although I say that with some hesitation since I am sure there is something I am overlooking). This approach is IMO a far better approach than vtables.
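A rough Python sketch of that caching scheme (the `MetaModel` class and its method names are invented for illustration; the real Perl6-MetaModel differs in detail, e.g. it uses C3 rather than the depth-first walk shown here):

```python
# Sketch: memoize method lookups per (class, method name), flushing the
# whole cache only when a superclass list changes, and only the affected
# name's entries when a method is added dynamically.
class MetaModel:
    def __init__(self):
        self.supers = {}    # class name -> list of superclass names
        self.methods = {}   # class name -> {method name: callable}
        self._cache = {}    # (class, method) -> resolved callable

    def set_supers(self, cls, supers):
        self.supers[cls] = supers
        self._cache.clear()          # hierarchy changed: invalidate all

    def add_method(self, cls, name, fn):
        self.methods.setdefault(cls, {})[name] = fn
        # adding a method only invalidates that name's cached entries
        self._cache = {k: v for k, v in self._cache.items() if k[1] != name}

    def lookup(self, cls, name):
        key = (cls, name)
        if key not in self._cache:
            for c in self._mro(cls):
                if name in self.methods.get(c, {}):
                    self._cache[key] = self.methods[c][name]
                    break
            else:
                raise AttributeError(f"{cls} has no method '{name}'")
        return self._cache[key]

    def _mro(self, cls):
        # depth-first walk for brevity; a real model would use C3
        seen, order = set(), []
        def walk(c):
            if c in seen: return
            seen.add(c); order.append(c)
            for s in self.supers.get(c, []): walk(s)
        walk(cls)
        return order
```

After the first lookup, every subsequent dispatch of the same method on the same class is a single dictionary hit, regardless of hierarchy depth.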