dshahin has asked for the wisdom of the Perl Monks concerning the following question:

I almost always use hashes (or hash refs) to implement my objects. Auto-vivification issues aside, I really dig the convenience of naming my constructor arguments in a meaningful way. That said, I know there are performance issues regarding hash overhead. How much do I have to worry, and what does stepping down to an array save me?

I mean, in practical terms, do array implementations scale better than hash-based ones? Does passing a reference mitigate the overhead somewhat?

thanks,
dshahin

Re: best structure for classes?
by chromatic (Archbishop) on Apr 19, 2001 at 23:57 UTC
    There are two components to consider. First, hashes take more memory than arrays (assuming they store similar things). Second, it's slower to access an element in a hash.

    On the other hand, it's a lot more convenient to look things up by key in a hash when you're dealing with stuff that lends itself to being named and accessed out of a particular order.

    Having to iterate over an array to find an appropriate element repeatedly can kill the performance benefits.

    Pseudohashes are one solution, but they're not widely used or appreciated, so it's hard to recommend them. A better approach is to use the constant pragma to alias array indexes to names, and only access them through the names. This has the additional benefit or drawback of not allowing autovivification of hash keys within the object.

    Most of what I use are hashes. Speed of data member access isn't a bottleneck for most of my stuff.

    (Oh, and passing around a blessed reference doesn't make any difference in this case. The underlying access mechanisms do.)
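
    The constant-pragma approach described above might look like the following minimal sketch. The class name Person and the fields NAME and AGE are hypothetical, chosen only for illustration. Because the constants are barewords, a misspelled field name fails at compile time under strict instead of silently creating a new slot.

```perl
package Person;
use strict;
use warnings;

# Hypothetical field names; each constant is an index into the object's array.
use constant NAME => 0;
use constant AGE  => 1;

sub new {
    my ($class, %args) = @_;
    my $self = [];
    # Named constructor arguments are still available to the caller;
    # only the internal storage is an array.
    $self->[NAME] = $args{name};
    $self->[AGE]  = $args{age};
    return bless $self, $class;
}

# Accessors touch the array only through the named constants.
sub name { return $_[0][NAME] }
sub age  { return $_[0][AGE] }

package main;

my $p = Person->new(name => 'Alice', age => 30);
print $p->name, ' is ', $p->age, "\n";
```

    Callers keep the readable name => value constructor interface, so switching between hash and array storage later should not affect code outside the class.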

      A better approach is to use the constant pragma

      Try using Benchmark on an array that uses constant to access the indexes. From what I've seen, it lowers it to the performance of a hash (unless we're talking 10000 keys).
      I would stay with hashes as you get all of the other benefits.
Re: best structure for classes?
by Rhandom (Curate) on Apr 19, 2001 at 23:33 UTC
    You really should look at using the Benchmark.pm module. This will give you the relative comparisons on your own system.

    On my system, an array with 20 items was 10-20% faster than a hash with 20 items. Going up to 10000 items made arrays 700% faster in some cases. But if you are using constants to access the indexes of your array (as in use constant), then arrays do you no better.
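
    A comparison along these lines can be run with a short script like the sketch below. The key names, the 20-element size, the LAST constant, and the iteration count are all assumptions for illustration; results will vary by Perl version and machine, so treat the numbers you get as specific to your own system.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

use constant LAST => 19;    # hypothetical named index into the array

# Build a 20-element hash and a 20-element array holding the same values.
my %hash;
$hash{"key$_"} = $_ for 0 .. 19;
my @array = (0 .. 19);

my $sink;    # assignment target so each lookup is actually performed

# Run each lookup a fixed number of times and print a comparison table.
cmpthese(1_000_000, {
    hash           => sub { $sink = $hash{key19} },
    array_literal  => sub { $sink = $array[19] },
    array_constant => sub { $sink = $array[LAST] },
});
```

    Raise the iteration count if Benchmark warns that there were too few iterations for a reliable count.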

    OK, so arrays are faster. However, the ease of asking for a key instead of an index is wonderful. Plus, if you consider that most of the time your methods will take significantly longer to run than the method call itself takes, you will find that it really doesn't matter. If you are concerned with the microseconds that it takes to call the method, then you should consider not using method calls at all, as function calls are 20 faster than methods.
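
    The method-versus-function overhead can be measured the same way. This is only a sketch: the Counter class and its count accessor are hypothetical, and the size of the gap depends on your Perl version and system.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

package Counter;

sub new   { my ($class) = @_; return bless { count => 0 }, $class }
sub count { return $_[0]{count} }

package main;

my $obj = Counter->new;

# Same subroutine, reached two ways: via method dispatch on the object,
# and via a direct, fully qualified function call.
cmpthese(1_000_000, {
    method   => sub { my $n = $obj->count },
    function => sub { my $n = Counter::count($obj) },
});
```

    Note that the direct function call bypasses inheritance, so it is only a fair substitute when the class will never be subclassed.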
Re: best structure for classes?
by satchboost (Scribe) on Apr 20, 2001 at 00:24 UTC
    In Advanced Perl Programming, Sriram Srinivasan has a very interesting idea for object-oriented storage. Instead of focusing on the object as the atom, he focuses on the class, using a series of package-global parallel arrays, each array being a given attribute. The "object" is now a blessed scalar reference containing the index within the parallel arrays.

    The implementation he suggests in his book does NOT work. I worked on it and have an implementation that does, quite nicely. (I have to finish cleaning up the import function, but that's a matter of when I get around to it. I'll post it when I'm done.)

    One thing to note is that different implementations are best for different things. The hash method is best when you have a class that will instantiate few objects, the objects are relatively static (don't get created and destroyed often), and have a relatively large number of attributes. The parallel array method is better for a class that will have a large number of objects, will create and destroy them on a regular basis, and has a small number of attributes. I've never worked with an array-ref type object, so I don't know much about them.

    In case you're wondering, no - I haven't run it through Benchmark. I'll have to do that, soon.
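
    For readers who have not seen the parallel-array idea, here is a minimal sketch of the general technique. It is neither the book's code nor the implementation mentioned above; the Employee class, its two attributes, and the @free list for recycling indexes are assumptions made for illustration.

```perl
package Employee;
use strict;
use warnings;

# One package-global array per attribute; an object is just an index into them.
our (@name, @salary);
our @free;    # indexes released by destroyed objects, available for reuse

sub new {
    my ($class, %args) = @_;
    # Reuse a released slot if there is one, otherwise append a new one.
    my $idx = @free ? pop @free : scalar @name;
    $name[$idx]   = $args{name};
    $salary[$idx] = $args{salary};
    return bless \$idx, $class;    # blessed scalar ref holding the index
}

sub name   { return $name[ ${ $_[0] } ] }
sub salary { return $salary[ ${ $_[0] } ] }

sub DESTROY {
    my ($self) = @_;
    my $idx = $$self;
    # Clear the slot and make its index available to the next object.
    undef $name[$idx];
    undef $salary[$idx];
    push @free, $idx;
}

package main;

my $e = Employee->new(name => 'Alice', salary => 50_000);
print $e->name, ' earns ', $e->salary, "\n";
```

    Recycling freed indexes is one way to keep the arrays from growing when objects are created and destroyed frequently, which is the workload this layout is said to suit.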

Re: best structure for classes?
by suaveant (Parson) on Apr 19, 2001 at 23:31 UTC
    The best way to tell is to make a couple of test objects and run them through Benchmark. That aside, unless you are REALLY looking to tweak every little bit you can out of an object, hashes work fine. I prefer the ease of use and readability of hashes, which so far has always been more important than a small efficiency gain.
                    - Ant
Re: best structure for classes?
by how do i know if the string is regular expression (Initiate) on Apr 19, 2001 at 23:34 UTC
    At work, I updated my code from using hash based to array based objects after going to a talk by Damian Conway. It improved the speed of my program by about 33%. I highly recommend it.

    - FrankG