in reply to Benchmarking Simple Perl program vs. Java equivalent

I also ran some benchmarks on this code. The overall run as posted took a bit over 24 seconds. Executing just the loop "mechanics" of the foreach within scenario() takes about 4.6 seconds. Calculating the modulus statement takes about the same. The memory allocation work in the "inlining section" takes about 3x as long as either of these segments. So roughly: 1/5 loop mechanics, 1/5 modulo calculations, 3/5 saving the results of the "inlining section".
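For anyone who wants to reproduce this kind of breakdown, here is a minimal sketch of timing the three segments separately. It is not the original code: the loop sizes, the `% 7` modulus, and the `from`/`to` hash keys are all made up for illustration, and the sizes are far smaller than the real 70,000 x 50.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

# Hypothetical sizes, scaled down from the 70,000 x 50 in the post.
my ($outer, $inner) = (2_000, 50);

# Run a code reference and return the elapsed wall-clock seconds.
sub elapsed {
    my ($code) = @_;
    my $start = time();
    $code->();
    return time() - $start;
}

my $sink = 0;    # accumulate something so each loop body does real work

# Segment 1: loop mechanics only.
my $t_loop = elapsed(sub {
    for my $i (1 .. $outer) {
        for my $j (1 .. $inner) { $sink++ }
    }
});

# Segment 2: same loops plus a modulus (the "% 7" is an assumption).
my $t_mod = elapsed(sub {
    for my $i (1 .. $outer) {
        for my $j (1 .. $inner) { $sink += ($i + $j) % 7 }
    }
});

# Segment 3: same loops plus one anonymous array holding one anonymous hash.
my $t_alloc = elapsed(sub {
    for my $i (1 .. $outer) {
        for my $j (1 .. $inner) {
            my $range = [ { from => $i, to => $j } ];
            $sink += @$range;
        }
    }
});

# Segments 2 and 3 include the loop cost, so subtract it to isolate the extra work.
printf "%-20s %.4f s\n", 'loop only:',        $t_loop;
printf "%-20s %.4f s\n", 'modulus (extra):',  $t_mod - $t_loop;
printf "%-20s %.4f s\n", 'allocation (extra):', $t_alloc - $t_loop;
```

The absolute numbers will differ by machine and Perl version; what matters is the ratio between the three lines.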

1. The time to create the hash to begin with is very fast (negligible). But I never see anywhere that the random-access nature of a hash is used. Iterating through an array will be much faster than iterating over all the keys of a large hash. Perl allows "ragged 2D" array structures (an AoA, Array of Arrays), and something like that might be more appropriate if you are interested in speed.

2. Using and iterating over some kind of array-based structure would be considerably faster than iterating over all the keys of a hash.

3. I was surprised to find how much time the "$gene_to" calculation took. I'm not sure why that is or what could easily be done about it.

4. The "$temp_gene_to_legal_range" calculation is a bit "non-real world" because it allocates an anonymous array containing one or two anonymous hashes (which also have to be dynamically allocated and created), but then the $temp_gene_to_legal_range array reference is "thrown away". Once the last reference is gone, Perl's reference counting frees the structure again, so all of that creation and teardown is done for a result that can never be used. This part of the code is "expensive" because of all the memory structures that are being dynamically created. Krambambuli's code is faster because it eliminates the anon hash allocations.

5. So with 70,000 x 50 x ((1 array allocation) + (1 to 2 hash allocations, guess 1.5 avg?)), sub scenario() is going to take a while, and it does! Anyway, it appears that saving this calculation is so expensive that you'd be better off calculating it when you need it and using it right then. Right now it's not even saved, so I have no idea what the eventual plan is for making use of it.

6. I don't know enough about your problem to know exactly what to recommend in the way of alternative data structures, but my benchmarking indicates that these 70,000 x 50 x ~2.5 new memory structure allocations are what is taking the time. If I have my decimal point right, that is close to 9 million! A more sophisticated 2-D hash structure, an Array of Hashes, or an Array of Arrays of Arrays may serve the purpose better. It may be faster to extend some existing structure than to create many millions of new little ones.
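To make points 1, 2 and 4 through 6 concrete, here is a rough sketch using the core Benchmark module. Everything about the data is an assumption on my part (the `@genes`/`%genes` names, the `from`/`to` keys, the scaled-down size), so treat it as a way to see the trend on your own machine rather than as a model of the original program.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Allocation count from the post: 70,000 x 50 iterations, each creating one
# anonymous array plus an estimated 1.5 anonymous hashes.
my $allocations = 70_000 * 50 * (1 + 1.5);
print "estimated allocations: $allocations\n";    # 8750000, i.e. close to 9 million

# Hypothetical data, scaled down so the sketch finishes in a few seconds.
my $n     = 20_000;
my @genes = (1 .. $n);
my %genes = map { $_ => $_ } @genes;

# Points 1 and 2: walking an array versus walking all the keys of a hash.
# A negative count tells Benchmark to run each case for at least that many CPU seconds.
cmpthese(-1, {
    array_iter => sub { my $sum = 0; $sum += $_         for @genes;      $sum },
    hash_iter  => sub { my $sum = 0; $sum += $genes{$_} for keys %genes; $sum },
});

# Points 4 to 6: a fresh anonymous array of anonymous hashes on every step,
# versus extending one existing flat array with plain numbers.
cmpthese(-1, {
    nested_anon => sub {
        for my $i (1 .. $n) {
            my $range = [ { from => $i, to => $i + 1 } ];    # discarded each time
        }
    },
    flat_extend => sub {
        my @ranges;
        for my $i (1 .. $n) {
            push @ranges, $i, $i + 1;    # two scalars, no new containers
        }
    },
});
```

cmpthese prints a rate table for each pair. The flat version stores the same two numbers per step; whether that layout fits the real problem depends on how the ranges are used later.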

I don't know much about Java and can't speak to whether your two programs are an "apples to apples" or an "apples to oranges" comparison. I don't know if your Java code requests new memory 9 million times, but your Perl code does.

Re^2: Benchmarking Simple Perl program vs. Java equivalent
by roibrodo (Sexton) on Jun 20, 2010 at 20:22 UTC
    Thank you very much for the wise words. I will try to follow your advice.