I also ran some benchmarks on this code. The code as posted took 24+ seconds overall. Executing just the loop "mechanics" of the foreach within scenario() takes about 4.6 seconds, and calculating the modulus statement takes about the same. The memory allocation stuff in the "inlining section" takes about 3x as long as either of these segments, i.e. roughly 1/5 loop mechanics, 1/5 modulo calculations, and 3/5 saving the results of the "inlining section".
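For anyone who wants to reproduce this kind of breakdown, here is a minimal sketch of timing each segment in isolation with Time::HiRes. The loop sizes, the modulus expression, the FROM/TO keys and the timed() helper are all made-up stand-ins for illustration, not the code from the original post.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

# Hypothetical, scaled-down sizes; not the numbers from the original program.
my ($outer, $inner) = (2_000, 50);

# Run a code ref and return (elapsed wall-clock seconds, its return value).
sub timed {
    my ($code) = @_;
    my $start  = time;
    my $result = $code->();
    return (time - $start, $result);
}

# Segment 1: loop mechanics only.
my ($t_loop, $n_loop) = timed(sub {
    my $count = 0;
    for my $i (1 .. $outer) {
        for my $j (1 .. $inner) { $count++ }
    }
    return $count;
});

# Segment 2: the same loops plus a modulus calculation (invented expression).
my ($t_mod, $sum_mod) = timed(sub {
    my $sum = 0;
    for my $i (1 .. $outer) {
        for my $j (1 .. $inner) { $sum += ($i + $j) % 7 }
    }
    return $sum;
});

# Segment 3: the same loops plus one anon array holding one anon hash per pass.
my ($t_alloc, $n_alloc) = timed(sub {
    my $count = 0;
    for my $i (1 .. $outer) {
        for my $j (1 .. $inner) {
            my $range = [ { FROM => $i, TO => $j } ];
            $count += @$range;    # number of elements in the anon array
        }
    }
    return $count;
});

printf "%-16s %.4f s\n", 'loop mechanics', $t_loop;
printf "%-16s %.4f s\n", 'loop + modulus', $t_mod;
printf "%-16s %.4f s\n", 'loop + alloc',   $t_alloc;
```

Subtracting the first figure from the other two gives a rough cost for the modulus and allocation work on their own. Wall-clock timings like these are noisy, so treat them as ballpark numbers.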

1. The time to create the hash to begin with is very fast (negligible). But I don't see anywhere that the random access nature of a hash is actually used. Iterating through an array will be much faster than iterating over all the keys of a large hash. Perl allows "ragged 2D" array structures (AoA, Array of Arrays), and something like that might be more appropriate if you are interested in speed.

2. Using and iterating over some kind of array-based structure would be considerably faster than iterating over all the keys of a hash.

3. I was surprised to find how much time the "$gene_to" calculation took. I'm not sure why that is or what could easily be done about it.

4. The "$temp_gene_to_legal_range" calculation is a bit "non-real world" because it allocates an anonymous array containing one or two anonymous hashes (which also have to be dynamically allocated and created), but then the $temp_gene_to_legal_range array reference is "thrown away". Perl's reference counting will reclaim that memory for reuse once nothing refers to it, but you will never be able to reference the result again, and the cost of building and tearing down the structures has already been paid. So this part of the code is "expensive" due to all the memory structures that are being dynamically created. Krambambuli's code is faster because it eliminates the anon hash allocations.

5. So with 70,000 x 50 x ((1 array allocation) + (1 to 2 hash allocations, guess 1.5 avg?)), sub scenario() is going to take a while, and it does! Anyway, it appears that saving this calculation is so expensive that you'd be better off calculating it when you need it and using it right then. Right now it's not even saved, so I have no idea what the eventual plan is for making use of it.

6. I don't know enough about your problem to know exactly what to recommend in the way of alternative data structures, but my benchmarking indicates that these 70,000 x 50 x ~2.5 new memory structure allocations are what is taking the time. If I have my decimal point right, that is close to 9 million (about 8.75 million)! A more sophisticated 2-D hash structure, or an Array of Hashes or Array of Arrays of Arrays, may serve the purpose better. It may be faster to extend some existing structure than to create many millions of new little ones.
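To make points 1 through 6 concrete, here is a hedged sketch using the core Benchmark module. It is not the original program and not Krambambuli's version: the sub names, the FROM/TO keys, the arithmetic and the scaled-down sizes are all invented for illustration. It compares building and discarding an anon array of anon hashes on every pass against doing the same arithmetic with plain scalars, and iterating over hash keys against iterating over an array.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Allocation estimate from the post: 1 array + ~1.5 hashes per inner pass.
my $allocs = 70_000 * 50 * (1 + 1.5);
print "estimated allocations: $allocs\n";

# Scaled-down, hypothetical sizes (the post talks about 70,000 x 50).
my ($genes, $steps) = (1_000, 50);

# Builds and then discards an anon array holding an anon hash on every pass.
sub with_alloc {
    my $total = 0;
    for my $g (1 .. $genes) {
        for my $s (1 .. $steps) {
            my $range = [ { FROM => $g, TO => $g + $s } ];
            $total += $range->[0]{TO} - $range->[0]{FROM};
        }
    }
    return $total;
}

# Same arithmetic with plain scalars: no anon array or hash is created.
sub without_alloc {
    my $total = 0;
    for my $g (1 .. $genes) {
        for my $s (1 .. $steps) {
            my ($from, $to) = ($g, $g + $s);
            $total += $to - $from;
        }
    }
    return $total;
}

# The same values stored two ways: keyed in a hash and in a plain array.
my %by_key = map { $_ => $_ } 1 .. $genes;
my @by_idx = (1 .. $genes);

# Visit every value via the hash keys.
sub sum_hash {
    my $t = 0;
    $t += $by_key{$_} for keys %by_key;
    return $t;
}

# Visit every value by walking the array.
sub sum_array {
    my $t = 0;
    $t += $_ for @by_idx;
    return $t;
}

# Fixed iteration counts keep the run short; raise them for steadier numbers.
cmpthese(50,   { with_alloc => \&with_alloc, without_alloc => \&without_alloc });
cmpthese(2000, { sum_hash   => \&sum_hash,   sum_array     => \&sum_array });
```

The absolute rates depend on your Perl build and machine, so run it yourself rather than trusting anyone's numbers. The point is only that each pair of subs computes the same result, so any difference in rate comes from the data structure work.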

I don't know much about Java and can't speak to whether the comparison with your Java code is "apples to apples" or "apples to oranges". I don't know if your Java code calls new memory allocation 9 million times, but your Perl code does.


In reply to Re: Benchmarking Simple Perl program vs. Java equivalent by Marshall
in thread Benchmarking Simple Perl program vs. Java equivalent by roibrodo
