>> 76% of the pixels have unique colors! This makes your hashing algorithm rehash everything when it lands on a dup.

>Sorry, but if you mean "rehash every preexisting key", you are wrong. Why do you believe that?

As I recall, when a hash algorithm is selected, there is a tradeoff between performance and probability of uniqueness. Some fraction of the data up to and including the entire key may be used in the hash key.

If all of the distinct keys tried actually give non-colliding hash values, then the tradeoff worked. Otherwise, another algorithm must be selected which uses either more of the key or a different algorithm.

That triggers a total recalc. No?

Not even the Perl Gurus can know in advance that your data would have 8000 distinct colors and that mine would have 27 million.

And, hashing 48 bit quantomly random data has to be much more than 2.0 times as hard on a hashing algorithm as 24 bit data. There are 3300 times more buckets to keep track of.

Working in 16E6 color space does not in any way seem like it should be half as hard as 281474976710656 color space.

What I was seeing in my ill-fated C attempt was dramatically longer run times with modest increases in data volume from the ever expanding looup table. That is why I was looking for a hashing formula!

>> If you've already trimmed your OP time of "2.17 hours" to 48 seconds, why have you wasted our time

When I asked the question, the run-time was hours. Way beyond my nano-scale attention span. While The Monks were busy writing up many questions, I was beaverishly instrumenting my code to find out where all the time was being squandered. Qsort was taking 98% of the time!

Replacing <dumb old> QSort with a brilliantly conceived "Shuffle Merge" (TM :) and increasing my MAX_UNSORTED value from 200 (D'oh!) to a more workable 7920, I was able to realize the astonishingly better run time of roughly a minute. 120 X Faster? Dang!

Make your first draft the klunkiest, cloddiest, most horrendously heinous code possible because then you have no place to go but UP!


In reply to Re^4: Perl Hashes in C? by Anonymous Monk
in thread Perl Hashes in C? by BrianP

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post, it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.