in reply to Re^3: Generating Unique numbers from Unique strings
in thread Generating Unique numbers from Unique strings

You asked: "BTW, does perl caches the string hash value?" No, I don't think so, and there is no need to, because the C code that calculates the integer hash value runs like a rocket! I have C code that uses and benchmarks two of the recent Perl hash functions, but the computer it lives on is dead right now. Bummer. Calculating the hash function is very, very fast. I experimented with different C formulations of the equation and found that gcc, even at its lowest optimization setting, generated essentially the same code for all of them. On a modern Intel processor, even an int*2 is as fast as a left shift by 1. I was amazed, but decades of development and literally millions of transistors have gone into that thing!

Before writing extra code that you think will make a performance difference, benchmark the code and see where you stand. Start with the most straightforward HLL code that implements a reasonable algorithm. Use the built-in features of the language: they have been highly optimized and are likely to produce good results when combined with a clear algorithm.

An array access by index will be faster than a hash table lookup, but not by all that much unless you do this a bazillion times. Benchmark your code and see for yourself.
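To make "see for yourself" concrete, here is a minimal sketch using the core Benchmark module. The data size, the "key$_" naming, and the sub names are my own assumptions for illustration, not the OP's data; the numbers you get will depend on your perl build and machine.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical test data: the same 10_000 values stored both ways.
my $n     = 10_000;
my @array = (0 .. $n - 1);
my @keys  = map { "key$_" } 0 .. $n - 1;
my %hash  = map { ("key$_" => $_) } 0 .. $n - 1;

# Sum every element via array index.
sub sum_array {
    my $sum = 0;
    $sum += $array[$_] for 0 .. $n - 1;
    return $sum;
}

# Sum every element via hash lookup on a string key.
sub sum_hash {
    my $sum = 0;
    $sum += $hash{$_} for @keys;
    return $sum;
}

# A negative count means "run each sub for at least that many CPU seconds".
cmpthese(-1, {
    array => \&sum_array,
    hash  => \&sum_hash,
});
```

cmpthese prints a rate table comparing the two subs. Run it against your real access pattern before deciding that the difference matters.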

Update: in my experience the 80/20 or even the 90/10 rule applies: 10% of the code does 90% of the "real work". Forget about optimizing the other 90% of the code; you must find the 10% where it really matters.


Re^5: Generating Unique numbers from Unique strings
by BrowserUk (Patriarch) on Apr 03, 2016 at 19:52 UTC
    An array access by index will be faster than a hash table lookup, but not by all that much unless you do this a bazillion times.

    Accessing an array is measurably, and for some things substantially, faster than accessing a hash; but the big problems with using an array for the OP's usage are:

    1. Creating an array that has 4 billion slots;
    2. Or truncating the 32-bit value to some more reasonable size and then dealing with collisions.

    Perl's hashes take care of this; and do so very efficiently.
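As a rough illustration of letting a hash do that work, the sketch below hands out a small sequential number to each distinct string the first time it is seen. The names %id_for, $next_id and id_for_string are hypothetical, and this is only one possible reading of what the OP needs; it requires perl 5.10 or later for the //= operator.

```perl
use strict;
use warnings;

my %id_for;        # string => assigned number
my $next_id = 0;   # next unused number

# Return the number for a string, assigning a fresh one on first sight.
# The right-hand side of //= is evaluated only when the slot is undefined,
# so $next_id advances only for strings not seen before.
sub id_for_string {
    my ($str) = @_;
    $id_for{$str} //= $next_id++;
    return $id_for{$str};
}

print "$_ => ", id_for_string($_), "\n" for qw(apple banana apple cherry);
```

No 4-billion-slot array and no hand-written collision handling are needed; the hash grows only with the number of distinct strings actually seen.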


    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority". I knew I was on the right track :)
    In the absence of evidence, opinion is indistinguishable from prejudice.
      I agree completely.