in reply to Limit the size of a hash

There is an anomaly in what you are seeking to do.

An 8GB file of 12-character lines gives ~715 million lines.

With the selection value being between 0.00 and 0.99, there are only 100 possible values, which means that there could be an average of ~7 million lines for each value. Which 10 of the ~7 million lines with the value 0.99 do you want?
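For anyone who wants to check the arithmetic, here is a rough sketch in Python. It assumes "8GB" means 8 * 1024**3 bytes and that each line takes exactly 12 bytes including its newline; neither detail is stated in the original question, so treat the constants as placeholders.

```python
# Back-of-envelope check of the line counts quoted above.
# Assumptions (not from the original question): "8GB" is 8 * 1024**3 bytes,
# and every line occupies exactly 12 bytes, newline included.
FILE_BYTES = 8 * 1024**3
LINE_BYTES = 12
DISTINCT_VALUES = 100  # 0.00 through 0.99 in steps of 0.01

total_lines = FILE_BYTES // LINE_BYTES
lines_per_value = total_lines / DISTINCT_VALUES  # average, if values are spread evenly

print(f"total lines:     {total_lines:,}")
print(f"lines per value: {lines_per_value:,.0f}")
```

The per-value figure is only an average under an even spread of values; a skewed input would give very different counts at the extremes.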


With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

Re^2: Limit the size of a hash
by brx (Pilgrim) on Sep 05, 2013 at 09:46 UTC

    Pigeonhole principle :-)

    English is not my mother tongue.
    Les tongues de ma mère sont "made in France".

      Or perhaps the OP actually wants all the records that contain the top 10 values.

      E.g. the ~7 million records with 0.99; the ~7 million records with 0.98; ...


Re^2: Limit the size of a hash
by nevdka (Pilgrim) on Sep 06, 2013 at 01:46 UTC

    That assumes a flat distribution. If the values were normally distributed with a sufficiently small standard deviation (0.1 would work, according to the back of my envelope), then talking about the top 10 would make sense.
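A rough way to try that estimate yourself, sketched in Python. The mean of 0.5 is purely my assumption (the thread gives none), the ~715 million line count is taken from the estimate earlier in the thread, and `expected_lines` is a hypothetical helper that treats each two-decimal value as a 0.01-wide rounding bucket.

```python
import math

TOTAL_LINES = 715_000_000  # ~715 million, from the estimate earlier in the thread
MEAN = 0.5                 # assumed; the thread does not state a mean
STD_DEV = 0.1              # the standard deviation suggested above


def upper_tail(x, mean=MEAN, std_dev=STD_DEV):
    """P(X >= x) for a normal distribution, via the complementary error function."""
    return 0.5 * math.erfc((x - mean) / (std_dev * math.sqrt(2)))


def expected_lines(value, step=0.01):
    """Expected number of lines whose value rounds to `value` (bucket width `step`)."""
    low = value - step / 2
    high = value + step / 2
    return TOTAL_LINES * (upper_tail(low) - upper_tail(high))


for v in (0.99, 0.98, 0.97, 0.50):
    print(f"{v:.2f}: about {expected_lines(v):,.0f} lines expected")
```

Changing `MEAN` and `STD_DEV` shows how quickly the counts at the top end shrink or grow, which is the part of the question that decides whether "top 10" is well defined.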