in reply to Re^3: Benchmarking A DB-Intensive Script
in thread Benchmarking A DB-Intensive Script

Ahh, I think I see it now. So the idea would be to store all possible pairs (e.g. 2.5 billion) in a hash, then delete each pair after it's been selected? That would be pretty memory-intensive at the beginning, but peter out by the end.

Given the testing density (100 million out of 2450 million = ~4%), perhaps the slowdown at the end won't be too considerable -- only about one duplicate hit for every 24 new ones.
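For what it's worth, here's a minimal sketch of the "remember what you've already drawn" variant of that idea: rather than pre-loading every possible pair, keep a hash keyed on the pairs actually selected and redraw on a collision. The item and sample counts below are hypothetical placeholders, scaled down by roughly 1000x from the figures above so it runs in seconds while keeping the same ~4% sampling density.

  #!/usr/bin/perl
  use strict;
  use warnings;

  my $n_items   = 2_214;    # individuals to pair (2214*2213/2 ~ 2.45 million pairs)
  my $n_samples = 100_000;  # pairs to draw (~4% of the pair space)

  my %seen;      # "lo,hi" => 1 for every pair already drawn
  my $redraws = 0;

  for ( 1 .. $n_samples ) {
      my ( $a, $b, $key );
      do {
          $a = int rand $n_items;
          $b = int rand $n_items;
          # canonical key so (3,7) and (7,3) count as the same pair
          $key = $a < $b ? "$a,$b" : "$b,$a";
          $redraws++ if $a == $b || $seen{$key};
      } while ( $a == $b || $seen{$key} );
      $seen{$key} = 1;
      # ... test this pair here ...
  }

  # At ~4% density the last draws collide only about once in 24
  # attempts, so the redraw overhead stays small overall.
  printf "redraws: %d (%.2f%% of draws)\n",
      $redraws, 100 * $redraws / $n_samples;

The hash only ever holds the pairs actually drawn (at most the sample size), not the whole pair space, which is the memory trade-off being discussed here.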


Replies are listed 'Best First'.
Re^5: Benchmarking A DB-Intensive Script
by blogical (Pilgrim) on Mar 15, 2006 at 04:54 UTC
    No. Just store the individuals, tossing ones that will never become part of a pair as you store...

    Ah. I misread "check that this pair has not yet been processed" to mean that each item was only a candidate for an initial pairing. I now see that, as I put it, you would have to store the pairs... and that's far more work than necessary unless you need to test many more.

    *goes to sleep*