in reply to Re^3: Benchmarking A DB-Intensive Script
in thread Benchmarking A DB-Intensive Script

And I think that thundergnat is wrong.

With the random strategy, the odds of repeating the select grow from 0% to about 4% as the processed pairs pile up (the chance of a collision is just the fraction of the 2.45 billion possible pairs already taken). So you're going to need to pull about 102 million pairs by the time you're done. (Slightly more than that, but well under 104 million.) That's roughly 102 million hash lookups, each tacked onto a handful of fast operations.
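
If you want to sanity-check that figure, here's a quick back-of-envelope sketch. The 100 million target below is inferred from the 4% ceiling (4% of 2.45 billion); it's an assumption for illustration, not a number quoted from anyone's script.

    #!/usr/bin/perl
    use strict;
    use warnings;

    # Back-of-envelope check of the ~102 million figure. The 100 million
    # target is inferred from the 4% collision ceiling, not taken from
    # the original script.
    my $total  = 2_450_000_000;   # all possible pairs
    my $wanted = 100_000_000;     # unique pairs needed (assumed)

    # Expected draws = sum over each new pair of 1/P(this draw is new)
    #                approx. -total * ln(1 - wanted/total)
    my $expected = -$total * log(1 - $wanted / $total);
    printf "Expected draws: %.1f million\n", $expected / 1e6;   # about 102.1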

The alternate strategy involves doing 2.45 billion hash lookups just to get started. That preprocessing step costs far more than the roughly 2 million redos you're hoping to save with a better algorithm.

Random-then-redo is just fine for his needs. It would only be worth revisiting that plan if he were going to cover a large fraction of his search space.
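
For the curious, a minimal sketch of what random-then-redo looks like with a %seen hash; the ID range, the pair target, and the processing stub are placeholders, not details from the actual script.

    #!/usr/bin/perl
    use strict;
    use warnings;

    # Minimal "random then redo" sketch. $max_id, $wanted, and the loop
    # body are assumptions for illustration only.
    my $max_id = 70_000;      # assumed size of the ID pool
    my $wanted = 1_000;       # scaled way down for illustration
    my %seen;                 # pairs already handed out
    my $redos  = 0;

    while (keys %seen < $wanted) {
        my $x = 1 + int rand $max_id;
        my $y = 1 + int rand $max_id;
        next if $x == $y;                          # need two distinct IDs
        my $key = $x < $y ? "$x,$y" : "$y,$x";     # canonical order: (3,7) eq (7,3)
        if ($seen{$key}++) {                       # one cheap hash lookup per draw
            $redos++;                              # collision: just draw again
            next;
        }
        # ... do the expensive DB work for this pair here ...
    }

    printf "Done after %d redos\n", $redos;

The point is that each extra draw costs one hash lookup and a couple of rand calls, which is noise next to the per-pair database work.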


Re^5: Benchmarking A DB-Intensive Script
by blogical (Pilgrim) on Mar 15, 2006 at 05:04 UTC

    I misread "check that this pair has not yet been processed," and I agree that pre-processing all pairs is not a good approach; I had thought you could remove individual pairs.

    But if he did take that approach, he would be able to say "my script will finish," which you can't claim if known invalid pairs stay in the pool of candidates. If the selection is truly random, it could sit there pulling the same candidates forever... but then, Eris needs her toys.
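
    For what it's worth, the "remove individual pairs" idea would look roughly like the sketch below; it does let you promise the script will finish, but only after paying to build and shuffle the entire pair list up front. The tiny ID list and the number of pairs taken are stand-ins, not his data.

        #!/usr/bin/perl
        use strict;
        use warnings;
        use List::Util qw(shuffle);

        # Sketch of pre-building every pair so termination is guaranteed.
        # The ID list and the count of pairs taken are placeholders.
        my @ids = (1 .. 500);
        my @pairs;
        for my $i (0 .. $#ids - 1) {
            for my $j ($i + 1 .. $#ids) {
                push @pairs, [ $ids[$i], $ids[$j] ];   # each unordered pair once
            }
        }
        @pairs = shuffle @pairs;                       # random order, repeats impossible

        for my $pair (splice @pairs, 0, 100) {         # take only as many as you need
            my ($x, $y) = @$pair;
            # ... process the pair here ...
        }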

    And just because one of my friends recently said he'd never seen it posted on a forum, I feel obliged to throw this out here too: I was wrong. :)