Re^3: Benchmarking A DB-Intensive Script

You're close enough that understanding the big picture may get you the rest of the way there. :-)

Try running two copies of the script at the same time and see how fast they go. Then try three. Then four. Keep going until you find the point where running more copies no longer gets the work done any faster.
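If you'd rather not juggle terminals, something like the following rough sketch could do the timing for you. It assumes each copy does a fixed amount of work and then exits, which may not match your script; the names time_copies and sweep are made up for this illustration, and the command to run is whatever you pass on the command line.

```python
import subprocess
import sys
import time


def time_copies(cmd, n):
    """Start n copies of cmd at once; return wall-clock seconds until all exit."""
    start = time.perf_counter()
    procs = [subprocess.Popen(cmd) for _ in range(n)]
    for proc in procs:
        proc.wait()
    return time.perf_counter() - start


def sweep(cmd, max_copies):
    """Time 1..max_copies simultaneous copies; return (n, seconds) pairs."""
    return [(n, time_copies(cmd, n)) for n in range(1, max_copies + 1)]


if __name__ == "__main__" and len(sys.argv) > 1:
    # Assumption: everything after the script name is the command to benchmark.
    for n, elapsed in sweep(sys.argv[1:], 8):
        # Runs completed per second; this stops improving once the DB saturates.
        print(f"{n} copies: {elapsed:.1f}s, {n / elapsed:.2f} runs/sec")
```

Watch the runs/sec column: the number of copies where it flattens out is the point to stop at. The upper limit of 8 is arbitrary.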

Each copy will write the examples it finds to its own file. Removing duplicates from those files afterwards is trivial. (Sort the pair on each line alphabetically, so that a given pair always appears as the same line no matter which order it was found in. Then do a sort -u to find and remove the duplicates.)
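The cleanup step might look roughly like this. It is only a sketch: it assumes each output line holds one pair as two whitespace-separated items, which may not be how your files are laid out, and normalize_pair and unique_pairs are names invented for the example.

```python
import sys


def normalize_pair(line):
    """Sort the items on a line so 'b a' and 'a b' become the same string."""
    return " ".join(sorted(line.split()))


def unique_pairs(lines):
    """Normalize every non-blank line, drop duplicates, return them sorted."""
    return sorted({normalize_pair(line) for line in lines if line.strip()})


if __name__ == "__main__":
    # Assumption: the per-copy output files are given as command-line arguments.
    lines = []
    for path in sys.argv[1:]:
        with open(path) as fh:
            lines.extend(fh)
    for pair in unique_pairs(lines):
        print(pair)
```

Building a set and sorting it does the same job as sort -u here, just in one pass over all the files together.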
