Re: Re: Best Pairs

Try printing out the contents of %sets, your so-called intermediate-sized cache, in your example. For randomly distributed datasets, the expected size of that cache equals the size of the original dataset, so it is not intermediate at all. The AM solution above is more efficient in both space and time (though for large N and sparse data, it should count frequencies with a hash rather than an array).
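
To illustrate the hash-versus-array point, here is a minimal sketch. It is not taken from the AM solution: the data and the names @values, @count_array and %count_hash are made up, and it only assumes that the things being counted are non-negative integers drawn from a large range.

```perl
use strict;
use warnings;

# Hypothetical sparse data: a few distinct values from a very large range.
my @values = (7, 1_000_000, 7, 42, 1_000_000, 7);

# Array counting: the array is extended up to the largest index used,
# so it has 1_000_001 slots here even though only 3 of them hold a count.
my @count_array;
$count_array[$_]++ for @values;

# Hash counting: one entry per distinct value, whatever the range is.
my %count_hash;
$count_hash{$_}++ for @values;

printf "array slots: %d, hash keys: %d\n",
    scalar @count_array, scalar keys %count_hash;

# Print each distinct value with its frequency, in numeric order.
print "$_ => $count_hash{$_}\n" for sort { $a <=> $b } keys %count_hash;
```

On this data the array ends up with 1,000,001 slots while the hash has three keys. For dense data over a small range the array is still the simpler and faster choice.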
