in reply to Re^2: performance issues with grepping large sorted arrays
in thread performance issues with grepping large sorted arrays

The problem still is that the OP said they sorted the array just to try to get the grep to be faster.

Yes, binary search is faster than a straight linear search.

No, it's not going to be as fast as eliminating 1000 comparisons altogether.

for ( @array ) { if ( exists $hash{$_} ) { # do something } else { # do something else } }
Here, you have one hash lookup for each array item.
@array = sort @array; foreach ( keys %hash ) { if ( is_in_array( $_ ) ) { # do something } else { # do something else } }
Here, you're doing a sort (quicksort, if using Perl's sort(), which is O(n log(n)) and worst case O(n**2) ), then using binary search which is O(log2 n) for each key of the hash.

So your solution ends up being O( (n log(n)) + (m * log n) ) with a worst case of O(n**2 + m * log(n)). Just going through the loop and doing a hash lookup is predictably O(n).

In simplest terms, with an 800k element array you can do 800k comparisons the way GrandFather and I have been recommending. In order to do a quicksort then a binary search for each key of the hash, you're possibly talking about 800k * 800k + 1000 * log2(800k). log2(800,000) is roughly 20. So 640,000,020,000 comparisons in the worst case, even using fairly good sort and search algorithms.

The loop is simply inside out.