in reply to Failed array attemp

A number of replies use the grep built-in to check whether an element of one array is present in another. The problem with grep is that it always scans the entire array, even though the OP only seems interested in the first occurrence of the element in the scanned array. List::MoreUtils::any stops scanning at the first occurrence. For some value of the product of the sizes of the two arrays (100,000? 1,000,000? Benchmark to find out), this difference in behavior will result in a significant performance win for any. For sufficiently small arrays, the difference is trivial.
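
To make the difference concrete, here is a minimal sketch that counts how many comparisons each approach performs. The sample arrays and counter names are made up for illustration, and it imports any from the core List::Util module (available there since version 1.33) so that it runs without installing anything; List::MoreUtils::any short-circuits in the same way.

```perl
use strict;
use warnings;
use List::Util qw(any);    # assumption: List::Util 1.33 or later

# Hypothetical sample data, not taken from the original thread.
my @wanted   = qw(pear kiwi);
my @haystack = qw(apple pear plum pear fig);

my ($grep_tests, $any_tests) = (0, 0);

# grep evaluates its block for every element, even after a match.
my @found_grep = grep {
    my $w = $_;
    scalar grep { $grep_tests++; $_ eq $w } @haystack;
} @wanted;

# any returns as soon as its block is true for one element.
my @found_any = grep {
    my $w = $_;
    any { $any_tests++; $_ eq $w } @haystack;
} @wanted;

print "grep: @found_grep ($grep_tests comparisons)\n";
print "any:  @found_any ($any_tests comparisons)\n";
```

Both versions find the same elements; only the number of comparisons differs, and the gap shrinks to nothing when the wanted element is absent or sits at the end of the scanned array.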

Re^2: Failed array attemp
by sauoq (Abbot) on May 14, 2012 at 00:21 UTC
    this difference in behavior will result in a significant performance win for any.

    It'll only save a constant factor, so it'll still be O(XY), whereas using a temporary hash would be O(X + Y): one pass over the Y elements to build the hash, then X constant-time lookups.

    The cost of using a hash would be additional storage on the order of O(Y) but, as you are already storing the array, this increase is just a constant factor.

    Stopping at the first found element pays off more the more duplication there is in the array. But the more duplication there is, the lower the storage cost of a temporary hash, and the hash is what gives the theoretically significant improvement.

    If you want efficiency, and aren't so short on storage that you can't do it, using a temporary hash is almost certainly the way to go.
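
    A minimal sketch of the temporary-hash approach, using made-up sample arrays (the names @wanted, @haystack and %seen are illustrative only, not from the original question):

    ```perl
    use strict;
    use warnings;

    # Hypothetical sample data, not taken from the original thread.
    my @wanted   = qw(pear kiwi fig);
    my @haystack = qw(apple pear plum pear fig);

    # One pass over @haystack builds the lookup table (Y steps).
    # Duplicate elements collapse into a single key.
    my %seen;
    @seen{@haystack} = ();

    # Each membership test is now one hash lookup (X lookups in total).
    my @found = grep { exists $seen{$_} } @wanted;

    print "found: @found\n";
    print "keys stored: ", scalar(keys %seen), "\n";
    ```

    The hash slice assigns no values, since only the existence of each key matters. With the duplicated "pear" the hash holds four keys for five array elements, which is the storage effect described above.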

    -sauoq
    "My two cents aren't worth a dime.";