in reply to Re^2: Remove Duplicates from Array
in thread Remove Duplicates from Array

And List::MoreUtils::uniq can be faster, because it tries to load a compiled library via DynaLoader to implement its functions. If that fails, it falls back to a pure-Perl implementation.

In my test (Linux, perl 5.8.8, List::MoreUtils 0.21), the original List::MoreUtils::uniq is about 400% faster than my Perl implementation.

If I rename the library so that List::MoreUtils has to fall back on its pure-Perl implementation, my solution is about 20% to 25% faster.

I don't want to argue against List::MoreUtils; but now I wonder about these two (perl) solutions:

# presented in perlfaq4 - How can I remove duplicate elements from a list or array?
sub my_uniq {
    my %h;
    grep { !$h{$_}++ } @_;
}

# vs.

# List::MoreUtils::uniq
sub LM_uniq {
    my %h;
    map { $h{$_}++ == 0 ? $_ : () } @_;
}

I can't see any advantage in using map and the ternary operator.
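For anyone who wants to repeat the comparison of the two pure-Perl versions, here is a minimal sketch using the core Benchmark module. The test data (@data, 1000 integers with many repeats) and the iteration count are assumptions of this sketch, not the setup behind the numbers above, so the percentages will vary with the data and the perl version.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# perlfaq4 version: grep keeps an element the first time it is seen
sub my_uniq { my %h; grep { !$h{$_}++ } @_ }

# pure-Perl version quoted above: map returns the element the first
# time it is seen and an empty list otherwise
sub LM_uniq { my %h; map { $h{$_}++ == 0 ? $_ : () } @_ }

# Assumed test data: 1000 integers drawn from 0..99, so lots of duplicates
my @data = map { int rand 100 } 1 .. 1000;

# Sanity check: both versions should return the same list in the same order
my @g = my_uniq(@data);
my @m = LM_uniq(@data);
die "results differ" unless "@g" eq "@m";

# Run each sub 2000 times and print a comparison table
cmpthese(2000, {
    grep_uniq => sub { my @u = my_uniq(@data) },
    map_uniq  => sub { my @u = LM_uniq(@data) },
});
```

The sanity check matters more than the timing: it confirms that the two subs are interchangeable for plain scalars, so any difference is purely one of speed.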

edit: text refined

Re^4: Remove Duplicates from Array
by mpeever (Friar) on Nov 01, 2008 at 17:10 UTC

    Well looky there... you know, it never occurred to me to put an empty list into a list with map to skip an entry. I was thinking in terms of Lisp, where adding '() gives you a nil entry. I assumed the result would have been to add 0 or something, even though I knew

    map { ( 0..$_ ) } ( 1..3 );
    yields
    (0, 1, 0, 1, 2, 0, 1, 2, 3)

    Cool.
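    A small sketch of that behaviour, with made-up data (the even-number filter is only an illustration): map flattens whatever list its block returns, so returning () contributes nothing, and the result matches the equivalent grep.

```perl
use strict;
use warnings;

# map flattens the list returned by the block, so () adds no element
my @evens = map { $_ % 2 == 0 ? $_ : () } 1 .. 6;
print "@evens\n";         # 2 4 6

# the same selection written with grep
my @evens_grep = grep { $_ % 2 == 0 } 1 .. 6;
print "@evens_grep\n";    # 2 4 6

# a block may also return several elements per input
my @ranges = map { ( 0 .. $_ ) } 1 .. 3;
print "@ranges\n";        # 0 1 0 1 2 0 1 2 3
```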

Re^4: Remove Duplicates from Array
by JadeNB (Chaplain) on Nov 02, 2008 at 20:21 UTC
    # List::MoreUtils::uniq
    sub LM_uniq {
        my %h;
        map { $h{$_}++ == 0 ? $_ : () } @_;
    }
    This seems like a strange construction (and yet, as you point out, it's what's in the List::MoreUtils source). Surely $h{$_}++ == 0 is the same as ! $h{$_}++ in this setting? I thought the main point of using the List::MoreUtils uniq was to avoid edge cases, but this seems to do nothing but replace a grep with an essentially equivalent map. (In particular, it doesn't do anything to avoid stringification of objects.)

      Well, the Description for the module explicitly tells us two things:

      1. the functions are fairly trivial
      2. their efficiency is due to their implementation in C
      All of the below functions are implementable in only a couple of lines of Perl code. Using the functions from this module however should give slightly better performance as everything is implemented in C. The pure-Perl implementation of these functions only serves as a fallback in case the C portions of this module couldn't be compiled on this machine.

      I'm deducing from the description that the Perl implementations aren't really intended to be that efficient.

        I agree with you, in the sense that I'm not surprised if the pure-Perl List::MoreUtils::uniq doesn't blow a naïve implementation out of the water, speed-wise. But, like dragonchild, I thought that by using these functions one should at least get handling for unusual arrays, such as those containing array references, and this code doesn't provide that (or, at least, doesn't do so any better than Justin's original implementation).

        UPDATE: A conversation with dragonchild showed that my meaning was a little unclear, so let me try again. I realise that the existing code does correctly return objects, rather than their stringified versions, but it seems to me that it could incorrectly confuse (say) the string "ARRAY(0x18045c0)" with an arrayref.
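        To make the concern concrete, here is a minimal sketch. The sub uniq_ref_aware and its 'R'/'S' key prefixes are hypothetical, made up for this illustration, and are not anything List::MoreUtils provides; refaddr comes from the core Scalar::Util module.

```perl
use strict;
use warnings;
use Scalar::Util qw(refaddr);

# hash-key based uniq, as discussed in this thread
sub my_uniq { my %h; grep { !$h{$_}++ } @_ }

my $aref = [ 1, 2, 3 ];
my $str  = "$aref";    # the stringified reference, e.g. "ARRAY(0x18045c0)"

# Both values produce the same hash key, so the string is dropped
# as if it were a duplicate of the array reference
my @plain = my_uniq( $aref, $str );
print scalar(@plain), "\n";    # 1

# Hypothetical variant: prefix the key so that references and plain
# strings can never collide
sub uniq_ref_aware {
    my %h;
    grep { !$h{ ref($_) ? 'R' . refaddr($_) : 'S' . $_ }++ } @_;
}

my @aware = uniq_ref_aware( $aref, $str, $aref );
print scalar(@aware), "\n";    # 2
```

        With the plain version, whichever of the two values comes first wins; the variant keeps both the reference and the look-alike string, while still collapsing the repeated reference.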