Re^3: Remove Duplicates from Array

And List::MoreUtils::uniq can be faster because it tries to load a compiled library via DynaLoader to implement its functionality. If that fails, it falls back to a pure-Perl implementation.

In my test (Linux, perl 5.8.8, List::MoreUtils 0.21) the original List::MoreUtils::uniq is about 400% faster than my Perl implementation.

If I rename the compiled library so that List::MoreUtils must rely on its pure-Perl implementation, my solution is about 20% - 25% faster.

I don't want to argue against List::MoreUtils; but now I wonder about these two (perl) solutions:

# presented in perlfaq4 - How can I remove duplicate elements from a list or array?
sub my_uniq {
  my %h;
  grep { !$h{$_}++ } @_;
}

# vs.

# List::MoreUtils::uniq
sub LM_uniq {
  my %h;
  map { $h{$_}++ == 0 ? $_ : () } @_;
}

I can't see any advantage in using map and the ternary operator.
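For anyone who wants to repeat the comparison, here is a minimal sketch using the core Benchmark module. The test data (10,000 values drawn from 1,000 distinct integers) and the labels grep_uniq and map_uniq are my own assumptions, not the setup behind the numbers above, so the percentages will differ by machine and perl version.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# perlfaq4 variant: grep keeps an element the first time it is seen.
sub my_uniq {
    my %h;
    return grep { !$h{$_}++ } @_;
}

# Pure-Perl List::MoreUtils variant as quoted in the post.
sub LM_uniq {
    my %h;
    return map { $h{$_}++ == 0 ? $_ : () } @_;
}

# Hypothetical test data: 10_000 values drawn from 1_000 distinct integers.
my @data = map { int rand 1_000 } 1 .. 10_000;

# Sanity check: both variants return the same list in the same order.
my @from_grep = my_uniq(@data);
my @from_map  = LM_uniq(@data);
die "results differ\n" unless "@from_grep" eq "@from_map";

# Run each variant for at least one CPU second and print a comparison table.
cmpthese( -1, {
    grep_uniq => sub { my @u = my_uniq(@data) },
    map_uniq  => sub { my @u = LM_uniq(@data) },
} );
```

The sanity check matters more than the timing: it confirms the two subs are interchangeable for plain scalars, so any difference cmpthese reports is purely about grep versus map with a ternary.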

edit: text refined

Re^4: Remove Duplicates from Array by mpeever (Friar) on Nov 01, 2008 at 17:10 UTC
Well looky there... you know, it never occurred to me to put an empty list into a list with `map` to skip an entry. I was thinking in terms of Lisp, where adding `'()` gives you a nil entry. I assumed the result would have been to add 0 or something, even though I knew `map { ( 0..$_ ) } ( 1..3 );` yields `(0, 1, 0, 1, 2, 0, 1, 2, 3)`. Cool.
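A small sketch of both behaviours side by side, in case it helps: the block passed to `map` is evaluated in list context, so it may return zero, one, or many elements per input. The variable names @odd and @ranges are only for illustration.

```perl
use strict;
use warnings;

# Returning () contributes zero elements, so the entry is skipped.
my @odd = map { ( $_ % 2 ) ? $_ : () } 1 .. 6;

# Returning a longer list contributes all of its elements, flattened.
my @ranges = map { ( 0 .. $_ ) } 1 .. 3;

print "@odd\n";       # 1 3 5
print "@ranges\n";    # 0 1 0 1 2 0 1 2 3
```

The first line is the same trick the pure-Perl `uniq` relies on: a `map` that sometimes returns `()` behaves like a `grep`.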
Re^4: Remove Duplicates from Array by JadeNB (Chaplain) on Nov 02, 2008 at 20:21 UTC
`sub LM_uniq { my %h; map { $h{$_}++ == 0 ? $_ : () } @_; }` (the List::MoreUtils::uniq version) This seems like a strange construction (and yet, as you point out, it's what's in the List::MoreUtils source). Surely `$h{$_}++ == 0` is the same as `! $h{$_}++` in this setting? I thought the main point of using the List::MoreUtils `uniq` was to avoid edge cases, but this seems to do nothing but replace a `grep` with an essentially equivalent `map`. (In particular, it doesn't do anything to avoid stringification of objects.)
Re^5: Remove Duplicates from Array by mpeever (Friar) on Nov 03, 2008 at 00:18 UTC
Well, the Description for the module explicitly tells us two things: (1) the functions are fairly trivial; (2) their efficiency is due to their implementation in C.

"All of the below functions are implementable in only a couple of lines of Perl code. Using the functions from this module however should give slightly better performance as everything is implemented in C. The pure-Perl implementation of these functions only serves as a fallback in case the C portions of this module couldn't be compiled on this machine."

I'm deducing from the description that the Perl implementations aren't really intended to be that efficient.
Re^6: Remove Duplicates from Array by JadeNB (Chaplain) on Nov 03, 2008 at 15:22 UTC
I agree with you, in the sense that I'm not surprised if the pure-Perl `List::MoreUtils::uniq` doesn't blow a naïve implementation out of the water, speed-wise; but, like dragonchild, I thought that at least by using these functions one should get handling for unusual arrays, such as those containing ~~arrays~~ references, and this code doesn't provide that (or, at least, doesn't do so any better than Justin's original implementation).

UPDATE: A conversation with dragonchild showed that my meaning was a little unclear, so let me try again. I realise that the existing code does correctly return objects, rather than their stringified versions, but it seems to me that it could incorrectly confuse (say) the string "`ARRAY(0x18045c0)`" with an arrayref.
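To make the confusion concrete, here is a minimal sketch. The sub `ref_aware_uniq` is hypothetical (it is not part of List::MoreUtils); it assumes that keying references by `Scalar::Util::refaddr` and plain values by their string, with different prefixes, is an acceptable notion of "same element". It also assumes no undef elements, which would trigger warnings here.

```perl
use strict;
use warnings;
use Scalar::Util qw(refaddr);

# perlfaq4 variant: hash keys are the stringified elements.
sub my_uniq {
    my %h;
    return grep { !$h{$_}++ } @_;
}

# Hypothetical ref-aware variant: references are keyed by address and
# plain values by string, with distinct prefixes so the two kinds of
# key cannot collide.
sub ref_aware_uniq {
    my %h;
    return grep { !$h{ ref($_) ? 'R' . refaddr($_) : "S$_" }++ } @_;
}

my $aref   = [ 1, 2, 3 ];
my $string = "$aref";    # something like "ARRAY(0x18045c0)"; varies per run

my @plain = my_uniq( $aref, $string );           # the string is dropped as a "duplicate"
my @aware = ref_aware_uniq( $aref, $string );    # both elements survive

printf "my_uniq kept %d, ref_aware_uniq kept %d\n",
    scalar @plain, scalar @aware;
```

On my reading this prints "my_uniq kept 1, ref_aware_uniq kept 2": the plain version does return the arrayref itself rather than its string form, but the later string is silently treated as a repeat of it.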