rose has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks

I know the following code removes the duplicate elements from an array.

my @result = grep { !$seen{$_}++ } @a;

Can anyone explain this code and how it works?

Thanks
Rose

Replies are listed 'Best First'.
Re: Explain the code
by shmem (Chancellor) on Jul 24, 2008 at 13:11 UTC
    Can anyone explain this code and how it works?

    grep applies the block { !$seen{$_}++ } to each element of @a and, depending on the result, either skips the element or returns it. Inside the block the current element from @a is in $_. The block returns the negation of the value stored under the key $_ in the hash %seen.

    Inside the block, the value in %seen keyed on $_ is at first undefined. It is incremented, but with a post-increment, so the block still sees the old value: on the first occurrence of the key stored in $_, the negation (!$seen{$_}) of that undefined (false) value is 1 (true). So, on the first occurrence the block returns a true value, and the currently evaluated element from @a passes the grep. On the next occurrence of the same key the value is already a positive integer, which when negated gives a false value, so the element doesn't pass the grep.

    Yes, I know, the above text needs to be read twice...
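
    A small trace may help with the second reading. This is only a sketch: the sample data and the @trace array are assumptions made for illustration, and the block is spelled out longer than the one-liner so the value before the post-increment can be recorded.

    ```perl
    use strict;
    use warnings;

    my @a = qw(b a b c a);   # sample data, assumed for illustration
    my %seen;
    my @trace;               # hypothetical helper array, only used for printing

    my @result = grep {
        my $before = $seen{$_} || 0;   # value stored before the post-increment (undef shown as 0)
        my $keep   = !$seen{$_}++;     # the original test: true only on first sight
        push @trace, "$_: count before=$before, kept=" . ($keep ? "yes" : "no");
        $keep;                         # last value of the block decides whether grep keeps $_
    } @a;

    print "$_\n" for @trace;
    print "result: @result\n";   # result: b a c
    ```

    The second "b" and the second "a" find a count of 1 already stored, so their negation is false and they are dropped.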

    --shmem

    _($_=" "x(1<<5)."?\n".q·/)Oo.  G°\        /
                                  /\_¯/(q    /
    ----------------------------  \__(m.====·.(_("always off the crowd"))."·
    ");sub _{s./.($e="'Itrs `mnsgdq Gdbj O`qkdq")=~y/"-y/#-z/;$e.e && print}
      Yes, I know, the above text needs to be read twice...

      Yes, it does, but it is well worth the read! Nicely done, shmem++

Re: Explain the code
by FunkyMonk (Bishop) on Jul 24, 2008 at 13:13 UTC
    Equivalent code, with comments
    use Data::Dump 'pp';

    my @a = qw/2 1 3 4 5 4 3 3 4 5/;
    my %seen;
    my @result;
    for ( @a ) {                    # for each element in @a
        if ( not $seen{$_} ) {      # if the element hasn't already been seen
            push @result, $_;       # add it to the result set
        }
        $seen{$_}++;                # note that the element's been seen
    }
    print join(",", @result), "\n";
    pp \%seen;    # { 1 => 1, 2 => 1, 3 => 3, 4 => 3, 5 => 2 }

    Each value in %seen is a count of how many times the key's been seen.

    Update: Removed a misleading comment & corrected a typo
    Update^2: replaced say with print for perl<5.10 compatibility (as suggested by ww)


    Unless I state otherwise, all my code runs with strict and warnings
Re: Explain the code
by gopalr (Priest) on Jul 24, 2008 at 12:54 UTC

    This approach merges the construction of the %seen hash with the extraction of unique elements.

    1. The elements are read from @a one at a time.

    2. If the current element ($_) is not yet in the hash (%seen), it is pushed onto @result. It is then stored in the hash (%seen). As we know, a hash does not allow duplicate keys.

    Thanks
    Gopal

      %seen will also have the number of occurrences of every element of @a.

        How? If the array (@a) has duplicate values, the hash will eliminate them.
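
        One way to check this is to look at both the keys and the values of %seen afterwards. The sketch below reuses the sample data from FunkyMonk's reply; the $total variable is an assumption added for illustration. The keys are unique, but each value has been incremented once per occurrence.

        ```perl
        use strict;
        use warnings;

        my @a = qw/2 1 3 4 5 4 3 3 4 5/;   # same sample data as FunkyMonk's reply
        my %seen;
        my @result = grep { !$seen{$_}++ } @a;

        # Add up the per-key counts (hypothetical helper variable)
        my $total = 0;
        $total += $_ for values %seen;

        print scalar(@a), " elements, ", scalar(keys %seen), " distinct keys\n";
        print "$_ occurred $seen{$_} time(s)\n" for sort keys %seen;
        print "counts add up to $total\n";
        ```

        So the duplicates are eliminated from the keys, while the values record how often each key turned up.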

Re: Explain the code
by apl (Monsignor) on Jul 24, 2008 at 13:20 UTC
    Besides reading up on grep, you might want to take a look at this write-up on Perl idioms. It gives a very short, but very clear, description of how grep and map are used.
Re: Explain the code
by Pancho (Pilgrim) on Jul 24, 2008 at 13:01 UTC
Re: Explain the code
by linuxer (Curate) on Jul 28, 2008 at 21:08 UTC
Re: Explain the code
by CaMelRyder (Pilgrim) on Jul 29, 2008 at 16:11 UTC
    my @result = grep { !$seen{$_}++ } @a;
    The grep function takes 2 parameters. The first is an expression to be executed for each element in the list. The second is the list to iterate over.

    grep then loops through the list @a, and for every element it sets $_ to that element.

    Then it evaluates the expression !$seen{$_}++. This takes the current entry and does one of 2 things:

    1) if there is no entry in the %seen hash yet, $seen{$_} is undefined, which evaluates to false (it behaves like 0 in numeric context). Add the ! negation operator and you can see that the expression passed to grep only evaluates to true for values on their first encounter. Furthermore, the ++ is a post-increment, so it happens after the expression has produced its value; the entry then holds 1.

    2) if we have seen it, the entry already holds a positive count. The expression reduces to the negation of a positive number, which is false, and the ++ then increments the entry again.

    grep will just run through its list, checking the expression against each element. The elements for which the expression evaluates to true are returned as @result.
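
    One thing the one-liner takes for granted is that %seen is empty before grep runs. A possible way to guarantee that is to wrap the idiom in a sub with its own lexical hash. This is only a sketch: the name dedupe and the sample data are assumptions, not something from the original post.

    ```perl
    use strict;
    use warnings;

    # Hypothetical helper name; not part of the original post
    sub dedupe {
        my %seen;                          # fresh, empty hash on every call
        return grep { !$seen{$_}++ } @_;   # keeps first occurrences, in original order
    }

    my @a      = qw(pear apple pear fig apple);   # sample data (assumed)
    my @first  = dedupe(@a);
    my @second = dedupe(@a);   # same answer again, since %seen does not persist

    print "@first\n";    # pear apple fig
    print "@second\n";   # pear apple fig
    ```

    With a shared %seen that is never cleared, a second pass over the same list would return nothing, because every element would already have a positive count.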

    Nice use of grep. You should see some of my legacy code for deduping a list!!

    ¥peace from CaMelRyder¥