pglenski has asked for the wisdom of the Perl Monks concerning the following question:

I've browsed thru history but I can't quite figure out this code:
@list = qw/paul jane paul betty paul jane/;
%temp = ();
@list = grep ++$temp{$_} < 2, @list;
It removes dups just fine but I can't figure out what the "++" is doing. Can you explain what's going on in a step-by-step fashion?

2006-03-31 Retitled by planetscape, as per Monastery guidelines
Original title: 'grep'

Re: explain use of increment operator in grep
by gaal (Parson) on Mar 30, 2006 at 23:19 UTC
    Suppose you have a hash:

    %seen = ( paul => 1, jane => 1, betty => 1, );

    The values in the hash aren't very important; the idea is that you can get the keys very easily:

    print for keys %seen; # will print paul, jane, betty (though not necessarily in this order, see note below)

    Now, suppose I come and try to insert an element into %seen for which a key already exists. For example, I do

    $seen{paul} = 1;      # this does absolutely nothing! but...
    ++$seen{paul};        # paul's *value* is 2 now
    print for keys %seen; # yet this still prints the same list
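To check that claim, here is a minimal sketch. The variable names $keys_before and $keys_after are mine, not part of the original snippet, and the keys are sorted only because hash order is not fixed.

```perl
use strict;
use warnings;

# Same data as the %seen hash above.
my %seen = (paul => 1, jane => 1, betty => 1);

# Sort the keys so the two snapshots can be compared as strings.
my $keys_before = join ',', sort keys %seen;

$seen{paul} = 1;   # key already exists: value rewritten, key set unchanged
++$seen{paul};     # value goes from 1 to 2, still no new key

my $keys_after = join ',', sort keys %seen;

print "keys before:  $keys_before\n";
print "keys after:   $keys_after\n";
print "paul's value: $seen{paul}\n";
```

Both snapshots hold the same three names; only paul's value has changed.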

    The one-liner you're asking about uses this trick. The grep only passes list elements whose corresponding hash value is "< 2", and the increment is done before the less-than comparison. That is, the first time a particular key is encountered, $thatkey => 1 is effectively put into the hash.

    Every subsequent visit with a particular key will cause the value to rise by one, but that also means the condition "< 2" will fail. So the effect is that duplicates are thrown away. A side effect of this technique is that %seen (or %temp, to use your name for it) holds the count of actually seen repetitions for each key!
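Since the question asked for step-by-step, the grep can be unrolled into an explicit loop. This is only a sketch of what the one-liner does; $count, $keep and @kept are names introduced here for illustration.

```perl
use strict;
use warnings;

my @list = qw/paul jane paul betty paul jane/;
my %temp;
my @kept;

for my $name (@list) {
    my $count = ++$temp{$name};   # increment first, then use the new value
    my $keep  = $count < 2;       # true only on the first sighting
    printf "%-6s count=%d %s\n", $name, $count, $keep ? 'keep' : 'drop';
    push @kept, $name if $keep;
}

print "kept: @kept\n";

# The side effect mentioned above: %temp now holds a count per name.
print "$_ seen $temp{$_} time(s)\n" for sort keys %temp;
```

The trace shows each name kept at count 1 and dropped at every later count.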

    Note that hashes in Perl are unordered, so that a simple "print keys" does not preserve the original relative order of elements. However, the code you brought does preserve ordering because the grep goes through elements of the original list one by one in order.
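A small side-by-side sketch of the ordering point, assuming the same list as in the question. @via_grep and @via_keys are illustrative names.

```perl
use strict;
use warnings;

my @list = qw/paul jane paul betty paul jane/;

# grep walks @list front to back, so first occurrences keep their order.
my %temp;
my @via_grep = grep { ++$temp{$_} < 2 } @list;

# keys returns the same names, but in whatever order the hash happens to use.
my %seen = map { $_ => 1 } @list;
my @via_keys = keys %seen;

print "grep: @via_grep\n";
print "keys: @via_keys\n";
```

The grep line always comes out in list order; the keys line may differ from run to run.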

    Golf note:

    @list = grep !$temp{$_}++, @list; # same thing, shorter.

    Update: fixed logic bug in last line. Thanks, Anonymonk.

      That should be:
      @list = grep !$temp{$_}++, @list; # same thing, shorter.
        Indeed it should! Good catch.
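For anyone wanting to convince themselves that the golfed form matches the original, a quick check with two separate hashes (%pre and %post are names made up for this sketch):

```perl
use strict;
use warnings;

my @list = qw/paul jane paul betty paul jane/;

my (%pre, %post);
my @long  = grep ++$pre{$_} < 2, @list;   # pre-increment, compare the new count
my @short = grep !$post{$_}++,   @list;   # post-increment, negate the old count

print "long:  @long\n";
print "short: @short\n";
```

The short form works because the old value is false on the first visit and true on every visit after that.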
Re: explain use of increment operator in grep
by graff (Chancellor) on Mar 31, 2006 at 03:26 UTC
    grep will only return those items from @list that satisfy the given condition. In this case, the grep condition is doing two things: it is incrementing an element in the %temp hash keyed by the element from @list, and it is testing to see if the value of the hash element is less than 2.

    The increment is '++$temp{$_}', which means that the hash value is incremented before the "less than 2" test is performed (as opposed to '$temp{$_}++', which would feed the old, not-yet-incremented value into the test).

    So, the first time a given string is encountered in @list, the resulting hash value satisfies the condition (incremented to 1, it is less than 2). Each time the given string is encountered again, the increment puts the hash value above 1, so that the "less than 2" condition will fail, and grep will exclude that list item from the set that it returns.
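To make the pre- versus post-increment difference concrete, here is a rough sketch. The hashes %h and %wrong and the array @leaky are hypothetical names, and the last grep is deliberately the wrong variant.

```perl
use strict;
use warnings;

my %h;
my $pre  = ++$h{a};   # element springs into being, becomes 1, new value returned
my $post = $h{b}++;   # old value (0 for a fresh element) returned, then it becomes 1

print "pre-increment returned  $pre\n";
print "post-increment returned $post\n";

# Swapping in the post-increment without changing the test lets a second copy through.
my @list = qw/paul jane paul betty paul jane/;
my %wrong;
my @leaky = grep $wrong{$_}++ < 2, @list;

print "leaky: @leaky\n";
```

With the post-increment, each name is compared at 0 and then at 1 before it finally reaches 2, so the first two copies of every name survive.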