in reply to Removing duplicate hashes based on only one key

my @new_array = do { my %bobs_your_uncle; grep !$bobs_your_uncle{$_->{bob}}++, @array; };

-- Randal L. Schwartz, Perl hacker

Re: Re: Removing duplicate hashes based on only one key
by dragonchild (Archbishop) on Sep 20, 2001 at 20:31 UTC
    I know why it works, but would you mind explaining (in your inimitable way) to the young'uns what's going on? :-)

    ------
    We are the carpenters and bricklayers of the Information Age.

    Don't go borrowing trouble. For programmers, this means Worry only about what you need to implement.

      grep walks through every element of the array, setting $_. He's also created a hash. As $_ points to the appropriate hash in the array, he grabs the value of the 'bob' field, and uses it as a key in the temporary hash. The negation and postincrement magic just make the expression within grep return a true value if this is the first occurrence of the key, and false if the key has reoccurred.

      Since grep returns a list of only those values for which its expression is true, it weeds out all of the duplicate elements.
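
      To see the filter run end to end, here is a minimal sketch. The sample records and the name %seen are made up for illustration; only the 'bob' field decides what counts as a duplicate.

```perl
use strict;
use warnings;

# Hypothetical sample data, not from the original question.
my @array = (
    { bob => 'carol', alice => 'ted' },
    { bob => 'dave',  alice => 'ann' },
    { bob => 'carol', alice => 'zed' },   # same 'bob' value as the first element
);

my @new_array = do {
    my %seen;                                # lives only inside this block
    grep { !$seen{ $_->{bob} }++ } @array;   # keep the first hash for each 'bob'
};

print join(', ', map { "$_->{bob}/$_->{alice}" } @new_array), "\n";
# prints "carol/ted, dave/ann"
```

      Note that the first hash for each 'bob' value wins; the later carol/zed record is dropped even though its 'alice' field differs.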

      Once you understand how some of the more arcane operations (grep, map, sort) work on lists and how to manipulate list items within their expressions, you'll grok these tricks really easily.

      re-quoting code :

      my @new_array = do { my %bobs_your_uncle; grep !$bobs_your_uncle{$_->{bob}}++, @array; };

      Basically, the do{} block executes the code inside the braces; you could call a sub and get the same effect here. The last statement in the block is the grep which returns a list, which is what ends up in @new_array.
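
      For comparison, a rough sketch of the sub version mentioned above. The name uniq_by_bob and the sample data are invented here, not part of the original post.

```perl
use strict;
use warnings;

# Hypothetical helper: same filter as the do{} block, wrapped in a sub.
sub uniq_by_bob {
    my %seen;                                   # fresh hash on every call
    return grep { !$seen{ $_->{bob} }++ } @_;   # list of first occurrences
}

my @array     = ( { bob => 'carol' }, { bob => 'carol' }, { bob => 'dave' } );
my @new_array = uniq_by_bob(@array);

print scalar(@new_array), "\n";   # prints 2
```

      Either way the tracking hash is lexically scoped, so it disappears once the list has been built.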

      The first statement in the block declares a hash. Nothing special about that. The magic's in the second statement.

      grep CONDITION, LIST returns a list of the members of LIST that match the CONDITION. Here, the magic's almost all in the construction of the condition.

      $_->{bob} is the value associated with the "bob" key of the current element of the array (which is a reference to a hash). E.g., if the current element of the array is (to put it visually),

      { bob=>'carol', alice=>"ted" }

      $_->{bob} is "carol". OK, so we ask the question: is $bobs_your_uncle{carol} true? Well, if it's the first time we've seen it, no. That value's undefined. So that test -- with the negation in front of it -- turns out *TRUE* the first time "carol" is seen as the value of "bob" in the array. The value is undefined, which is false; not-that is true.

      The ++ on the end of the condition is a postincrement: it hands the old value of $bobs_your_uncle{carol} to the negation and then bumps the stored value by 1. So the test sees the count from before this element, and by the time grep decides whether to keep the element, the count already includes it.

      Bottom line: if this is the first time the grep "loop" has seen "carol", the undef gets converted to 0, and the value of $bobs_your_uncle{carol} is set to 1.

      Thus, the next time "carol" is seen, the condition evaluates to "false" (not-true: positive integer values are true in Perl), and the value is not pushed onto the result list.
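
      If you want to watch the counter move, this small trace may help. It is only a sketch: it uses a plain list of strings instead of hash references, and %seen and @trace are names invented for the illustration.

```perl
use strict;
use warnings;

my %seen;
my @trace;

for my $value (qw(carol dave carol carol)) {
    # Postincrement returns the old count (undef the first time, which is false);
    # the negation turns that into the keep/drop decision.
    my $keep = !$seen{$value}++;
    push @trace, sprintf '%s keep=%s count=%d',
        $value, ( $keep ? 'yes' : 'no' ), $seen{$value};
}

print "$_\n" for @trace;
# carol keep=yes count=1
# dave keep=yes count=1
# carol keep=no count=2
# carol keep=no count=3
```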

      All of which goes to show how frickin' cool this language is.

      perl -e 'print "How sweet does a rose smell? "; chomp ($n = <STDIN>); $rose = "smells sweet to degree $n"; *other_name = *rose; print "$other_name\n"'