in reply to Re: Unique Values within AOH
in thread Unique Values within AOH

unless $seen{"$team$player"}++;

One small nit. Because the %seen de-duplication hash is shared by all teams, plain concatenation can conflate distinct team/player records, e.g.:
    Team    Player
    ABC     DEFGH
    ABCD    EFGH
which both become the key 'ABCDEFGH'.

This is easily avoided by joining the two strings with some character or character sequence that (you hope!) cannot possibly occur in team or player names:
    unless $seen{"$team\x00$player"}++;
(Update: Actually, for the given order of concatenation, it's only necessary that the separator character or character sequence cannot appear in the team name.)
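A minimal sketch of the collision and the separator fix follows. The sample records and the count_unique helper are hypothetical, made up for illustration; only the two key formats come from the discussion above.

```perl
use strict;
use warnings;

# Hypothetical sample records (team, player); not from the original thread.
my @records = (
    [ 'ABC',  'DEFGH' ],
    [ 'ABCD', 'EFGH'  ],
);

# Hypothetical helper: count the distinct keys produced by a
# key-building callback over a list of [team, player] records.
sub count_unique {
    my ($make_key, @recs) = @_;
    my %seen;
    return scalar grep { !$seen{ $make_key->(@$_) }++ } @recs;
}

my $plain  = count_unique(sub { "$_[0]$_[1]" },     @records);
my $joined = count_unique(sub { "$_[0]\x00$_[1]" }, @records);

print "plain concatenation: $plain unique\n";   # both keys are 'ABCDEFGH'
print "NUL-separated:       $joined unique\n";  # keys differ at the separator
```

With these two records, plain concatenation reports 1 unique record and the NUL-separated key reports 2.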

This nit is very unlikely to bite, but may be very difficult to debug (or even see in large data sets) if it does.

Update: Another, possibly more significant nit. All team and player names in the OP's example are uppercase. If case may be mixed, then, e.g., 'Rose' will be treated as distinct from 'ROSE' and duplicates will slip through. In that case, or even as a general precaution, team/player names can be folded to a common case when building the key:
    unless $seen{"\U$team\x00$player"}++;
See Quote and Quote-like Operators in perlop for \U, \L, et al.
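As a rough illustration of the case-folded key, assuming some made-up mixed-case records (the data below is hypothetical, not from the thread):

```perl
use strict;
use warnings;

# Hypothetical records with mixed case; not from the original thread.
my @records = (
    [ 'Abc', 'Rose' ],
    [ 'ABC', 'ROSE' ],   # same team/player, different case
    [ 'Abc', 'Noah' ],
);

my %seen;
my @unique;
for my $rec (@records) {
    my ($team, $player) = @$rec;
    # \U uppercases everything up to \E or the end of the string,
    # so both $team and $player are folded here.
    push @unique, $rec unless $seen{"\U$team\x00$player"}++;
}

print scalar(@unique), " unique records\n";
print join(', ', map { "$_->[0]/$_->[1]" } @unique), "\n";
```

Only the hash key is uppercased; the stored records keep their original case, so the first spelling seen is the one that survives.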


Give a man a fish:  <%-{-{-{-<

Re^3: Unique Values within AOH (updated)
by hippo (Archbishop) on Oct 30, 2019 at 23:56 UTC
    This is easily avoided by joining the two strings with some character or character sequence that (you hope!) cannot possibly occur in team or player names

    Or, take the guesswork out of it and add an extra layer of depth to the hash:

    unless $seen{$team}{$player}++;
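A small sketch of the two-level hash in use, assuming hypothetical records that would collide under plain concatenation:

```perl
use strict;
use warnings;

# Hypothetical records; the first two collide under plain concatenation.
my @records = (
    [ 'ABC',  'DEFGH' ],
    [ 'ABCD', 'EFGH'  ],
    [ 'ABC',  'DEFGH' ],   # a genuine duplicate
);

my %seen;
my @unique;
for my $rec (@records) {
    my ($team, $player) = @$rec;
    # Autovivification creates the inner hash $seen{$team} on first use,
    # so no separator character is needed to keep the two names apart.
    push @unique, $rec unless $seen{$team}{$player}++;
}

print scalar(@unique), " unique records\n";
```

Here the genuine duplicate is dropped while the two look-alike records are both kept, whatever characters the names contain.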

    Another, possibly more significant nit. All team and player name data in the OPed example is uppercase. If there may be any mixing of case ...

    True, but to be fair to the OP, it was stated to be just an example. I'm not convinced that dirtdog is actually applying this to teams and players. If he were, then he would have bigger problems with real-world data such as the current All Blacks XV, which has featured all three Barretts in recent weeks. You can't go de-duplicating three different players who share the same surname.

      ... add an extra layer of depth to the hash ...

      The best solution, I think.

      ... the [data] was stated to be just an example.

      True, but one can only address the circumstances before one. As you say, the whole consideration may turn out to be irrelevant.

      ... three different players who share the same surname.
      OT: I recently read (it might even have been here on PM) of a sports team somewhere with two players with the same surname and same given name who were playing in the same game, and one guy replaced the other! De-duplicate that!


      Give a man a fish:  <%-{-{-{-<