in reply to Re^5: Filtering very large files using Tie::File
in thread Filtering very large files using Tie::File

Actually, no references come into play here. I'm simply incrementing the undefined value of $seen{ $key } by (and to) 1.
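To make that concrete, here is a minimal sketch (the key "abc\n" is a made-up record, not taken from the original file). It shows that postincrementing a hash element that doesn't exist yet just stores the plain number 1, with no reference involved and no "uninitialized" warning:

```perl
use strict;
use warnings;

my %seen;
my $key = "abc\n";    # hypothetical record, for illustration only

# Check first, so we can show the key really wasn't there.
my $existed_before = exists $seen{$key} ? 1 : 0;

# undef is treated as 0 by ++, so this stores 1.
$seen{$key}++;

print "existed before: $existed_before\n";
print "value after:    $seen{$key}\n";
print "is a reference: ", ( ref $seen{$key} ? "yes" : "no" ), "\n";
```

The value is an ordinary scalar count, so a later ++ on the same key simply takes it to 2, 3 and so on.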

Re^7: Filtering very large files using Tie::File
by elef (Friar) on Nov 26, 2010 at 21:38 UTC
    Ahh, so every key/value pair in the hash is (text from the file)/1? Or, I guess, (text)/(number of occurrences) because if the record is a duplicate, you'll be incrementing a preexisting value. I think I get it now.

      The first time you see the line:

      • $seen{"abc\n"} doesn't exist, so it's effectively undef.
      • $seen{"abc\n"}++ increments $seen{"abc\n"} to 1 and returns the original value (undef).
      • "!" negates the value returned by the postincrement (undef), returning true.
      • The "if" body is entered.

      The second (or third, or fourth) time you see the line:

      • $seen{"abc\n"} was previously set to 1 (or 2, or 3).
      • $seen{"abc\n"}++ increments $seen{"abc\n"} to 2 (or 3, or 4) and returns the original value (1, or 2, or 3).
      • "!" negates the value returned by the postincrement (1, or 2, or 3), returning false.
      • The "if" body is not entered.

      It's just one of those useful patterns one memorises.
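For anyone who wants to watch the two cases happen, here is a small sketch of the pattern. The input list is invented for the example; in the thread the records would come from the file, and this is not necessarily how the original filter was written.

```perl
use strict;
use warnings;

# Hypothetical input records standing in for lines read from the file.
my @lines = ( "abc\n", "def\n", "abc\n", "abc\n", "def\n" );

my %seen;
my @unique;

for my $line (@lines) {
    # Postincrement returns the OLD value:
    #   first sighting  -> undef (false), so !undef is true
    #   later sightings -> 1, 2, 3, ... (true), so ! gives false
    if ( !$seen{$line}++ ) {
        push @unique, $line;    # reached only on the first sighting
    }
}

print @unique;    # the first occurrence of each record, in input order
printf "abc seen %d times\n", $seen{"abc\n"};
printf "def seen %d times\n", $seen{"def\n"};
```

As a side effect, %seen ends up holding the number of occurrences of each record, which is handy if you also want to report the duplicates.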