
"the perlop page says nothing about adding new records to a hash with ++"

This commonly seen Perl idiom works due to Autovivification (the automatic creation of a variable reference when an undefined value is dereferenced). Autovivification is unique to Perl; in other languages you'd need to create the item as a separate operation before incrementing it.

Re^6: Filtering very large files using Tie::File
by Corion (Patriarch) on Nov 26, 2010 at 20:56 UTC

    Actually, no references come into play here. I'm simply incrementing the undefined value of $seen{ $key } by (and to) 1.
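As a minimal sketch of that distinction (the key "abc\n" and the %nested hash are made up for illustration), the first increment below creates a plain scalar hash value, while the second line is a case where a reference really is autovivified:

```perl
use strict;
use warnings;

my %seen;
my $key = "abc\n";    # hypothetical key, standing in for a line of the file

# The slot does not exist yet, so its value is undef. ++ treats undef as 0
# and stores 1. No reference is created at any point.
$seen{$key}++;

# For contrast, genuine autovivification: treating the undefined
# $nested{outer} as a hash makes Perl create a hash reference there.
my %nested;
$nested{outer}{inner}++;

printf "seen value: %d, is a reference: %s\n",
    $seen{$key}, ( ref( $seen{$key} ) ? "yes" : "no" );
printf "nested{outer} holds a reference of type: %s\n", ref( $nested{outer} );
```

Here ref() returns an empty string for the plain counter and "HASH" for the autovivified entry.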

      Ahh, so every key/value pair in the hash is (text from the file)/1? Or, I guess, (text)/(number of occurrences) because if the record is a duplicate, you'll be incrementing a preexisting value. I think I get it now.

        The first time you see the line:

        • $seen{"abc\n"} doesn't exist, so it's effectively undef.
        • $seen{"abc\n"}++ increments $seen{"abc\n"} to 1 and returns the original value (undef).
        • "!" negates the value returned by the postincrement (undef), returning true.
        • The "if" body is entered.

        The second (or third, or fourth) time you see the line:

        • $seen{"abc\n"} was previously set to 1 (or 2, or 3).
        • $seen{"abc\n"}++ increments $seen{"abc\n"} to 2 (or 3, or 4) and returns the original value (1, or 2, or 3).
        • "!" negates the value returned by the postincrement (1, or 2, or 3), returning false.
        • The "if" body is not entered.

        It's just one of those useful patterns one memorises.
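
The two walkthroughs above can be sketched as a small self-contained script. The @lines array is invented sample data standing in for lines read from a file, and @unique is a hypothetical name for wherever the kept lines go; the original code in the thread may well differ.

```perl
use strict;
use warnings;

# Hypothetical sample input standing in for lines read from a file.
my @lines = ( "abc\n", "def\n", "abc\n", "abc\n", "ghi\n" );

my %seen;      # line text => number of times it has been seen so far
my @unique;    # first occurrence of each line, in input order

for my $line (@lines) {
    # Post-increment returns the old value: undef (false) the first time,
    # a positive count (true) afterwards. "!" flips that, so the body
    # runs only for the first occurrence of each line.
    if ( !$seen{$line}++ ) {
        push @unique, $line;
    }
}

print @unique;
printf "abc was seen %d times\n", $seen{"abc\n"};
```

With this sample data the script keeps "abc", "def" and "ghi" once each, and the hash ends up holding the number of occurrences per line, as described in the thread.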