Thanks, that clears up most things.
I did know how post-increment works on numeric scalars, but this sort of use is new to me, and the perlop page says nothing about adding new records to a hash with ++. Still, that is what it seems to do.

Your code works a treat, and the memory use doesn't seem to be too bad. Thanks.

Re^5: Filtering very large files using Tie::File
by eyepopslikeamosquito (Archbishop) on Nov 26, 2010 at 20:41 UTC

    the perlop page says nothing about adding new records to a hash with ++
    This commonly seen Perl idiom works due to autovivification (the automatic creation of a variable reference when an undefined value is dereferenced). Autovivification is a distinctive feature of Perl; in many other languages you'd need to create the item as a separate operation before incrementing it.

      Actually, no references come into play here. I'm simply incrementing the undefined value of $seen{ $key } by (and to) 1.
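
A minimal sketch of that distinction, assuming made-up keys (`apple`, `fruit`) that do not come from the thread. The first half increments a missing hash value, which is treated as 0 and ends up holding a plain number. The second half shows what autovivification proper looks like: a reference appears only because an undefined value is dereferenced.

```perl
use strict;
use warnings;

my %seen;

# Plain increment: the missing value is undef, counts as 0, and becomes 1.
# ++ is exempt from the "uninitialized value" warning.
$seen{apple}++;
my $is_ref = ref( $seen{apple} ) ? "yes" : "no";
print "apple => $seen{apple}, reference? $is_ref\n";

# Autovivification: dereferencing the undefined $nested{fruit}
# makes Perl create a hash reference there first.
my %nested;
$nested{fruit}{apple}++;
my $type = ref( $nested{fruit} );
print "nested{fruit} holds a $type reference\n";
```

In the first case the stored value is just the number 1; in the second, `$nested{fruit}` now holds a `HASH` reference that did not exist before the statement ran.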

        Ahh, so every key/value pair in the hash is (text from the file)/1? Or, I guess, (text)/(number of occurrences) because if the record is a duplicate, you'll be incrementing a preexisting value. I think I get it now.
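
That reading can be checked with a small sketch. The code discussed earlier in the thread is not reproduced here, so the sample records and the `grep` filter below are assumptions, one common way of writing the idiom rather than the exact code in question. Each key ends up paired with its number of occurrences, and because post-increment returns the old value, only the first occurrence of a record passes the filter.

```perl
use strict;
use warnings;

# Hypothetical sample records standing in for lines read from the file.
my @records = ( "alpha", "beta", "alpha", "gamma", "beta", "alpha" );

my %seen;

# $seen{$_}++ returns the value before incrementing: undef (false) on the
# first sighting, a positive count afterwards. Negating it keeps firsts only.
my @unique = grep { !$seen{$_}++ } @records;

print "unique: @unique\n";    # unique: alpha beta gamma
print "$_ => $seen{$_}\n" for sort keys %seen;
```

After the loop, `%seen` maps `alpha` to 3, `beta` to 2 and `gamma` to 1, so the hash doubles as a frequency table.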