"Now, as far as I can tell, the best way to do things like this in large files is Tie::File, which loads files into a pseudo-array and helps do operations like this without loading the entire file to memory."
The basic means of reading from a file (read, readline) do not read the entire file into memory, so that aspect of Tie::File is not special. What you are doing by using Tie::File is wasting time and memory for features you don't even need.
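For illustration, here is a minimal sketch of that plain readline approach, assuming the task is just to copy the lines that pass some filter into a new file. The file names and the filter condition are made up for the example; only the read-one-line-at-a-time pattern is the point:

    use strict;
    use warnings;

    # Hypothetical file names, for illustration only.
    open my $in,  '<', 'big_input.txt'       or die "Can't open input: $!";
    open my $out, '>', 'filtered_output.txt' or die "Can't open output: $!";

    while (my $line = <$in>) {      # readline: holds only one line in memory
        next if $line =~ /^#/;      # example filter: skip comment lines
        print {$out} $line;
    }

    close $out or die "Can't close output: $!";

No matter how large the input is, only the current line is in memory at any moment.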
"I'm not sure if the foreach my $list( sort( @lists ) ){ line means that I will get alphabetically sorted output at the end, but I suspect it does, which wouldn't be ideal."
sort @tied would cause all of the file to be loaded into memory so it can be passed to sort. Thankfully, you're not passing the tied array there, but that probably means you are doing remove_duplicate_from_array(@tied).
remove_duplicate_from_array(@tied) would cause all of the file to be loaded into memory so it can be placed in @_. Then the first thing you do in remove_duplicate_from_array is to create a copy of @_.
ouch.
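If the goal is just to remove duplicate lines, here is a hedged sketch of doing it as a stream instead of passing the whole file as a list to a sub. This is not the code from the thread; the file names and the %seen hash are assumptions made for the example:

    use strict;
    use warnings;

    # Hypothetical file names, for illustration only.
    open my $in,  '<', 'big_input.txt'      or die "Can't open input: $!";
    open my $out, '>', 'deduped_output.txt' or die "Can't open output: $!";

    my %seen;
    while (my $line = <$in>) {
        # keep only the first occurrence of each line
        print {$out} $line unless $seen{$line}++;
    }

    close $out or die "Can't close output: $!";

The caveat is that %seen keeps one key per distinct line, so memory use grows with the number of unique lines rather than with the total file size.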
In reply to Re: Filtering very large files using Tie::File by ikegami, in thread Filtering very large files using Tie::File by elef