I'm working with a hash containing approximately 16 million key-value pairs; the keys are short strings and the values are integers.
While the hash is being generated, I see memory usage rise to about 3.1GB. I have tried several ways to sort and print the hash, but all of them increase memory usage significantly. If I could avoid this increase, I could obviously work with a much larger hash. Thanks in advance for any suggestions or explanations.
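To give an idea of the data, here is a simplified, hypothetical sketch of how such a hash gets built (not my actual code; it just shows the shape: short string keys, integer counts):

    use strict;
    use warnings;

    # Hypothetical sketch only: count occurrences of short strings read from
    # standard input, ending up with roughly 16 million distinct keys.
    my %summedNgrams;
    while (my $line = <STDIN>) {
        chomp $line;
        $summedNgrams{$line}++;
    }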
Here's what I've tried...
    print %hash;

This added 1.6GB to memory usage and took about 6 minutes to execute, but the output format is unhelpful.
    my @keys = sort { $hash{$b} <=> $hash{$a} } keys %hash;
    foreach my $key (@keys) { print "$key\t$hash{$key}\n"; }

This added 2GB to memory usage and took about 7 minutes, and it gives exactly the sorted output I want.
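For what it's worth, the same sorted print can also be written without keeping a separate @keys array, though I assume sort still builds temporary lists of all the keys internally, so the memory spike may be similar (a sketch I haven't timed):

    # Sorted print without storing the sorted keys in a named array.
    # sort presumably still materializes the full key list internally,
    # so this only avoids keeping @keys around afterwards.
    foreach my $key (sort { $hash{$b} <=> $hash{$a} } keys %hash) {
        print "$key\t$hash{$key}\n";
    }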
    print map { "$_ $summedNgrams{$_}\n" } keys %summedNgrams;

This added 3GB to memory usage and took about 8 minutes; the output is readable but unsorted.
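One variation I haven't tried yet: iterating with each, which as I understand it walks the hash in place without building a list of keys, at the cost of unsorted output (untested sketch):

    # Walk the hash pair by pair with each(); no separate key list is built,
    # so this should add little to memory usage, but the order is arbitrary.
    while (my ($ngram, $count) = each %summedNgrams) {
        print "$ngram\t$count\n";
    }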
Running Perl on Cygwin64 under Windows 7. Timings are approximate; memory usage was observed in Windows Task Manager.