in reply to Re^4: Advice for optimizing lookup speed in gigantic hashes
in thread Advice for optimizing lookup speed in gigantic hashes
Try this:
```perl
#! perl -slw
use strict;
use threads;

# Load the dictionary (one word per line) and build a key-only lookup hash
# with a single slice assignment: the values are undef, but exists() works.
my @words = do{ local @ARGV = 'your.dictionary'; <> };
chomp @words;
my %dict;
undef @dict{ @words };

my $errors = 0;

# Open the first file, then overlap work: while the current file is being
# scanned, a background thread opens the next one.
my $savedfh;
my @files = glob "*";
open $savedfh, '<', shift( @files ) or die $!;

for my $file ( @files ) {
    my $fh = $savedfh;
    my $thread = async{ open my $fh, '<', $file or die $!; $fh };

    # Split each line on whitespace and count words not in the dictionary.
    map +( exists $dict{$_} || ++$errors ), split while <$fh>;

    $savedfh = $thread->join;
}

# The file opened by the final background thread still needs to be scanned.
map +( exists $dict{$_} || ++$errors ), split while <$savedfh>;

print $errors;
```
There are three attempted optimisations going on here:

- The dictionary is loaded in a single pass and the lookup hash is built with one slice assignment (`undef @dict{ @words };`), which creates the keys without storing any values. It also allows strict & warnings, which I prefer. (A standalone sketch of this key-only-hash idiom follows this list.)
- Each input line is handled with a single `split`, and the resulting words are checked against the hash with `exists`, counting any that are missing in `$errors`.
- While the current file is being processed, a background thread opens the next one, so the cost of the open overlaps with useful work. This is probably where most of your time is going: the initial lookup of a file, especially if it is in a large directory, is often quite expensive in terms of time. This will only be effective if you are processing a few large files rather than zillions of small ones. (A string-based read-ahead variation is sketched at the end of this reply.)
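For the first point, here is a minimal, self-contained sketch (my illustration, not part of the original post; the word lists are invented) showing that a hash built with a single slice assignment of undef values still answers `exists` lookups:

```perl
use strict;
use warnings;

# Build a key-only lookup hash in one slice assignment.
# Every value is undef, but exists() still reports each key as present.
my @words = qw( alpha beta gamma );        # stand-in for the slurped dictionary
my %dict;
undef @dict{ @words };

for my $candidate ( qw( alpha delta gamma ) ) {
    print exists $dict{ $candidate }
        ? "$candidate: in dictionary\n"
        : "$candidate: NOT in dictionary\n";
}
```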
Try it and see what, if any, benefit you derive. If it is effective and you want to understand more, just ask.
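As an aside (my sketch, not from the post above): if handing the lexical filehandle back through `join` proves awkward on a given build (filehandles are generally fiddly to move between ithreads), the same read-ahead idea can pass plain strings instead, with the background thread slurping the next file and returning its contents. The file names and the word-counting body below are placeholders:

```perl
use strict;
use warnings;
use threads;

# Read-ahead variant: while the current file's contents are being scanned,
# a background thread slurps the next file and returns it as a plain string,
# which copies cleanly back through join().
sub slurp {
    my $name = shift;
    open my $fh, '<', $name or die "$name: $!";
    local $/;                                # read the whole file at once
    return scalar <$fh>;
}

my @files = glob "*.txt";                    # placeholder file set
exit 0 unless @files;

my $current = slurp( shift @files );
my $words   = 0;

for my $file ( @files ) {
    my $next = async { slurp( $file ) };     # pre-read the next file
    $words  += () = split ' ', $current;     # stand-in for the real per-word work
    $current = $next->join;
}
$words += () = split ' ', $current;          # contents fetched by the final thread

print "$words words\n";
```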
Replies are listed 'Best First'.
Re^6: Advice for optimizing lookup speed in gigantic hashes
by tobek (Novice) on Aug 23, 2011 at 14:41 UTC
by BrowserUk (Patriarch) on Aug 23, 2011 at 15:00 UTC