Hmm, does this represent everything the program needs to do? Because if so, I note that you only ever use the first field of each line: you can load the first field of every line of both files into memory, and avoid the painfully slow re-reading. Something like (untested):
my $hold = readfile('holds');
my $copy = readfile('copies');

# report every key that appears in both files
for my $key (keys %$hold) {
    if ($copy->{$key}) {
        print "$key: hold and copy (or copy and hold)\n";
    }
}

# return a hashref mapping each first field to the number of lines it appears on
sub readfile {
    my $file = shift;
    my $hash = {};
    open(my $fh, '<', $file) or die "$file: $!";
    local $_;
    while (<$fh>) {
        # fields are '|' delimited - pick up the first field
        my $key = substr $_, 0, index($_, "|");
        ++$hash->{$key};
    }
    close $fh;
    return $hash;
}
Even if this is only the starting point, and the real code needs to access all the fields, you could for example cache in memory the first field and the offset into the file for each row, and then use seek() to locate the complete record whenever you need it.
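A rough sketch of that offset idea, in case it helps. The names build_index and fetch_record are made up for illustration, and I am assuming the same '|' delimited layout, with possibly several rows per key:

```perl
use strict;
use warnings;

# Build an index mapping each first field to the byte offsets of its rows.
# (build_index is a hypothetical helper name, not from the original code.)
sub build_index {
    my ($file) = @_;
    my %offsets;
    open(my $fh, '<', $file) or die "$file: $!";
    my $pos = tell($fh);                    # offset of the row about to be read
    while (my $line = <$fh>) {
        my ($key) = split /\|/, $line, 2;   # first '|' delimited field
        chomp $key;                         # only matters if the row has no '|'
        push @{ $offsets{$key} }, $pos;
        $pos = tell($fh);                   # offset of the next row
    }
    close $fh;
    return \%offsets;
}

# Read the complete record that starts at the given byte offset.
# (fetch_record is likewise a hypothetical helper name.)
sub fetch_record {
    my ($fh, $offset) = @_;
    seek($fh, $offset, 0) or die "seek: $!";   # 0 means "from start of file"
    my $line = <$fh>;
    chomp $line if defined $line;
    return $line;
}
```

You would call build_index('copies') once, keep a read handle open on the same file, and then call fetch_record($fh, $offset) for each offset stored under the key you are interested in. Only the keys and offsets stay in memory, so this should still fit even when the full records would not.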
Hugo
In reply to Re: Re: Re: Re: many to many join on text files
by hv
in thread many to many join on text files
by aquarium