Well, it doesn't have to lose any data you need to keep - I originally wrote it as $hash->{$key} = 1, and changed it to an increment to preserve the count; it could just as easily store an arrayref of line numbers or file positions for each key. The core concept, though, is that the main super-slow operation here is searching through a file repeatedly for matches on a key. If there is any way to turn that into a hash lookup, it's likely to improve things, as long as the hash fits in memory.
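To make the arrayref variant concrete, here is a minimal sketch, not the code from earlier in the thread. It assumes tab-separated lines with the key in the first field; the sub names build_index and join_lines and the sample data are made up for illustration. It reads one file once to build the index, then streams the other file and does a hash lookup per line instead of a repeated search.

```perl
use strict;
use warnings;

# Build an index: key => [ line numbers on which that key appears ].
# Assumption: the key is the first tab-separated field of each line.
sub build_index {
    my ($fh) = @_;
    my %index;
    while (my $line = <$fh>) {
        chomp $line;
        my ($key) = split /\t/, $line, 2;
        next unless defined $key && length $key;
        push @{ $index{$key} }, $.;    # $. is the current line number of $fh
    }
    return \%index;
}

# Stream the second file once; each line costs one hash lookup
# rather than another pass over the first file.
sub join_lines {
    my ($index, $fh) = @_;
    my @pairs;
    while (my $line = <$fh>) {
        chomp $line;
        my ($key) = split /\t/, $line, 2;
        next unless defined $key && exists $index->{$key};
        # Record [ line in first file, line in second file ] for every match.
        push @pairs, [ $_, $. ] for @{ $index->{$key} };
    }
    return \@pairs;
}

# Illustrative data held in strings so the sketch runs without real files.
my $left  = "a\t1\nb\t2\na\t3\n";
my $right = "a\tx\nc\ty\na\tz\n";
open my $lfh, '<', \$left  or die "Cannot open in-memory file: $!";
open my $rfh, '<', \$right or die "Cannot open in-memory file: $!";

my $index = build_index($lfh);
my $pairs = join_lines($index, $rfh);
print join(',', @$_), "\n" for @$pairs;
```

Storing tell() results instead of $. would give file positions you could seek back to, at the same memory cost per match.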
I think we'd need to see some information about the actual task at hand to know whether there's some reason an in-core hash cannot be used, but unless memory is unusually restricted for a modern-day computer it seems unlikely.
Hugo
In reply to Re^6: many to many join on text files
by hv
in thread many to many join on text files
by aquarium