The hash approach will work, but it will be limited by available memory if the files grow enormous, because every key from the first file is held in a hash. Anyway, if I were to take the hash approach, here's one way I might do it:
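    use strict;
    use warnings;

    my %indices;

    # Index the first field of every line in the first file.
    open my $primary, '<', 'filename.txt' or die $!;
    while ( my $line = <$primary> ) {
        chomp $line;    # keep a trailing newline out of the key
        my $key = ( split /,/, $line )[0];
        next unless defined $key;    # skip empty lines
        # The following line is wrong.
        # $indices{$line} = 0;
        # Here's the correct line...
        $indices{$key} = 0;
    }
    close $primary;

    # Count how often each indexed key appears in the second file.
    open my $secondary, '<', 'filename2.txt' or die $!;
    while ( my $line = <$secondary> ) {
        chomp $line;
        my $key = ( split /,/, $line )[0];
        next unless defined $key;
        if ( exists $indices{$key} ) {
            $indices{$key}++;
        }
    }
    close $secondary;

    # Report only the keys that were found at least once.
    foreach ( keys %indices ) {
        if ( $indices{$_} > 0 ) {
            print "$_ from the first file was found ", $indices{$_},
                " times in the second file.\n";
        }
    }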
That's one way to do it. If your files are going to grow big enough for memory to become a concern, you would need an approach that doesn't attempt to hold the whole index in memory at once. A lightweight database like SQLite could be helpful in that regard.
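For what it's worth, here is a rough sketch of how the SQLite idea might look. It assumes the DBI and DBD::SQLite modules are installed from CPAN; the sub name count_matches, the table name indices, and the column names item_key and hits are all made up for illustration. Treat it as a starting point under those assumptions, not a tuned solution.

```perl
use strict;
use warnings;
use DBI;    # assumption: DBI and DBD::SQLite are installed from CPAN

# Hypothetical helper: counts how often the first field of each line in
# $primary_file shows up as the first field of a line in $secondary_file.
# The index lives in the SQLite file $db_file instead of a Perl hash.
sub count_matches {
    my ( $primary_file, $secondary_file, $db_file ) = @_;

    my $dbh = DBI->connect( "dbi:SQLite:dbname=$db_file", '', '',
        { RaiseError => 1, AutoCommit => 1 } );

    $dbh->do('DROP TABLE IF EXISTS indices');
    $dbh->do( 'CREATE TABLE indices ('
            . 'item_key TEXT PRIMARY KEY, hits INTEGER NOT NULL DEFAULT 0)' );

    # Load the keys from the first file; one transaction keeps inserts fast.
    my $insert =
        $dbh->prepare('INSERT OR IGNORE INTO indices (item_key) VALUES (?)');
    $dbh->begin_work;
    open my $primary, '<', $primary_file
        or die "Cannot open $primary_file: $!";
    while ( my $line = <$primary> ) {
        chomp $line;
        my $key = ( split /,/, $line )[0];
        next unless defined $key;    # skip empty lines
        $insert->execute($key);
    }
    close $primary;
    $dbh->commit;

    # Bump the counter for every key in the second file that is indexed.
    my $update =
        $dbh->prepare('UPDATE indices SET hits = hits + 1 WHERE item_key = ?');
    $dbh->begin_work;
    open my $secondary, '<', $secondary_file
        or die "Cannot open $secondary_file: $!";
    while ( my $line = <$secondary> ) {
        chomp $line;
        my $key = ( split /,/, $line )[0];
        next unless defined $key;
        $update->execute($key);      # touches no rows if the key is unknown
    }
    close $secondary;
    $dbh->commit;

    # Return [ key, hits ] pairs for keys found at least once.
    my $rows = $dbh->selectall_arrayref(
        'SELECT item_key, hits FROM indices WHERE hits > 0 ORDER BY item_key');
    $dbh->disconnect;
    return $rows;
}

# Example use when two file names are given on the command line.
if ( @ARGV == 2 ) {
    for my $row ( @{ count_matches( @ARGV, 'indices.sqlite' ) } ) {
        print "$row->[0] from the first file was found $row->[1] ",
            "times in the second file.\n";
    }
}
```

The trade-off is speed for memory: each lookup goes through the database rather than a hash, but the index stays on disk. Note that the sketch still pulls the final result set into memory with selectall_arrayref; if even the matches are too many to hold, you could prepare the SELECT and fetch one row at a time instead.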
Dave
In reply to Re: Matching hashes by davido, in thread Matching hashes by ada