in reply to Matching hashes
The hash approach will work, but it will be memory-bound if the files grow enormous. Anyway, if I were to take the hash approach, here's one way I might do it:
    use strict;
    use warnings;

    my %indices;

    # Index the first comma-separated field of every line in the first file.
    open my $primary, '<', 'filename.txt' or die $!;
    while ( my $line = <$primary> ) {
        chomp $line;    # keep the newline out of single-field keys
        my $key = ( split /,/, $line )[0];
        # The following line is wrong.
        # $indices{$line} = 0;
        # Here's the correct line...
        $indices{$key} = 0;
    }
    close $primary;

    # Count how often each indexed key shows up in the second file.
    open my $secondary, '<', 'filename2.txt' or die $!;
    while ( my $line = <$secondary> ) {
        chomp $line;
        my $key = ( split /,/, $line )[0];
        if ( exists $indices{$key} ) {
            $indices{$key}++;
        }
    }
    close $secondary;

    # Report only the keys that were seen at least once.
    foreach ( keys %indices ) {
        if ( $indices{$_} > 0 ) {
            print "$_ from the first file was found ", $indices{$_},
                " times in the second file.\n";
        }
    }
That's one way to do it. If your files are going to grow big enough for memory to become a concern, you would need an approach that doesn't try to hold the whole index in memory at once. A lightweight database like SQLite could be helpful in that regard.
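For what it's worth, the SQLite idea could be sketched roughly as follows. This is only an illustration, and it assumes the CPAN modules DBI and DBD::SQLite are installed. The table names (primary_keys, secondary_keys), the column name k, and the helper subs load_keys and count_matches are all made up for this example. The keys live in an on-disk database file, and the counting is pushed into a SQL join, so Perl never holds the whole index in a hash.

```perl
use strict;
use warnings;
use DBI;    # assumption: DBI and DBD::SQLite are installed from CPAN

# Hypothetical helper: run $insert_sql once for the first
# comma-separated field of every line in $file.
sub load_keys {
    my ( $dbh, $insert_sql, $file ) = @_;
    open my $fh, '<', $file or die "Cannot open $file: $!";
    my $sth = $dbh->prepare($insert_sql);
    $dbh->begin_work;    # a single transaction keeps bulk inserts fast
    while ( my $line = <$fh> ) {
        chomp $line;
        my $key = ( split /,/, $line )[0];
        next unless defined $key;    # skip empty lines
        $sth->execute($key);
    }
    $dbh->commit;
    close $fh;
    return;
}

# Hypothetical helper: returns an array ref of [ key, count ] pairs for
# every key of the first file that occurs at least once in the second.
sub count_matches {
    my ( $db_file, $primary_file, $secondary_file ) = @_;
    my $dbh = DBI->connect( "dbi:SQLite:dbname=$db_file", '', '',
        { RaiseError => 1, AutoCommit => 1 } );

    $dbh->do('DROP TABLE IF EXISTS primary_keys');
    $dbh->do('DROP TABLE IF EXISTS secondary_keys');
    $dbh->do('CREATE TABLE primary_keys (k TEXT PRIMARY KEY)');
    $dbh->do('CREATE TABLE secondary_keys (k TEXT)');

    # Duplicate keys in the first file collapse to one row, like hash keys.
    load_keys( $dbh, 'INSERT OR IGNORE INTO primary_keys (k) VALUES (?)',
        $primary_file );
    # Every occurrence in the second file is kept so it can be counted.
    load_keys( $dbh, 'INSERT INTO secondary_keys (k) VALUES (?)',
        $secondary_file );
    $dbh->do('CREATE INDEX secondary_k ON secondary_keys (k)');

    my $rows = $dbh->selectall_arrayref(
        'SELECT p.k, COUNT(*) FROM primary_keys p '
      . 'JOIN secondary_keys s ON s.k = p.k '
      . 'GROUP BY p.k ORDER BY p.k'
    );
    $dbh->disconnect;
    return $rows;
}
```

Calling count_matches( 'index.sqlite', 'filename.txt', 'filename2.txt' ) and looping over the returned pairs would give the same report as the hash version, with index.sqlite being an arbitrary scratch file name. Whether this actually beats the hash depends on your data, so treat it as a starting point rather than a drop-in replacement.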
Dave