in reply to Comparing two files
Problem Statement: Take the values from two files and find those values that are the same. Print those to a third file.
Problem Solution: Since it seems that you just want to find all values in file 2 that exist in file 1, you don't really care whether there are duplicates in either file. Existence is the important thing. So, it sounds like hashes are your friend here.
<code>
# chomp so the "\n" added on output isn't doubled
chomp(my @file1 = <FILE1>);
chomp(my @file2 = <FILE2>);

# Here is where you would normalize the data. Things like uc, lc,
# ucfirst, s/\s//g, and the like.

my (%file1, %file2);
$file1{$_} = 1 foreach @file1;
$file2{$_} = 1 foreach @file2;

# Values that appear in both files
foreach my $value (sort keys %file1) {
    if ($file2{$value}) {
        print FILE3 "$value\n";
    }
}
</code>
By looking at things this way, you can then easily find out which values in file 2 aren't in file 1.
<code>
# Using the same data structures as above ...
foreach my $value (sort keys %file2) {
    unless ($file1{$value}) {
        print FILE3 "$value\n";
    }
}
</code>
Also, this lends itself to counting the instances, if you expect duplicates and you care. You can change the hash-populating part to
<code>
my (%file1, %file2);
$file1{$_}++ foreach @file1;
$file2{$_}++ foreach @file2;
</code>
You could then do a <code>print FILE3 "$value : $file1{$value}\n";</code> to write each value along with its count in file 1.
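If you want to try the idea without setting up any files first, here is a minimal self-contained sketch. It is only an illustration, not the one true way: the tally helper, the sample data, and the choice of normalization (trim whitespace, lowercase) are all my own assumptions, so swap in whatever fits your data.

```perl
use strict;
use warnings;

# Hypothetical helper: count how often each normalized line occurs.
# The normalization here (trim + lowercase) is an assumption.
sub tally {
    my @lines = @_;              # copy, so the caller's data is untouched
    my %count;
    for my $line (@lines) {
        $line =~ s/^\s+|\s+$//g; # strip surrounding whitespace and newline
        next unless length $line;
        $count{ lc $line }++;
    }
    return %count;
}

# Made-up sample data standing in for <FILE1> and <FILE2>.
my @file1 = ("apple\n", "Banana\n", "cherry\n", "apple\n");
my @file2 = ("banana\n", "cherry\n", "date\n");

my %file1 = tally(@file1);
my %file2 = tally(@file2);

# Values present in both files.
my @common = grep { exists $file2{$_} } sort keys %file1;

# Values in file 2 that never appear in file 1.
my @only_in_2 = grep { !exists $file1{$_} } sort keys %file2;

print "in both: $_ (seen $file1{$_} time(s) in file 1)\n" for @common;
print "only in file 2: $_\n" for @only_in_2;
```

Using exists rather than testing the value keeps the check correct even if you later store something false (like 0) in the hash.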
------
We are the carpenters and bricklayers of the Information Age.
Don't go borrowing trouble. For programmers, this means Worry only about what you need to implement.
Re: Re: Comparing two files
by bman (Sexton) on Sep 10, 2001 at 19:17 UTC
by dragonchild (Archbishop) on Sep 10, 2001 at 19:39 UTC
by bman (Sexton) on Sep 13, 2001 at 16:23 UTC
by dragonchild (Archbishop) on Sep 13, 2001 at 16:57 UTC
by bman (Sexton) on Sep 13, 2001 at 17:33 UTC