Re: combining 2 files with a comon field

Hm, how about a hash per file and combine them on write?

use strict;
use warnings;

open FILE1, '<', 'file1.txt' or die ($!);
open FILE2, '<', 'file2.txt' or die ($!);

#- my %file1 = map { split '\|', $_ } <FILE1>;
my %file1 = map { chomp && s/\|$//g && split '\|', $_, 2 } <FILE1>;
#- my %file2 = map { split '\|', $_ } <FILE2>;
my %file2 = map { chomp && s/\|$//g && split '\|', $_, 2 } <FILE2>;
## we now have A1=>'dog' in one hash, and A1=>'Fido' in the other

close FILE1;
close FILE2;

open FILE3, '>', 'file3.txt' or die ($!);
for (sort keys %file1) {
   print FILE3 join('|',$_,$file1{$_},$file2{$_}),'|',"\n";
}
close FILE3;

untested

Simply put, create a hash "map" of each file, then walk the keys of the first hash and print each key with the matching values from both hashes.

Caveats:

If a key is missing from %file1, that record is left out of the output.
If a key is missing from %file2, you'll get a warning about printing an undefined value.
This makes assumptions about the file format.
Update: Files that are not well-formed will cause problems. For this and other reasons, there needs to be better error-checking.
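
The first two caveats can be handled by walking the union of both key sets and substituting a placeholder for a missing value. A minimal sketch, assuming an empty field is an acceptable placeholder; `merge_records` is a hypothetical helper, not part of the code above:

```perl
use strict;
use warnings;

# Hypothetical helper: combine two key=>value hashes into "key|val1|val2|" lines.
# Assumption: a key missing from either hash gets an empty field.
sub merge_records {
    my ($first, $second) = @_;

    # Union of the keys from both hashes
    my %seen = map { $_ => 1 } keys %$first, keys %$second;

    my @lines;
    for my $key (sort keys %seen) {
        my $v1 = defined $first->{$key}  ? $first->{$key}  : '';
        my $v2 = defined $second->{$key} ? $second->{$key} : '';
        push @lines, join('|', $key, $v1, $v2) . '|';
    }
    return @lines;
}

my %file1 = (A1 => 'dog',  A2 => 'cat');
my %file2 = (A1 => 'Fido', A3 => 'Tweety');
print "$_\n" for merge_records(\%file1, \%file2);
# prints:
#   A1|dog|Fido|
#   A2|cat||
#   A3||Tweety|
```

Whether an empty field, a marker such as "N/A", or dropping the record entirely is the right choice depends on what reads the output file.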

None of those are unconquerable, but they are things to consider if you're taking the idea for production code.

Update: modified the code based on the thread below. Lines commented with '#-' are the old ones. A better thing to do than cheat with the file slurp might be something like:

my %file1;
while (<FILE1>) {
  chomp; s/\|[\s]*$//;
  my ($key, $val) = split '\|', $_, 2;
  $file1{$key} = $val;
}

Of course, that's not nearly as fun...
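
For the error-checking mentioned in the caveats, the while loop can be wrapped in a sub that skips lines it cannot parse. This is only a sketch: `load_records` is a hypothetical name, and treating a line with no value field as malformed is an assumption about the format.

```perl
use strict;
use warnings;

# Hypothetical helper: read "key|value|" lines from an open filehandle into a hash.
# Assumption: a line with no value left after the trailing '|' is stripped is
# malformed and gets skipped with a warning.
sub load_records {
    my ($fh, $name) = @_;
    my %records;
    while (my $line = <$fh>) {
        chomp $line;
        $line =~ s/\|\s*$//;          # drop the trailing '|' and any stray whitespace
        next if $line =~ /^\s*$/;     # ignore blank lines
        my ($key, $val) = split /\|/, $line, 2;
        unless (defined $val) {
            warn "$name line $.: no value field, skipping\n";
            next;
        }
        warn "$name line $.: duplicate key '$key'\n" if exists $records{$key};
        $records{$key} = $val;
    }
    return %records;
}

# Demo on an in-memory file; the last line deliberately has no newline
my $data = "A1|dog|\nA2|cat|\nA3|bird|";
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";
my %file1 = load_records($fh, 'file1.txt');
close $fh;
print "$_ => $file1{$_}\n" for sort keys %file1;
```

Because chomp and the substitution are separate statements here, a final line without a newline is parsed like any other.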

The Eightfold Path: 'use warnings;', 'use strict;', 'use diagnostics;', perltidy, CGI or CGI::Simple, try the CPAN first, big modules and small scripts, test first.


Replies are listed 'Best First'.
Re^2: combining 2 files with a comon field by jjohhn (Scribe) on May 18, 2005 at 16:18 UTC
I tried a variation of your suggestion, with an added debug print line:

use strict;
use warnings;

open FILE1, '<', 'file1' or die ($!);

my %file1 = map { split '\|', $_ } <FILE1>;
## we now have A1=>'dog' in hash
close FILE1;

for(sort keys %file1){
    print "$_\n";
    print join('|',$_,$file1{$_}),'|',"\n";
}

and got:

Odd number of elements in hash assignment at combine2.pl line 6, <FILE1> line 3.
Use of uninitialized value in join or string at combine2.pl line 13.
||
A1
A1|dog|
A3
A3|bird|
cat
cat|
|

file1 is:

A1|dog|
A2|cat|
A3|bird|
Re^3: combining 2 files with a comon field by radiantmatrix (Parson) on May 18, 2005 at 20:42 UTC
Well, I did say it was untested. ;-)

# my %file1 = map { split '\|', $_ } <FILE1>;
my %file1 = map { chomp && s/\|$//g && split '\|', $_, 2 } <FILE1>;

Do make the same changes to the %file2 hash statement in the original code, too. This clears the | and newline at the end of each line before splitting, and limits the split to two parts. Hope that helps!

BTW, this is a good example of how using warnings and strict points out where the bugs are. I knew what the issue was as soon as I saw those two warning statements. ;-)

Also, make sure that your file is well-formed: that is, it ends with a newline, or you might get interesting results. This should not be done in production without some better error control...
Re^3: combining 2 files with a comon field by BerniBoy (Acolyte) on May 18, 2005 at 19:43 UTC
Hmm, seems to me like the "map" statement also picks up the newline character after the last "|" on each line and tries to insert it into the hash somehow. Just a guess from what I see here...
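
The newline guess is easy to check in isolation. A small sketch, using one line in the format of the file1 sample from this thread (the variable names are only for illustration):

```perl
use strict;
use warnings;

my $line = "A1|dog|\n";             # one line as read from file1, newline included

my @raw = split /\|/, $line;        # ('A1', 'dog', "\n"): three fields, not two
print scalar(@raw), " fields before cleanup\n";

(my $clean = $line) =~ s/\|\s*$//;  # strip the trailing '|' and the newline
my @fixed = split /\|/, $clean, 2;  # ('A1', 'dog')
print scalar(@fixed), " fields after cleanup\n";
# prints:
#   3 fields before cleanup
#   2 fields after cleanup
```

Three fields per line is what throws the key/value pairing off: with three lines the map returns nine elements, which is where the "Odd number of elements in hash assignment" warning comes from.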