in reply to combining 2 files with a comon field

Hm, how about a hash per file and combine them on write?

use strict;
use warnings;

open FILE1, '<', 'file1.txt' or die ($!);
open FILE2, '<', 'file2.txt' or die ($!);

#- my %file1 = map { split '\|', $_ } <FILE1>;
my %file1 = map { chomp && s/\|$//g && split '\|', $_, 2 } <FILE1>;
#- my %file2 = map { split '\|', $_ } <FILE2>;
my %file2 = map { chomp && s/\|$//g && split '\|', $_, 2 } <FILE2>;
## we now have A1=>'dog' in one hash, and A1=>'Fido' in the other

close FILE1;
close FILE2;

open FILE3, '>', 'file3.txt' or die ($!);

for (sort keys %file1) {
    print FILE3 join('|',$_,$file1{$_},$file2{$_}),'|',"\n";
}

close FILE3;
untested

Simply put, create a hash "map" of each file, then find where the keys intersect and print out the result.
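
For what it's worth, here is a rough sketch of the "find where the keys intersect" step spelled out, since looping over one hash's keys alone would also print keys the other file lacks. The helper names parse_lines and combine are made up for illustration, and the second data set (including "Tweety") is invented sample input; the lines are passed in as strings so the sketch runs without any files.

```perl
use strict;
use warnings;

# Hypothetical helper: turn lines like "A1|dog|\n" into a key => value hash.
sub parse_lines {
    my %h;
    for my $line (@_) {
        (my $copy = $line) =~ s/\|?\s*$//;    # drop trailing pipe and newline
        my ($key, $val) = split /\|/, $copy, 2;
        $h{$key} = $val if defined $key && defined $val;
    }
    return %h;
}

# Hypothetical helper: build output lines only for keys found in both hashes.
sub combine {
    my ($h1, $h2) = @_;
    return map  { join('|', $_, $h1->{$_}, $h2->{$_}) . '|' }
           grep { exists $h2->{$_} }
           sort keys %$h1;
}

# Invented sample data; A2 has no partner in the second set.
my %file1 = parse_lines("A1|dog|\n", "A2|cat|\n", "A3|bird|\n");
my %file2 = parse_lines("A1|Fido|\n", "A3|Tweety|\n");

print "$_\n" for combine(\%file1, \%file2);
# A1|dog|Fido|
# A3|bird|Tweety|
```

Whether unmatched keys should be dropped, as here, or printed with an empty field is a choice that depends on what the combined file is for.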

Caveats:

None of those are unconquerable, but they are things to consider if you're taking the idea into production code.

Update: modified the code based on the thread below. Comments '#-' are old lines.

A better thing to do than cheat with the file slurp might be something like:
my %file1;
while (<FILE1>) {
    chomp;
    s/\|[\s]*$//;
    my ($key, $val) = split '\|', $_, 2;
    $file1{$key} = $val;
}
Of course, that's not nearly as fun...

The Eightfold Path: 'use warnings;', 'use strict;', 'use diagnostics;', perltidy, CGI or CGI::Simple, try the CPAN first, big modules and small scripts, test first.

Re^2: combining 2 files with a comon field
by jjohhn (Scribe) on May 18, 2005 at 16:18 UTC
    I tried a variation of your suggestion, with an added print debug line:
    use strict;
    use warnings;

    open FILE1, '<', 'file1' or die ($!);

    my %file1 = map { split '\|', $_ } <FILE1>;
    ## we now have A1=>'dog' in hash

    close FILE1;

    for(sort keys %file1){
        print "$_\n";
        print join('|',$_,$file1{$_}),'|',"\n";
    }
    and got:
    Odd number of elements in hash assignment at combine2.pl line 6, <FILE1> line 3.
    Use of uninitialized value in join or string at combine2.pl line 13.

    ||
    A1
    A1|dog|
    A3
    A3|bird|
    cat
    cat|
    |

    file1 is:
    A1|dog|
    A2|cat|
    A3|bird|

      Well, I did say it was untested. ;-)

      # my %file1 = map { split '\|', $_ } <FILE1>;
      my %file1 = map { chomp && s/\|$//g && split '\|', $_, 2 } <FILE1>;
      Do make the same changes to the %file2 hash statement, too, in the original code. This clears the | and newline at the end of each file line before splitting, and limits the split to two parts. Hope that helps!
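
To make the difference concrete, here is a small sketch (the variable names are mine, nothing more) comparing an unlimited split on a raw line with a split on the cleaned-up line. Three fields per line over the three lines of file1 gives nine list elements, which would account for the "Odd number of elements" warning.

```perl
use strict;
use warnings;

my $line = "A1|dog|\n";

# Unlimited split on the raw line: the newline after the last pipe
# survives as a third field.
my @raw = split /\|/, $line;          # ("A1", "dog", "\n")

# Strip the trailing pipe and newline first, then split into at most two fields.
(my $clean = $line) =~ s/\|\s*$//;    # "A1|dog"
my @pair = split /\|/, $clean, 2;     # ("A1", "dog")

printf "raw: %d fields, cleaned: %d fields\n", scalar @raw, scalar @pair;
# raw: 3 fields, cleaned: 2 fields
```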

      BTW, this is a good example of how using warnings and strict points out where the bugs are. I knew what the issue was as soon as I saw those two warning statements. ;-) Also, make sure that your file is well-formed: that is, it ends with a newline, or you might get interesting results.
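
A sketch of one such "interesting result", assuming the last line of the file has no closing newline: chomp removes nothing and returns 0, so the && chain stops before the split and a lone 0 lands in the list. The parse_line helper is a hypothetical alternative that uses separate statements instead of the chain.

```perl
use strict;
use warnings;

# Hypothetical replacement for the && chain: each step is its own
# statement, so a step that changes nothing cannot skip the split.
sub parse_line {
    my ($line) = @_;
    $line =~ s/\s+$//;    # newline or other trailing whitespace, if present
    $line =~ s/\|$//;     # trailing pipe, if present
    return split /\|/, $line, 2;
}

my $last = "A3|bird|";    # final line of a file that lacks a closing newline

# The chained form: chomp removes nothing, returns 0, and && stops there.
my @lines   = ($last);
my @chained = map { chomp($_) && s/\|$// && split /\|/, $_, 2 } @lines;

# The statement-by-statement form still yields the key and the value.
my @separate = parse_line($last);

print "chained:  (@chained)\n";     # chained:  (0)
print "separate: (@separate)\n";    # separate: (A3 bird)
```

The same short-circuit happens if a line has no trailing pipe, since the substitution then reports no match.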

      This should *not* be done in production without some better error control...



      Hmm, it seems to me that the "map" statement also picks up the newline characters after the last "|" character and tries to insert them into the hash somehow.

      Just a guess from what I see here...