in reply to combining 2 files with a comon field

Use a hash, for example a hash of arrays, and fill it with the two files. Finally, print it out.
#! perl -w
my %data;
open IN, "file1.txt" or die "Ugh! $!";
while(<IN>) {
    chomp;
    my($key, $value) = split /\|/ or next;
    $data{$key}[0] = $value;
}
open IN, "file2.txt" or die "Ugh! $!";
while(<IN>) {
    chomp;
    my($key, $value) = split /\|/ or next;
    $data{$key}[1] = $value;
}
{
    open OUT, ">file3.txt" or die "Ugh! $!";
    local($\, $,) = ("|\n", "|");
    local $^W;   # avoid "use of uninitialized value"
    foreach (sort keys %data) {
        print OUT $_, @{$data{$_}}[0, 1];
    }
}

The "or next" skips any empty lines in the input files. Warnings are disabled around the printout so that incomplete records (a key that appears in only one of the two files, leaving the other field undefined) don't trigger "use of uninitialized value".
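As an alternative to switching warnings off for the printout, the undefined fields could be replaced with empty strings before printing. The following is only a sketch under that assumption: the sample data in %data is made up for illustration, and the output format ("key|value1|value2|") mirrors what the script above writes.

```perl
use strict;
use warnings;

# Hypothetical sample data: "b" appears only in the first file,
# "c" only in the second.
my %data = (
    a => [ 1, 'x' ],
    b => [ 2 ],
    c => [ undef, 'z' ],
);

my @records;
for my $key (sort keys %data) {
    # Replace a missing field with an empty string instead of silencing warnings.
    my @fields = map { defined $_ ? $_ : '' } @{ $data{$key} }[0, 1];
    push @records, join('|', $key, @fields) . "|\n";
}

print @records;
```

This keeps warnings enabled for the rest of the block, at the cost of one extra map per record.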

Re^2: combining 2 files with a comon field
by jjohhn (Scribe) on May 18, 2005 at 11:47 UTC
    When I tried this (before seeing your post), I received the error:
    syntax error at C:\scripts\combineCols.pl line 10, near ") {"
    Can't use global $_ in "my" at C:\scripts\combineCols.pl line 12, near "= $_
    Do I have to hard code the names of the files?
    use strict;
    my %hash;
    while(<>){
        (my $first, my $second) = split("|",$_);
        $hash{$first} = $second;
    }
    my $second
    while(<>) {
        my $line = $_
        (my $first, $second) = split("|", $line);
    }
    foreach my $key (keys %hash){
        my @list = ($hash{$key}, $second);
        $joined = "@list";
        $hash{$key} = $joined;
    }
      You forgot a semicolon on the line
      my $second
      oh, and on the line
      my $line = $_
      too.

      That would solve your immediate syntax problem. But it doesn't solve the semantic problem: that it doesn't do what you want. For example, there is no connection between your value for $second and your hash key. That connection is in the value for $first. It'd work somewhat better if you incorporated your final loop body (but without the loop) into the one reading the second file. And you're making the classic newbie error of not backwhacking the "|" for split — and it's easier to use a regex for split, otherwise you'd even have to double the backslash.

      while(<>) {
          my $line = $_;
          my($first, $second) = split(/\|/, $line);   # or: "\\|"
          my @list = ($hash{$first}, $second);
          my $joined = "@list";   # joins with space by default (see $" ); "my" is needed under use strict
          $hash{$first} = $joined;
      }
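To make the point about escaping the pipe concrete, here is a small sketch comparing the three ways of writing the split pattern. The sample line is made up for illustration.

```perl
use strict;
use warnings;

my $line = "key1|value1";

# A string pattern is still compiled as a regex. An unescaped "|" is an
# alternation of two empty patterns, which matches the empty string, so the
# line is split into single characters.
my @wrong = split("|", $line);

# Escaping the pipe makes it a literal delimiter.
my @right = split(/\|/, $line);

# The string form needs a doubled backslash to get the same regex.
my @also_right = split("\\|", $line);

print scalar(@wrong),      " fields: @wrong\n";
print scalar(@right),      " fields: @right\n";
print scalar(@also_right), " fields: @also_right\n";
```

The unescaped version does not raise an error or a warning, which is why the mistake is easy to miss.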
      Missing semicolon after:

      my $second

      and after

      my $line = $_

      why not just:

      (my $first, $second) = split/\|/;
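On the earlier question of whether the file names have to be hard-coded: the first while(<>) loop reads every file named on the command line, so a second while(<>) has nothing left to read from them. One way around this, sketched below, is to take the two names from @ARGV and open each file explicitly. The helper name load_file and the script name in the usage message are hypothetical, and the output format is assumed to match the "key|value1|value2|" layout used earlier in the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: read "key|value" lines from $path and store each
# value in slot $slot of the array kept under its key in %$data.
sub load_file {
    my ($path, $data, $slot) = @_;
    open my $in, '<', $path or die "Cannot open $path: $!";
    while (my $line = <$in>) {
        chomp $line;
        next unless length $line;              # skip empty lines
        my ($key, $value) = split /\|/, $line;
        $data->{$key}[$slot] = $value;
    }
    close $in;
    return;
}

# Usage (assumed script name): perl combine.pl file1.txt file2.txt
if (@ARGV == 2) {
    my %data;
    load_file($ARGV[0], \%data, 0);
    load_file($ARGV[1], \%data, 1);
    for my $key (sort keys %data) {
        my @fields = map { defined $_ ? $_ : '' } @{ $data{$key} }[0, 1];
        print join('|', $key, @fields), "|\n";
    }
}
elsif (@ARGV) {
    die "Usage: $0 file1 file2\n";
}
```

The combined records go to standard output here, so they can be redirected to a third file from the shell.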