in reply to Re^3: Hash w/ multiple values + merging
in thread Hash w/ multiple values + merging

To avoid using "real" files for demonstration purposes, I used Perl's ability to treat a string as a file by passing a reference to the string to open. To open a real file instead, use:

open my $in, '<', $fileName or die "Unable to open $fileName: $!";
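
For comparison, here is a minimal sketch of the string-as-file form. The contents of $sample are made up for illustration; only the reference passed to open matters.

```perl
use strict;
use warnings;

# Sample data held in a string; the values are made up for illustration.
my $sample = "id val1\n452 sdfdf\n";

# Passing a reference to the string makes open treat it as an in-memory file.
open my $in, '<', \$sample or die "Unable to open in-memory file: $!";
my @lines = <$in>;
close $in;

print "Read ", scalar @lines, " lines\n";
```

Swapping \$sample for a file name is the only change needed to read a real file.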

Please follow my advice and use strictures (use strict; use warnings; - see The strictures, according to Seuss), the three-parameter form of open, and lexical file handles (the 'my $in' bit in my sample code). These habits will save you time in the future!

If you open an output file ($out) before the loop in my sample code you can change the print to:

printf $out $format, $key, @{$data{$key}}{@columnNames};

to print to the output file instead of STDOUT. Note that for testing and sample code using STDOUT is often much more convenient!
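
A minimal sketch of switching between the two, assuming the output file name is passed as the first command-line argument (that convention and the sample %data contents are my assumptions, not part of the original code):

```perl
use strict;
use warnings;

# Hypothetical data in the same shape as %data in the sample code.
my %data = (
    452 => {val1 => 'sdfdf', val2 => 'sfgfdg'},
    161 => {val1 => 'aafa',  val2 => 'gte'},
);
my @columnNames = qw(val1 val2);
my $format = (('%-9s ') x (@columnNames + 1)) . "\n";

# Assumption: write to the file named on the command line, else to STDOUT.
my $outName = shift @ARGV;
my $out;
if (defined $outName) {
    open $out, '>', $outName or die "Unable to open $outName: $!";
} else {
    $out = \*STDOUT;
}

for my $key (sort keys %data) {
    printf $out $format, $key, @{$data{$key}}{@columnNames};
}

# Only close the handle if we opened a file ourselves.
if (defined $outName) {
    close $out or die "Unable to close $outName: $!";
}
```

Run without arguments it behaves like the original sample and prints to STDOUT, which keeps testing convenient.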


True laziness is hard work

Re^5: Hash w/ multiple values + merging
by sophix (Sexton) on Feb 08, 2010 at 01:08 UTC
    My mistake. It now works beautifully. May I ask for one last favor, though? I want to keep the header, i.e. the first line of the first file. I looked through the code to find where the first line gets skipped, but I could not figure it out. Here is the working code written by GrandFather:

    #!/usr/bin/perl
    use strict;
    use warnings;

    my $data1 = "/PRBB/input.txt";
    my $data2 = "/PRBB/input2.txt";
    my $data3 = "/PRBB/output.txt";
    my %data;
    my @columnNames;

    #open my $in, '<', \$data1;
    open my $in, '<', $data1 or die "Unable to open $data1: $!";
    push @columnNames, parseFile (\%data, $in);
    close $in;

    #open $in, '<', \$data2;
    open my $in2, '<', $data2 or die "Unable to open $data2: $!";
    push @columnNames, parseFile (\%data, $in2);
    close $in2;

    my $format = (('%-9s ') x (@columnNames + 1)) . "\n";
    open my $out, '>', $data3 or die "Unable to open $data3: $!";

    for my $key (sort keys %data) {
        next if keys %{$data{$key}} != @columnNames;
        printf $out $format, $key, @{$data{$key}}{@columnNames};
    }

    sub parseFile {
        my ($dataRef, $inFile) = @_;
        my $header = <$inFile>;
        my ($keyColumn, @columns) = map {chomp; split} $header;

        while (defined (my $line = <$inFile>)) {
            chomp $line;
            my ($key, @data) = split /\s+/, $line;
            @{$dataRef->{$key}}{@columns} = @data;
        }

        return @columns;
    }

      Hint: look for a variable called '$header'!

      It doesn't seem likely that you do actually want that header though - it doesn't include column titles for the extra data from the other file.
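
      If you do want a header row, here is a minimal sketch of one way to get it. It assumes every input file starts with a header line (so the second file would need something like "id val4"), uses in-memory strings with made-up values in place of real files, and changes parseFile so that it also returns the key column title. That changed return value is my own variation, not the original sub.

```perl
use strict;
use warnings;

# In-memory stand-ins for the two input files (sample values only).
# Assumption: the second file has been given a header line ("id val4").
my $file1 = "id val1 val2 val3\n452 sdfdf sfgfdg asfa\n"
          . "154 afa afafe rghreh\n161 aafa gte fdhd\n";
my $file2 = "id val4\n452 kmfgkmg\n213 adfadfa\n997 afdafa\n161 adaadaada\n";

my (%data, @columnNames, $keyName);

for my $source (\$file1, \$file2) {
    open my $in, '<', $source or die "Unable to open in-memory file: $!";
    my ($keyColumn, @columns) = parseFile (\%data, $in);
    close $in;
    $keyName //= $keyColumn;    # keep the key column title from the first file
    push @columnNames, @columns;
}

my $format = (('%-9s ') x (@columnNames + 1)) . "\n";
my @rows;

# Header row first, built with the same format as the data rows.
push @rows, sprintf $format, $keyName, @columnNames;

for my $key (sort keys %data) {
    next if keys %{$data{$key}} != @columnNames;
    push @rows, sprintf $format, $key, @{$data{$key}}{@columnNames};
}

print @rows;

# Variant of parseFile that also returns the key column title (an assumption).
sub parseFile {
    my ($dataRef, $inFile) = @_;
    my $header = <$inFile>;
    chomp $header;
    my ($keyColumn, @columns) = split ' ', $header;

    while (defined (my $line = <$inFile>)) {
        chomp $line;
        my ($key, @data) = split ' ', $line;
        next if !defined $key;    # skip blank lines
        @{$dataRef->{$key}}{@columns} = @data;
    }

    return ($keyColumn, @columns);
}
```

      Because the header row goes through the same format string as the data rows, the titles line up with their columns.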


        I thought about playing with $header, but wouldn't that cause problems? If I remove it, the script would treat the header line as just another key-value pair, and since it has no match in the second file, it would be dropped. Yes, you are right! However, the (real) second file does not include any column titles at all!

        file 1:
        id val1 val2 val3
        452 sdfdf sfgfdg asfa
        154 afa afafe rghreh
        161 aafa gte fdhd

        file 2:
        (one empty line - no header)
        452 kmfgkmg
        213 adfadfa
        997 afdafa
        161 adaadaada

        So the result file should keep the column titles from the first file, while the second file serves only to match keys and supply the relevant new value.

        resulting file:
        id val1 val2 val3
        452 sdfdf sfgfdg asfa kmfgkmg
        161 aafa gte fdhd adaadaada

        Of course, I could add a header to the second file, such as:

        id val4

        I still have not found a way to keep the header ($header). The resulting file did not include the second line of the first file (the one right under the header). Also, there is a "!" sign at the beginning of the first line. I do not yet know exactly what the trick is, but it worked when I removed that exclamation mark. (It is great that I learn something new, though not thoroughly, every time I compile and try to run the code =))

        Nevertheless, I still have not managed to keep the header. I would appreciate any insight on how to modify $header.

        PS. By the way, I tried filling in the first line of the second file (which was originally empty) so that its first element (the key) matches a key from the first file, but, nope, it did not work. Here is the code:

        #!/usr/bin/perl -w
        use strict;

        my $data1 = "/PRBB/Data1.expr";
        my $data2 = "/PRBB/Data2.acc";
        my $data3 = "/PRBB/onuralp.txt";
        my (%data, @columnNames);

        open my $in, '<', $data1 or die "Unable to open $data1: $!";
        push @columnNames, parseFile (\%data, $in);
        close $in;

        open my $in2, '<', $data2 or die "Unable to open $data2: $!";
        push @columnNames, parseFile (\%data, $in2);
        close $in2;

        my $format = (('%-9s ') x (@columnNames + 1)) . "\n";
        open my $out, '>', $data3 or die "Unable to open $data3: $!";

        for my $key (sort keys %data) {
            next if keys %{$data{$key}} != @columnNames;
            printf $out $format, $key, @{$data{$key}}{@columnNames};
        }

        sub parseFile {
            my ($dataRef, $inFile) = @_;
            my $header = <$inFile>;
            my ($keyColumn, @columns) = map {chomp; split} $header;

            while (defined (my $line = <$inFile>)) {
                chomp $line;
                my ($key, @data) = split /\s+/, $line;
                @{$dataRef->{$key}}{@columns} = @data;
            }

            return @columns;
        }
Re^5: Hash w/ multiple values + merging
by sophix (Sexton) on Feb 08, 2010 at 00:41 UTC
    Thanks a lot, GrandFather. Now I get the following errors, and nothing is printed to the file:

    "my" variable $in masks earlier declaration in same scope (line 19)
    use of uninitialized value $key in hash element (line 26)
    use of uninitialized value $key in printf (line 26)

    #!/usr/bin/perl
    use strict;
    use warnings;

    my $data1 = "/DATA/input.txt";
    my $data2 = "/DATA/input2.txt";
    my $data3 = "/DATA/output.txt";
    my %data;
    my @columnNames;
    my $key;

    open my $in, '<', $data1 or die "Unable to open $data1: $!";    # [line 19]
    push @columnNames, parseFile (\%data, $in);
    close $in;

    open my $in, '<', $data2 or die "Unable to open $data2: $!";
    push @columnNames, parseFile (\%data, $in);
    close $in;

    my $format = (('%-9s ') x (@columnNames + 1)) . "\n";
    open my $out, '>', $data3 or die "Unable to open $data3: $!";
    printf $out $format, $key, @{$data{$key}}{@columnNames};        # (line 26)

    for my $key (sort keys %data) {
        next if keys %{$data{$key}} != @columnNames;
        printf $format, $key, @{$data{$key}}{@columnNames};
    }

    sub parseFile {
        my ($dataRef, $inFile) = @_;
        my $header = <$inFile>;
        my ($keyColumn, @columns) = map {chomp; split} $header;

        while (defined (my $line = <$inFile>)) {
            chomp $line;
            my ($key, @data) = split /\s+/, $line;
            @{$dataRef->{$key}}{@columns} = @data;
        }

        return @columns;
    }
    It does print the output on the DOS screen, though (but not into the file).
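
    For reference, a minimal sketch of what the output section probably needs to look like. The "masks earlier declaration" warning comes from declaring my $in twice in the same scope. The uninitialized $key messages come from the printf placed outside the loop, where the file-level $key has no value. The printf inside the loop has no $out, so it writes to STDOUT. The sketch below assumes already-parsed sample data and writes to an in-memory buffer in place of the real output file so that it runs anywhere.

```perl
use strict;
use warnings;

# Hypothetical already-parsed data standing in for the result of parseFile.
my %data = (
    161 => {val1 => 'aafa',  val4 => 'adaadaada'},
    452 => {val1 => 'sdfdf', val4 => 'kmfgkmg'},
    154 => {val1 => 'afa'},    # no match in the second file, so it is skipped
);
my @columnNames = qw(val1 val4);
my $format = (('%-9s ') x (@columnNames + 1)) . "\n";

# An in-memory buffer stands in for the output file (an assumption for the
# sketch); with a real file, pass the file name instead of \$buffer.
my $buffer = '';
open my $out, '>', \$buffer or die "Unable to open output: $!";

# $key exists only inside the loop, and the printf names $out explicitly.
for my $key (sort keys %data) {
    next if keys %{$data{$key}} != @columnNames;
    printf $out $format, $key, @{$data{$key}}{@columnNames};
}

close $out or die "Unable to close output: $!";

print $buffer;
```

    The two input handles also need distinct names (or the second open should reuse $in without my) to silence the first warning.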