
Thank you a lot! It looks like a ready-made solution for me!
I still have some trouble with hashes (but I am learning...), so I have one more question:

-----------
How do I fill the HoA with the values? Will the following code work for me?

# define files containing tables (tab-delimited text files)
my @files = ['tab1.txt','tab2.txt'];

# Define master hash "$ptables"
my %ptables;

#read files and add data to the HoA:
foreach my $file (@files) {
    my @string_array = Read_File($file);        # define arrays of strings
    my $hash_key =~ s/\.txt//;                  # generate hash key
    $ptables{$hash_key} = [ @string_array ];    # save all strings to the hash
}

Re^3: Merging/Rearranging Tables
by liverpole (Monsignor) on Feb 11, 2007 at 02:58 UTC
        "Thank you a lot!"

    You're welcome a lot!

        "Will the following code work for me?"

    Well, I'm tempted to ask -- "what happens when you try?".  The best way to learn is, after all, by trying.

    I do see that you're trying to initialize @files from an array reference, ['tab1.txt','tab2.txt'], which likely won't do what you're expecting (you'll get a single item in @files, which itself is a reference to the 2-item array).

    Better to declare it like this:

    my @files = ('tab1.txt','tab2.txt');

    # Or, using the "quote-word" function "qw",
    # which lets you omit the quotes and the comma:
    my @files = qw( tab1.txt tab2.txt );
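
    To see the difference for yourself, here is a minimal sketch that counts the elements each form produces. The variable names (@wrong, @right, @names) are only illustrative and do not come from the original script.

```perl
use strict;
use warnings;

# Square brackets build ONE scalar: a reference to an anonymous array.
my @wrong = ['tab1.txt', 'tab2.txt'];

# Parentheses (or qw) build a plain list of two strings.
my @right = qw( tab1.txt tab2.txt );

my $wrong_count = scalar @wrong;    # number of elements in @wrong
my $right_count = scalar @right;    # number of elements in @right
my $wrong_type  = ref $wrong[0];    # 'ARRAY' if the element is an array reference

print "wrong: $wrong_count element(s), first is a reference of type '$wrong_type'\n";
print "right: $right_count element(s): @right\n";

# If you really do hold an array reference, dereference it to get the names back.
my @names = @{ $wrong[0] };
print "dereferenced: @names\n";
```

    With the bracket form, a foreach over @files would run once and hand you the reference itself rather than a file name.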

    Furthermore, you're creating a variable $hash_key which you're never assigning to, but rather trying to perform a regex substitution on with:

    my $hash_key =~ s/\.txt//; # generate hash key

    That's why, when I run it with use strict; and use warnings; I get the warning:

    Use of uninitialized value in substitution (s///) at merge.pl line 16.

    So I'm assuming what you want instead is to assign the filename $file to it, and then perform the substitution to get the resulting hash key:

    (my $hash_key = $file) =~ s/\.txt//; # generate hash key
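
    As a small, self-contained sketch of that idiom (the file name is hypothetical, and anchoring the pattern with $ is my own addition so that only a trailing ".txt" is removed):

```perl
use strict;
use warnings;

my $file = 'tab1.txt';    # hypothetical file name

# Copy first, then substitute on the copy; $file itself is left untouched.
(my $hash_key = $file) =~ s/\.txt$//;

# Same result with the non-destructive /r flag (needs Perl 5.14 or newer).
my $hash_key_r = $file =~ s/\.txt$//r;

print "$file -> $hash_key\n";
print "$file -> $hash_key_r\n";
```

    The parentheses matter: without them the substitution would be applied to $file, and $hash_key would receive the substitution's return value instead of the modified string.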

    A final thought:  make liberal use of Data::Dumper to see what a given data structure contains at any time.  For example, to see the entire contents of %ptables after making each assignment:

    $ptables{$hash_key} = [ @string_array ];    # save all strings to the hash
    print Dumper(\%ptables);                    # use "\" to pass reference of hash
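
    For completeness, a runnable sketch of dumping a small HoA. Note that Dumper is only available after "use Data::Dumper;". The rows below are made-up stand-ins for what would be read from the files, and setting $Data::Dumper::Sortkeys is an optional convenience I've added so the key order is stable between runs.

```perl
use strict;
use warnings;
use Data::Dumper;

$Data::Dumper::Sortkeys = 1;    # print hash keys in sorted order, so output is stable

# Hypothetical rows standing in for what would be read from the files.
my %ptables = (
    tab1 => [ 'ID column | column 1', 'gene 1 | value 1.1' ],
    tab2 => [ 'ID column | column 1 | column2', 'gene 3 | value 3.1 | value 3.2' ],
);

# Dumper wants references, so pass \%ptables rather than %ptables.
my $dump = Dumper(\%ptables);
print $dump;

# Individual rows are reached as $hash{key}[index].
my $first_row_tab2 = $ptables{tab2}[1];
print "tab2, row 1: $first_row_tab2\n";
```

    Passing the hash itself instead of a reference would flatten it into a list, and Dumper would print each key and each array reference as a separate $VAR.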

    Update:  fixed typo (thanks johngg).


    s''(q.S:$/9=(T1';s;(..)(..);$..=substr+crypt($1,$2),2,3;eg;print$..$/
      Well, you are right, it is better to try before asking, but I thought maybe I had made some fundamental mistake in assigning data to the hash structure that one could spot just by looking at it.

      Concerning the @files and $hash_key initialization: it was a mistake. I should have checked it first, sorry.

      I'll try with Data::Dumper.
      Thanks for your comments.

      Cheers,
      Evgeniy
        Finally, here is a working version of the script.

        Input:

        Tab1.txt
        ID column | column 1
        gene 1 | value 1.1
        gene 2 | value 2.1
        gene 4 | value 4.1
        gene 8 | value 8.1

        Tab2.txt
        ID column | column 1 | column2
        gene 1 | value 1.1 | n.a.
        gene 3 | value 3.1 | value 3.2
        gene 4 | value 4.1 | value 4.2

        To run the script, pass the file names as arguments. The files should be in the same folder as the script.
        The table merging script is below:
        # Strict
        use strict;
        use warnings;

        # Libraries
        use Data::Dumper;

        # Variables definition
        my @filenames;

        # Read user-defined table filenames
        if (!$ARGV[1]) {
            die "Please provide at least 2 file names. Good luck!!!";
        }
        foreach my $element (@ARGV) {
            if ($element =~ /-help=/i) {
                print STDERR "Please provide at least 2 filenames\n";
                exit;
            }
            else {
                push(@filenames, $element);
            }
        }

        # Define master hash reference "$ptables"
        my $ptables;

        # Read files and add data to the HoA:
        foreach my $file (@filenames) {
            my @string_array;

            # Read input file and define arrays of strings
            open(my $fh, '<', $file) or die "Cannot open '$file': $!";
            while (<$fh>) {
                for my $chunk (split /\n/) {
                    push(@string_array, $chunk);
                }
            }
            close($fh);

            (my $hash_key = $file) =~ s/\.txt//;          # generate hash key
            $ptables->{$hash_key} = [ @string_array ];    # save all strings to the hash
            print Dumper(\$ptables);
        }

        # Globals
        my %output;
        my %ncolumns;
        my %values;
        my @tables = (sort keys %$ptables);                  # Get all table names

        # Main program

        # First pass -- parse each table to fetch all the IDs
        print "=== Pass 1 ===\n";
        foreach my $table (@tables) {
            my $ptab = $ptables->{$table};                   # Assign to table
            my @rows = split(/\s*\|\s*/, shift @$ptab);      # Get column headings
            shift @rows;                                     # Discard "ID column"
            my $ncols = @rows;                               # Find number of columns
            $ncolumns{$table} = $ncols;                      # Save # of columns
            print "Reading $table; $ncols col(s)...\n";      # Announce table name
            foreach my $line (@$ptab) {
                my ($id, @vals) = split(/\s*\|\s*/, $line);  # Get ID and values
                $output{$id} ||= [ ];                        # Placeholder for ID
                $values{$table}{$id} = [ @vals ];            # Save values for table/ID
            }
        }

        # Second pass -- process each ID, adding values from each table
        my @ids = (sort keys %output);
        print "=== Pass 2 ===\n";
        foreach my $id (@ids) {
            print "Processing ID $id\n";
            my $pout = $output{$id};                         # Get current ID list
            foreach my $table (@tables) {
                my $ncols   = $ncolumns{$table};             # Get number of columns
                my $pvalues = $values{$table}{$id};          # Get values for table/ID
                if (defined($pvalues)) {
                    push @$pout, @$pvalues;                  # Save values
                }
                else {
                    push @$pout, ( "n.a." ) x $ncols;        # Missing value = N/A
                }
            }
        }

        # Verify results
        print "=== Verify results ===\n";
        foreach my $id (@ids) {
            my $pvalues = $output{$id};
            printf "%12.12s | %s\n", $id, join(" | ", @$pvalues);
        }
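
        The padding step in the second pass relies on the list repetition operator x. Here is a reduced sketch of just that step for a single ID, using made-up data in place of the parsed files (the hash contents and the ID 'gene 2' are assumptions for illustration only):

```perl
use strict;
use warnings;

# Hypothetical column counts and values for two tables.
my %ncolumns = ( Tab1 => 1, Tab2 => 2 );
my %values   = (
    Tab1 => { 'gene 2' => [ 'value 2.1' ] },    # 'gene 2' appears only in Tab1
    Tab2 => { },
);

my @row;
foreach my $table (sort keys %ncolumns) {
    my $pvalues = $values{$table}{'gene 2'};
    if (defined $pvalues) {
        push @row, @$pvalues;                       # table has this ID
    }
    else {
        push @row, ('n.a.') x $ncolumns{$table};    # one "n.a." per column of this table
    }
}

my $line = join ' | ', 'gene 2', @row;
print "$line\n";
```

        The parentheses around 'n.a.' are what make x repeat a list; without them it would repeat the string and produce a single "n.a.n.a." value.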

        Thanks everybody for your help! Regards, Evgeniy