Re^3: Merging/Rearranging Tables

"Thank you a lot!"

You're welcome a lot!

"Will the following code work for me?"

Well, I'm tempted to ask -- "what happens when you try?". The best way to learn is, after all, by trying.

I do see that you're trying to initialize @files from an array reference, ['tab1.txt','tab2.txt'], which likely won't do what you're expecting: you'll get a single item in @files, which is itself a reference to an anonymous 2-item array.

Better to declare it like this:

my @files = ('tab1.txt','tab2.txt');

# Or, using the "quote-word" function "qw",
# which lets you omit the quotes and the comma:

my @files = qw( tab1.txt tab2.txt );
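To see the difference for yourself, here's a minimal sketch that counts the elements each way. The names @wrong and @right are just mine, for illustration:

```perl
use strict;
use warnings;

# Square brackets build an anonymous array and return ONE reference to it
my @wrong = ['tab1.txt', 'tab2.txt'];

# Parentheses give a plain list, so each filename becomes its own element
my @right = ('tab1.txt', 'tab2.txt');

printf "wrong: %d element(s), ref() of first: %s\n",
    scalar(@wrong), ref($wrong[0]);
printf "right: %d element(s), first is '%s'\n",
    scalar(@right), $right[0];
```

The first line should report 1 element whose ref() is ARRAY; the second should report 2 elements, the first being the plain string tab1.txt.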

Furthermore, you're declaring a variable $hash_key but never assigning anything to it; instead, you immediately try to perform a regex substitution on it with:

    my $hash_key =~ s/\.txt//; # generate hash key

That's why, when I run it with use strict; and use warnings; I get the warning:

Use of uninitialized value in substitution (s///) at merge.pl line 16.

So I'm assuming what you want instead is to assign the filename $file to $hash_key, and then perform the substitution on that copy to get the resulting hash key:

    (my $hash_key = $file) =~ s/\.txt//; # generate hash key
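As a sketch of a slightly more defensive variant (the sub name key_from_file is made up for this example): anchoring the pattern with $ means only a trailing ".txt" is stripped, so a name that happens to contain ".txt" in the middle keeps it. If you're on Perl 5.14 or later, the /r flag on s/// returns the modified copy instead of changing its target, which is another way to get the same result.

```perl
use strict;
use warnings;

# Hypothetical helper: derive a hash key from a filename
sub key_from_file {
    my ($file) = @_;
    # Copy $file into $key first, then run the substitution on the copy,
    # so the original filename is left untouched
    (my $key = $file) =~ s/\.txt$//;
    return $key;
}

for my $file (qw( tab1.txt tab2.txt notes.txt.txt )) {
    printf "%-14s => %s\n", $file, key_from_file($file);
}
```

Whether you want the anchor depends on your filenames; for tab1.txt and tab2.txt both forms behave the same.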

A final thought: make liberal use of Data::Dumper (add use Data::Dumper; at the top of the script) to see what a given data structure contains at any time. For example, to see the entire contents of %ptables after making each assignment:

$ptables{$hash_key} = [ @string_array ]; # save all strings to the hash
print Dumper(\%ptables);                 # use "\" to pass reference of hash
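Here's a small self-contained sketch of that, using made-up sample data in place of the lines read from your files. The two $Data::Dumper:: settings are optional; I find they make the dump easier to read:

```perl
use strict;
use warnings;
use Data::Dumper;

$Data::Dumper::Sortkeys = 1;   # print hash keys in sorted order
$Data::Dumper::Indent   = 1;   # use a more compact indentation style

# Made-up sample data standing in for the lines read from each file
my %ptables = (
    tab1 => [ 'ID column | column 1', 'gene 1 | value 1.1' ],
    tab2 => [ 'ID column | column 1 | column2', 'gene 3 | value 3.1 | value 3.2' ],
);

my $dump = Dumper(\%ptables);  # pass a reference, not the hash itself
print $dump;
```

Passing the hash itself (without the backslash) would flatten it into a list of keys and values, and Dumper would print each one as a separate $VAR, which is much harder to read.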

Update: fixed typo (thanks johngg).

s''(q.S:$/9=(T1';s;(..)(..);$..=substr+crypt($1,$2),2,3;eg;print$..$/


Re^4: Merging/Rearranging Tables by homeveg (Acolyte) on Feb 11, 2007 at 11:30 UTC
Well, you are right, it is better to try before asking, but I thought maybe I had made some fundamental mistake in assigning data to the hash structure that one could identify just by looking at it. As for the @files and $hash_key initialization, that was a mistake. I should have checked it first, sorry. I'll try Data::Dumper. Thanks for your comments.

Cheers, Evgeniy
Re^5: Merging/Rearranging Tables by homeveg (Acolyte) on Feb 11, 2007 at 12:05 UTC
Finally, a working version of the script.

Input:

Tab1.txt

ID column | column 1
gene 1 | value 1.1
gene 2 | value 2.1
gene 4 | value 4.1
gene 8 | value 8.1

Tab2.txt

ID column | column 1 | column2
gene 1 | value 1.1 | n.a.
gene 3 | value 3.1 | value 3.2
gene 4 | value 4.1 | value 4.2

To run the script, provide it with the file names as arguments. The files should be in the same folder as the script. The table-merging script is below:

# Strict
use strict;
use warnings;

# Libraries
use Data::Dumper;

# Variable definitions
my (@filenames, @strings, @text);

# Read user-defined table filenames
if (!$ARGV[1]) {
    die "Please provide at least 2 file names. Good luck!!!";
};
foreach my $element (@ARGV) {
    if ($element =~ /-help=/i) {
        print STDERR "Please provide at least 2 file names\n";
        exit;
    }
    else {
        push (@filenames, $element);
    }
}

# Define master hash "$ptables"
my $ptables;

# Read files and add data to the HoA:
foreach my $file (@filenames) {
    my @string_array;
    # Read input file and define arrays of strings
    open (FILE, "<$file") or die "$!";
    while (<FILE>) {
        for my $chank (split /\n/) {
            push (@string_array, $chank);
        }
    }
    close (FILE);
    (my $hash_key = $file) =~ s/\.txt//;        # generate hash key
    $ptables->{$hash_key} = [ @string_array ];  # save all strings to the hash
    print Dumper(\$ptables);
    undef @string_array;
}

# Globals
my %output;
my %ncolumns;
my %values;
my @tables = (sort keys %$ptables);                 # Get all table names

# Main program
# First pass -- parse each table to fetch all the IDs
print "=== Pass 1 ===\n";
foreach my $table (@tables) {
    my $ptab = $ptables->{$table};                  # Assign to table
    my @rows = split(/\s\|\s/, shift @$ptab);       # Get column headings
    shift @rows;                                    # Discard "ID column"
    my $ncols = @rows;                              # Find number of columns
    $ncolumns{$table} = $ncols;                     # Save # of columns
    print "Reading $table; $ncols col(s)...\n";     # Announce table name
    foreach my $line (@$ptab) {
        my ($id,@vals) = split(/\s\|\s/, $line);    # Get ID and values
        $output{$id} ||= [ ];                       # Placeholder for ID
        $values{$table}{$id} = [ @vals ];           # Save values for table/ID
    }
}

# Second pass -- process each ID, adding values from each table
my @ids = (sort keys %output);
print "=== Pass 2 ===\n";
foreach my $id (@ids) {
    print "Processing ID $id\n";
    my $pout = $output{$id};                        # Get current ID list
    foreach my $table (@tables) {
        my $ncols   = $ncolumns{$table};            # Get number of columns
        my $pvalues = $values{$table}{$id};         # Get values for table/ID
        if (defined($pvalues)) {
            push @$pout, @$pvalues;                 # Save values
        } else {
            push @$pout, ( "n.a." ) x $ncols;       # Missing value = N/A
        }
    }
}

# Verify results
print "=== Verify results ===\n";
foreach my $id (@ids) {
    my $pvalues = $output{$id};
    printf "%12.12s | %s\n", $id, join(" | ", @$pvalues);
}

Thanks everybody for your help!

Regards, Evgeniy
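For what it's worth, the file-reading step could probably be sketched a bit more compactly with a lexical filehandle, the three-argument form of open, and chomp. This is only a sketch: the sub name read_lines is hypothetical, and the demo reads from an in-memory string rather than a real file just so it runs anywhere.

```perl
use strict;
use warnings;

# Hypothetical helper: return all lines of a file with newlines removed
sub read_lines {
    my ($source) = @_;
    # Three-argument open keeps the mode separate from the filename
    open(my $fh, '<', $source) or die "Cannot open $source: $!";
    chomp(my @lines = <$fh>);   # read every line, then strip the newlines
    close($fh);
    return @lines;
}

# Passing a reference to a scalar makes open() read from that string
my $sample = "ID column | column 1\ngene 1 | value 1.1\n";
my @lines  = read_lines(\$sample);
print scalar(@lines), " line(s) read\n";
```

In a real script you would call read_lines($file) with each filename from @ARGV and store the result with $ptables->{$hash_key} = [ read_lines($file) ];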