Stack table columns with hash of arrays

robertkraus has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks, I do not program Perl very often, and if I do it is mainly on a line by line basis. I'm often running into issues when I have to manipulate data across columns. What I need currently is to stack columns of a table onto each other. My data looks like:

foo1    1    4    7
foo2    2    5    8
foo3    3    6    9

And I need it in

foo1    1
foo2    2
foo3    3
foo1    4
foo2    5
foo3    6
foo1    7
foo2    8
foo3    9

That is, in column one are the identifiers, and the numbers following from the 2nd column are its values. Obviously I have to read in the whole table and then print column one followed by its values in column 2, then print column 1 again, followed by its values in column 3, and so forth. After some searching it seems that I need to build a hash of arrays, and from that call the data from a loop? However, as I said, I usually do text manipulations line by line, and what I found so far was rather confusing. I hope someone can help me here. Cheers, Robert


Replies are listed 'Best First'.
Re: Stack table columns with hash of arrays by davido (Cardinal) on May 31, 2011 at 08:35 UTC
ikegami's solution is simple and to the point. It's really the most elegant solution I see, but it does rely on the fact that your input data has no values repeated among the table columns. In other words, foo1 has '1', and '1' appears nowhere else in the table. That may be a perfectly good assumption, and may work just fine for you, in which case, stop reading now. ;)

If the numeric values in the table are non-unique, the solution doesn't work, because the hash used to transform the table is keyed off of the table's numeric values. And as we know, hash keys are unique.

So, I came up with an alternate solution, which is far less elegant, but not constrained by unique tabular data. The numeric fields may be non-unique.

```perl
use strict;
use warnings;
use feature qw(say);

my %foos;

# Grab the table and put it into a hash of arrays.
while( my $line = <DATA> ) {
    chomp $line;
    my( $foo, @values ) = split /\s+/, $line;
    $foos{$foo} = \@values;
}

# Hang onto the foos as keys in sorted order.
my @foo_keys = sort keys %foos;

# Figure out how wide the table is (must be uniform across foos).
my $last_col = $#{ $foos{ $foo_keys[0] } };

foreach my $col ( 0 .. $last_col ) {
    print map { "$_\t$foos{$_}->[$col]\n" } @foo_keys;
}

__DATA__
foo1 1 4 7
foo2 2 5 8
foo3 3 6 9
```

My solution takes a shortcut which is also based on an assumption that may not be true. My solution relies on all rows having the same number of columns. If that were not the case, you would have to track how many columns each row has, and iterate accordingly. That adds a little complexity that I didn't think was needed. But I mention it because it would need to be dealt with if your rows were not of uniform length.

Dave
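For rows of unequal length, the bookkeeping Dave describes could look roughly like the sketch below. This is an illustration, not Dave's code: the helper name `stack_columns` is hypothetical, identifiers are assumed to be unique (as in the hash-based solution above), and rows are emitted in input order rather than sorted order.

```perl
use strict;
use warnings;

# Hypothetical helper: takes table lines, returns stacked "name<TAB>value" lines.
# Assumes each identifier appears on only one line.
sub stack_columns {
    my @lines = @_;
    my ( @names, %values_for );
    my $last_col = -1;

    for my $line (@lines) {
        my ( $name, @values ) = split ' ', $line;
        next unless defined $name;      # nothing on this line
        push @names, $name;             # remember input order
        $values_for{$name} = \@values;
        $last_col = $#values if $#values > $last_col;    # track the widest row
    }

    my @out;
    for my $col ( 0 .. $last_col ) {
        for my $name (@names) {
            my $row = $values_for{$name};
            next if $col > $#$row;      # this row has no value in this column
            push @out, "$name\t$row->[$col]";
        }
    }
    return @out;
}

# foo2 is one value short, so it is skipped in the last pass.
print "$_\n" for stack_columns( "foo1 1 4 7", "foo2 2 5", "foo3 3 6 9" );
```

Short rows are simply skipped once their values run out; whether that or a placeholder value is the right behaviour depends on the real data.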
Re^2: Stack table columns with hash of arrays by robertkraus (Novice) on May 31, 2011 at 09:05 UTC
Dear Dave! Great that you posted this. In the meantime I was adjusting the previous solution to my real data and found out that ikegami's solution indeed 'only' fits my example, which I made up to convey what I am after, but not my real data! Because yes, it ignores multiple occurrences of values. And it would sort and shuffle around not only the keys (which is what I need), but also the values (which is undesired). I was just about to prepare a post to explain what does and what doesn't work. But thanks to you both! That was a great help! Cheers, Robert
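To make the lost-values problem concrete, here is a small sketch of what happens when the hash is keyed by value and a value repeats. The helper name `invert_table` and the sample rows are made up for illustration; the logic follows the value-keyed approach discussed in this thread.

```perl
use strict;
use warnings;

# Hypothetical helper mirroring the value-keyed approach: value => identifier.
sub invert_table {
    my @lines = @_;
    my %name_for;
    for my $line (@lines) {
        my ( $name, @values ) = split ' ', $line;
        # A value seen before overwrites the earlier identifier.
        $name_for{$_} = $name for @values;
    }
    return \%name_for;
}

# The value 4 occurs in both rows: six values go in, five keys survive.
my $name_for = invert_table( "foo1 1 4 7", "foo2 2 4 8" );
print "$name_for->{$_}\t$_\n" for sort { $a <=> $b } keys %$name_for;
```

Only `foo2 4` is printed for the repeated value, and the output order follows the numeric sort of the values rather than the column layout of the input.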
Re^2: Stack table columns with hash of arrays by Anonymous Monk on May 31, 2011 at 12:05 UTC
Dear Dave, `my $last_col = $#{ $foos{ $foo_keys[0] } };` gives -1. Where is my mistake? Thanks!
Re^3: Stack table columns with hash of arrays by davido (Cardinal) on May 31, 2011 at 16:08 UTC
When you're reading in the file, skip non-tabular (such as blank) lines. Dave
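A likely explanation for the -1: a blank line splits into nothing, so the identifier is undefined and gets stored under the empty-string key with an empty value list. That key sorts first, so `$foo_keys[0]` points at an empty array, whose last index is -1. The sketch below adds the guard to the reading loop; the helper name `read_table` is hypothetical, and it uses `split ' '` rather than `split /\s+/` so that leading whitespace is ignored as well.

```perl
use strict;
use warnings;

# Hypothetical helper: builds the hash of arrays, ignoring blank lines.
sub read_table {
    my @lines = @_;
    my %foos;
    for my $line (@lines) {
        chomp $line;
        next unless $line =~ /\S/;    # skip empty or whitespace-only lines
        my ( $foo, @values ) = split ' ', $line;
        $foos{$foo} = \@values;
    }
    return \%foos;
}

my $foos     = read_table( "foo1 1 4 7\n", "\n", "foo2 2 5 8\n", "foo3 3 6 9\n" );
my @foo_keys = sort keys %$foos;
my $last_col = $#{ $foos->{ $foo_keys[0] } };
print "last column index: $last_col\n";    # 2 for rows with three values
```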
Re^4: Stack table columns with hash of arrays by Anonymous Monk on Jun 01, 2011 at 11:52 UTC
Re: Stack table columns with hash of arrays by jwkrahn (Abbot) on May 31, 2011 at 09:08 UTC
```
$ echo "foo1    1    4    7
foo2    2    5    8
foo3    3    6    9" | perl -e'
my @data;
while ( <> ) {
    push @data, [ split ]
    }
while ( @{ $data[ 0 ] } > 1 ) {
    print map "$_->[0]\t@{[ splice( @$_, 1, 1 ) ]}\n", @data
    }
'
foo1    1
foo2    2
foo3    3
foo1    4
foo2    5
foo3    6
foo1    7
foo2    8
foo3    9
```
Re^2: Stack table columns with hash of arrays by robertkraus (Novice) on May 31, 2011 at 09:27 UTC
Wow! This works pretty neatly, too! No need for a hash, and actually cool because this means there is no issue with sorting. It comes out as it goes in. Cool stuff!
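One thing to keep in mind with the one-liner above is that `splice` removes each value from the rows as it prints, so the data is used up afterwards. If the table is needed again later, the same array-of-arrays idea can index the columns instead. A minimal sketch, assuming every row is as wide as the first one; the helper name `stack_rows` is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: each row is an array ref [ name, value, value, ... ].
sub stack_rows {
    my @rows = @_;
    my @out;
    # Assumes every row has as many fields as the first row.
    my $width = @rows ? scalar @{ $rows[0] } : 0;
    for my $col ( 1 .. $width - 1 ) {
        for my $row (@rows) {
            push @out, "$row->[0]\t$row->[$col]";
        }
    }
    return @out;
}

my @data = ( [qw(foo1 1 4 7)], [qw(foo2 2 5 8)], [qw(foo3 3 6 9)] );
print "$_\n" for stack_rows(@data);
```

The rows come out in input order, as in the one-liner, and `@data` still holds all its values afterwards.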
Re: Stack table columns with hash of arrays by ikegami (Patriarch) on May 31, 2011 at 07:58 UTC
```perl
my %foos;
while (<>) {
    my ($foo, @bars) = split;
    $foos{$_} = $foo for @bars;
}

print("$foos{$_}\t$_\n")
    for sort { $a <=> $b } keys %foos;
```

Update: Fixed `keys @foos`.
Re^2: Stack table columns with hash of arrays by robertkraus (Novice) on May 31, 2011 at 08:19 UTC
Thanks a million! And thanks for the quick update fix, I was already trying to find the cause of the error that occurred by "keys @foos" instead of "keys %foos". I changed the print statement (to flip the columns) to `print("$foos{$_}\t$_\n")` Cheers, Robert
Re: Stack table columns with hash of arrays by Marshall (Canon) on May 31, 2011 at 09:49 UTC
Another solution for you.

```perl
#!/usr/bin/perl -w
use strict;

my @RowNames;
my @Columns;

while (<DATA>)
{
    my $i=0;
    my ($name, @row)= split;
    push @RowNames, $name;
    push @{$Columns[$i++]}, $_ for (@row);
}

foreach my $cref (@Columns)
{
    foreach my $name (@RowNames)
    {
        print "$name\t",shift @$cref,"\n";
    }
}

=prints
foo1    1
foo2    2
foo3    3
foo1    4
foo2    5
foo3    6
foo1    7
foo2    8
foo3    9
=cut

__DATA__
foo1 1 4 7
foo2 2 5 8
foo3 3 6 9
```