robertkraus has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks, I do not program Perl very often, and if I do it is mainly on a line by line basis. I'm often running into issues when I have to manipulate data across columns. What I need currently is to stack columns of a table onto each other. My data looks like:
foo1 1 4 7
foo2 2 5 8
foo3 3 6 9
And I need it in
foo1 1
foo2 2
foo3 3
foo1 4
foo2 5
foo3 6
foo1 7
foo2 8
foo3 9
That is, in column one are the identifiers, and the numbers following from the 2nd column are its values. Obviously I have to read in the whole table and then print column one followed by its values in column 2, then print column 1 again, followed by its values in column 3, and so forth. After some searching it seems that I need to build a hash of arrays, and from that call the data from a loop? However, as I said, I usually do text manipulations line by line, and what I found so far was rather confusing. I hope someone can help me here. Cheers, Robert

Re: Stack table columns with hash of arrays
by davido (Cardinal) on May 31, 2011 at 08:35 UTC

    ikegami's solution is simple and to the point. It's really the most elegant solution I see, but it does rely on the fact that your input data has no values repeated among the table columns. In other words, foo1 has '1', and '1' appears nowhere else in the table. That may be a perfectly good assumption, and may work just fine for you, in which case, stop reading now. ;)

    If the numeric values in the table are non-unique, the solution doesn't work, because the hash used to transform the table is keyed off of the table's numeric values. And as we know, hash keys are unique.
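
    A small illustration of that collision, as a sketch only. The input is made up so that the value 1 appears under both foo1 and foo2, and the helper name value_keyed is hypothetical; it simply builds the value-to-identifier hash the same way the value-keyed approach does.

```perl
use strict;
use warnings;

# Hypothetical helper: builds a value => identifier hash from table lines,
# the same way the value-keyed approach does.
sub value_keyed {
    my @lines = @_;
    my %foos;
    for my $line (@lines) {
        my ( $foo, @bars ) = split ' ', $line;
        $foos{$_} = $foo for @bars;    # a later row overwrites an earlier one
    }
    return \%foos;
}

# Made-up input: the value 1 occurs in two rows.
my $foos = value_keyed( 'foo1 1 4', 'foo2 1 5' );

# Four values went in, but only three keys survive, and 1 maps to foo2 only.
print "$foos->{$_}\t$_\n" for sort { $a <=> $b } keys %$foos;
```

    The pair "foo1 1" is silently lost, which is why unique values are a precondition for that approach.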

    So, I came up with an alternate solution, which is far less elegant, but not constrained by unique tabular data. The numeric fields may be non-unique.

    use strict;
    use warnings;
    use feature qw(say);

    my %foos;

    # Grab the table and put it into a hash of arrays.
    while( my $line = <DATA> ) {
        chomp $line;
        my( $foo, @values ) = split /\s+/, $line;
        $foos{$foo} = \@values;
    }

    # Hang onto the foos as keys in sorted order.
    my @foo_keys = sort keys %foos;

    # Figure out how wide the table is (must be uniform across foos).
    my $last_col = $#{ $foos{ $foo_keys[0] } };

    foreach my $col ( 0 .. $last_col ) {
        print map { "$_\t$foos{$_}->[$col]\n" } @foo_keys;
    }

    __DATA__
    foo1 1 4 7
    foo2 2 5 8
    foo3 3 6 9

    My solution takes a shortcut which is also based on an assumption that may not be true. My solution relies on all rows having the same number of columns. If that were not the case, you would have to track how many columns each row has, and iterate accordingly. That adds a little complexity that I didn't think was needed. But I mention it because it would need to be dealt with if your rows were not of uniform length.
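
    For rows of different lengths, one possible shape is sketched below. It is not the code above: it swaps the hash of arrays for an array of rows kept in input order, and the helper name stack_columns and the ragged sample input are assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: takes table lines, returns stacked "name\tvalue" strings.
sub stack_columns {
    my @lines = @_;

    # Keep rows in input order as [ name, value, value, ... ].
    my @rows = map { [ split ' ', $_ ] } @lines;

    # The widest row decides how many value columns to walk.
    my $last_col = 0;
    for my $row (@rows) {
        $last_col = $#$row if $#$row > $last_col;
    }

    my @out;
    for my $col ( 1 .. $last_col ) {
        for my $row (@rows) {
            next if $col > $#$row;    # this row has no value in that column
            push @out, "$row->[0]\t$row->[$col]";
        }
    }
    return @out;
}

# Made-up ragged input: three, two and one value per row.
print "$_\n" for stack_columns( 'foo1 1 4 7', 'foo2 2 5', 'foo3 3' );
```

    Short rows simply drop out of the later passes, so nothing is printed for a column a row does not have.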


    Dave

      Dear Dave! Great you posted this. In the meantime I was actually adjusting the previous solution to my real data and found out that ikegami's solution indeed 'only' fits my example, which I made up to convey what I am after, but not my real data! Because yes, it ignores multiple occurrences of values. And it would sort and shuffle around not only the keys (which is what I need), but also the values (which is undesired). I was just about to prepare a post to explain what does and what doesn't work. But thanks to you both! That was a great help! Cheers, Robert
      Dear Dave, the line my $last_col = $#{ $foos{ $foo_keys[0] } }; gives -1. Where am I making a mistake? Thanks!

        When you're reading in the file, skip non-tabular (such as blank) lines.
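
        A minimal sketch of such a filter, under the assumption that the -1 comes from a blank line: a line with no fields would be stored under an empty key with an empty value list, and that key sorts first. The helper name read_table and the sample lines are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: builds the hash of arrays from table lines,
# ignoring anything that is not "identifier value [value ...]".
sub read_table {
    my @lines = @_;
    my %foos;
    for my $line (@lines) {
        chomp $line;
        next unless $line =~ /\S/;    # blank or whitespace-only line
        my ( $foo, @values ) = split ' ', $line;
        next unless @values;          # identifier with no values
        $foos{$foo} = \@values;
    }
    return \%foos;
}

# Made-up input with a blank line and a whitespace-only line mixed in.
my $foos     = read_table( "foo1 1 4 7\n", "\n", "foo2 2 5 8\n", "   \n" );
my @foo_keys = sort keys %$foos;
my $last_col = $#{ $foos->{ $foo_keys[0] } };
print "last column index: $last_col\n";    # prints 2 for this input
```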


        Dave

Re: Stack table columns with hash of arrays
by jwkrahn (Abbot) on May 31, 2011 at 09:08 UTC
    $ echo "foo1 1 4 7
    foo2 2 5 8
    foo3 3 6 9" | perl -e'
    my @data;
    while ( <> ) {
        push @data, [ split ]
    }
    while ( @{ $data[ 0 ] } > 1 ) {
        print map "$_->[0]\t@{[ splice( @$_, 1, 1 ) ]}\n", @data
    }
    '
    foo1	1
    foo2	2
    foo3	3
    foo1	4
    foo2	5
    foo3	6
    foo1	7
    foo2	8
    foo3	9
      Wow! This works pretty neatly, too! No need for a hash, and actually cool because this means there is no issue with sorting. It comes out as it goes in. Cool stuff!
Re: Stack table columns with hash of arrays
by ikegami (Patriarch) on May 31, 2011 at 07:58 UTC
    my %foos;
    while (<>) {
        my ($foo, @bars) = split;
        $foos{$_} = $foo for @bars;
    }

    print("$foos{$_}\t$_\n")
        for sort { $a <=> $b } keys %foos;

    Update: Fixed keys @foos.

      Thanks a million! And thanks for the quick update fix, I was already trying to find the cause of the error that occurred by "keys @foos" instead of "keys %foos". I changed the print statement (to flip the columns) to
      print("$foos{$_}\t$_\n")
      Cheers, Robert
Re: Stack table columns with hash of arrays
by Marshall (Canon) on May 31, 2011 at 09:49 UTC
    Another solution for you.
    #!/usr/bin/perl -w
    use strict;

    my @RowNames;
    my @Columns;

    while (<DATA>)
    {
        my $i=0;
        my ($name, @row)= split;
        push @RowNames, $name;
        push @{$Columns[$i++]}, $_ for (@row);
    }

    foreach my $cref (@Columns)
    {
        foreach my $name (@RowNames)
        {
            print "$name\t",shift @$cref,"\n";
        }
    }

    =prints
    foo1	1
    foo2	2
    foo3	3
    foo1	4
    foo2	5
    foo3	6
    foo1	7
    foo2	8
    foo3	9
    =cut

    __DATA__
    foo1 1 4 7
    foo2 2 5 8
    foo3 3 6 9