limzz has asked for the wisdom of the Perl Monks concerning the following question:

Hi all, I'm working on a script that deals with data which has column as well as row titles. Currently I've got it all in a two dimensional array which was working for a while, but I'm realizing it would probably be better to make it into a hash since the column and row names are just begging me to be keys, and I'm finding myself needing to call specific rows and columns. I'm new to Perl and have been struggling to get this into the format I want. For example, here's the format of the data (tab delimited):

    Log  foo  bar  baz
    aaa  123  456  789
    bbb  987  654  321

So I want the data in a structure where, if I want, say, 789, I just use $hash{aaa}{baz}. The way I thought of to do it seems overly complicated. I won't go into detail, but it involves lots of arrays and looping. If anyone could guide me to the easiest way to do this, it would be greatly appreciated. Thanks in advance :)

EDIT: To avoid any confusion, there's a lot of data, so it can't be done manually. I can't split by column names either, too many columns, and the number of columns and names can change too.

Replies are listed 'Best First'.
Re: Making a two dimensional hash with first row and column as the keys
by toolic (Bishop) on Jul 06, 2011 at 14:43 UTC
    See perldsc, the Perl Data Structures Cookbook.
    use warnings;
    use strict;

    my $hdr  = <DATA>;
    my @cols = split /\s+/, $hdr;
    my %hash;    # pick a better name
    while (<DATA>) {
        my ($row, @vals) = split;
        for my $i (0 .. $#vals) {
            $hash{$row}{$cols[$i+1]} = $vals[$i];
        }
    }
    print "$hash{aaa}{baz}\n";

    __DATA__
    Log foo bar baz
    aaa 123 456 789
    bbb 987 654 321

      A slight variation on toolic's approach:

      perl -wMstrict -le
      "my $data = qq{Log foo bar baz \n}
                . qq{aaa 123 456 789 \n}
                . qq{bbb 987 654 321 \n}
                ;
       ;;
       open my $fh, '<', \$data or die qq{opening data: $!};
       ;;
       my (undef, @dim2_names) = split /\s+/, <$fh>;
       print qq{ @dim2_names};
       ;;
       my %hash_2d;
       while (defined(my $rec = <$fh>)) {
         my ($row, @vals) = split /\s+/, $rec;
         print qq{$row = @vals};
         @{ $hash_2d{$row} }{ @dim2_names } = @vals;
         }
       ;;
       use Data::Dumper;
       print Dumper \%hash_2d;
      "
       foo bar baz
      aaa = 123 456 789
      bbb = 987 654 321
      $VAR1 = {
                'bbb' => {
                           'bar' => '654',
                           'baz' => '321',
                           'foo' => '987'
                         },
                'aaa' => {
                           'bar' => '456',
                           'baz' => '789',
                           'foo' => '123'
                         }
              };
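The key line in the variation above is the hash slice, @{ $hash_2d{$row} }{ @dim2_names } = @vals, which fills a whole row in one assignment instead of looping over an index. A minimal sketch of how that slice behaves, using the sample values from the question (the variable names @names, @vals and @more are just illustrative):

```perl
use strict;
use warnings;

# Column names and one row of values, as in the sample data.
my @names = qw(foo bar baz);
my @vals  = (123, 456, 789);

my %hash_2d;

# Hash slice on the inner hash: assigns all three key/value pairs at once.
# $hash_2d{aaa} is autovivified as a hash reference.
@{ $hash_2d{aaa} }{@names} = @vals;

# Equivalent explicit loop, shown for a second row.
my @more = (987, 654, 321);
for my $i (0 .. $#names) {
    $hash_2d{bbb}{ $names[$i] } = $more[$i];
}

print "$hash_2d{aaa}{baz}\n";
print "$hash_2d{bbb}{foo}\n";

# A slice can also read several cells of one row in a single expression.
my @row = @{ $hash_2d{aaa} }{qw(foo baz)};
print "@row\n";
```

Both forms build the same structure; the slice is shorter, while the loop makes the index arithmetic explicit.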

      Are you a wizard? Thanks a lot, this is exactly what I needed :) I was actually closer than I thought, but my way was really wonky.

Re: Making a two dimensional hash with first row and column as the keys
by Anonymous Monk on Jul 06, 2011 at 14:41 UTC
Re: Making a two dimensional hash with first row and column as the keys
by jethro (Monsignor) on Jul 06, 2011 at 14:47 UTC

    Please do go into detail. Just post your code, we can tell you if it is complicated or not

      toolic's reply was what I needed. I promise what I was considering would hurt your eyes to look at heh...

Re: Making a two dimensional hash with first row and column as the keys
by Marshall (Canon) on Jul 07, 2011 at 05:29 UTC
    I like toolic's 2-D hash suggestion. However, sometimes I use just a 1-D hash. Depending upon what you are doing, this can lead to some simplifications, e.g. printing a HoH takes 2 foreach loops, a simple 1-D hash just takes one. On the other hand some operations may become more complicated. I'm just presenting another option that sometimes is a good choice. Mileage varies!
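For comparison, here is a rough sketch of what "2 foreach loops" means when printing the hash of hashes. The data is the sample from the question, typed in by hand rather than parsed, and @lines is just a helper name for this example:

```perl
use strict;
use warnings;

# Sample data in the 2-D (hash of hashes) layout from toolic's reply.
my %hash = (
    aaa => { foo => 123, bar => 456, baz => 789 },
    bbb => { foo => 987, bar => 654, baz => 321 },
);

# Outer loop walks the rows, inner loop walks the columns of each row.
my @lines;
for my $row (sort keys %hash) {
    for my $col (sort keys %{ $hash{$row} }) {
        push @lines, "$row $col $hash{$row}{$col}";
    }
}
print "$_\n" for @lines;
```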

    The relevant line to change in toolic's code is this one:

    #$hash{$row}{$cols[$i+1]} = $vals[$i];    #toolic's
    $hash{"$row,$cols[$i+1]"} = $vals[$i];    #1-D version
    Now,
    foreach (sort keys %hash) {
        print "$_ $hash{$_}\n";
    }
    __END__
    prints:
    aaa,bar 456
    aaa,baz 789
    aaa,foo 123
    bbb,bar 654
    bbb,baz 321
    bbb,foo 987
    The pitfall in this method is the separator: the character (or sequence of characters) used to join the row and column names must be one that cannot appear in any row or column header. You want to guarantee that joining any row and column yields a unique key. This is usually not a problem, and "," is often a good choice.
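A small sketch of that pitfall, and of one way around it. The header names 'a,b' and 'b,c' are made up for illustration. The second half uses Perl's built-in multi-dimensional emulation, where a list subscript such as $hash{$row, $col} is joined with the subscript separator $; ("\034" by default); the English module is used here only to give that variable the readable name $SUBSEP:

```perl
use strict;
use warnings;
use English qw(-no_match_vars);

# With "," as the joiner, two different (row, column) pairs collide
# if a header itself contains a comma (hypothetical header names).
my $key1 = join ',', 'a,b', 'c';
my $key2 = join ',', 'a', 'b,c';
print "collision: $key1\n" if $key1 eq $key2;

# Built-in alternative: a list subscript is joined with $SUBSEP ($;),
# a control character that is unlikely to appear in header text.
my %hash;
my ($row, $col) = ('aaa', 'baz');
$hash{$row, $col} = 789;

# Recover the row and column names from the stored key.
my ($key) = keys %hash;
my ($r, $c) = split /\Q$SUBSEP\E/, $key;
print join(' ', $r, $c, $hash{$row, $col}), "\n";
```

This still assumes the headers never contain the separator itself, so it narrows the problem rather than removing it.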

    Have fun!