sophix has asked for the wisdom of the Perl Monks concerning the following question:

Hi all,

I have a hash of the following format

$VAR1 = {
    'A2M'    => [ 'C972Y', 'A423W' ],
    'A4GALT' => [ 'W261X', 'P251L', 'P251S', 'A219V' ]
};

What I would like to do is to count the values for each key and also to note the number of values that contain an X. Thus, I would like to have a table like this:

gene    total   x
A2M     2       0
A4GALT  4       1

The code I used to load the file into the hash is

my %hash;
while (<INFILE>) {
    chomp;
    my $line = $_;
    my ($key, @val) = split /\t/, $line, 2;
    push @{ $hash{$key} }, @val;
}

And the script that I thought would work for the second part, but did not, is as follows:

my $totalcount = 0;
my $xcount = 0;
foreach my $k (sort keys %hash) {
    print "Current key is\t" . $k . "\n";
    my @values = $hash{$k};
    foreach $i (@values){
        $xcount++ if $i =~ /(X)/;
        $totalcount++;
    }
    print $k . "\t" . $totalcount . "\t" . $xcount . "\n";
}

I would appreciate any help. When I checked the values array, I noticed that it does not hold what I expected, but I could not pinpoint exactly what is wrong with it.

Thank you.

Re: Count values in a hash
by toolic (Bishop) on Jan 11, 2011 at 17:49 UTC
    You need to dereference the array.
    use warnings;
    use strict;

    my %hash = (
        'A2M'    => [ 'C972Y', 'A423W' ],
        'A4GALT' => [ 'W261X', 'P251L', 'P251S', 'A219V' ]
    );

    for my $k (sort keys %hash) {
        print "Current key is\t" . $k . "\n";
        my $totalcount = @{ $hash{$k} };
        my $xcount = grep { /X/ } @{ $hash{$k} };
        print $k . "\t" . $totalcount . "\t" . $xcount . "\n";
    }

    __END__
    Current key is  A2M
    A2M     2       0
    Current key is  A4GALT
    A4GALT  4       1
    See also: grep and perldsc

      Thank you very much, toolic.

      Does @{ $hash{$k} } work as a reference? What does @{} do?
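
As a minimal sketch of what the dereference does, using the sample data from the question: $hash{$k} holds a reference to an array, and @{ ... } gives back the array that the reference points to. The variable names $ref, @wrong and @right are illustrative only and do not appear in the thread.

```perl
use strict;
use warnings;

# Sample data copied from the question.
my %hash = (
    'A2M'    => [ 'C972Y', 'A423W' ],
    'A4GALT' => [ 'W261X', 'P251L', 'P251S', 'A219V' ],
);

# $hash{'A4GALT'} is a single scalar: a reference to an anonymous array.
my $ref = $hash{'A4GALT'};

# Assigning that scalar to an array yields a one-element array
# whose only element is the reference itself.
my @wrong = $hash{'A4GALT'};

# @{ ... } dereferences: it yields the elements of the referenced array.
my @right = @{ $hash{'A4GALT'} };

print "ref type:     ", ref($ref), "\n";              # ARRAY
print "wrong count:  ", scalar(@wrong), "\n";         # 1
print "right count:  ", scalar(@right), "\n";         # 4
print "second value: ", $hash{'A4GALT'}->[1], "\n";   # P251L
```

This is why the loop in the question ran once per key: @values contained one element, the array reference, rather than the individual strings.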

Re: Count values in a hash
by rir (Vicar) on Jan 11, 2011 at 18:29 UTC
    Your line
    my ($key, @val) = split /\t/, $line, 2;
    is erroneous; it is creating an array with one string value containing all of $line after the first tab.

    Be well,
    rir
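
The effect of the limit argument can be sketched as follows. The input line here is hypothetical and assumes that a single line carries several tab-separated variants; if each line of the real file holds just one gene and one variant, the limit of 2 does no harm.

```perl
use strict;
use warnings;

# Hypothetical input line: a gene name followed by tab-separated variants.
my $line = "A4GALT\tW261X\tP251L\tP251S\tA219V";

# With a limit of 2, split returns at most two fields, so @limited
# receives one string that still contains the remaining tabs.
my ($key1, @limited) = split /\t/, $line, 2;

# Without the limit, every variant becomes its own element.
my ($key2, @all) = split /\t/, $line;

print scalar(@limited), "\n";   # 1
print scalar(@all), "\n";       # 4
```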

Re: Count values in a hash
by jwkrahn (Abbot) on Jan 11, 2011 at 21:42 UTC
    for my $key ( sort keys %hash ) {
        print "Current key is\t$key\n";
        # tr/X// counts every X character in the joined values
        my $xcount = "@{ $hash{ $key } }" =~ tr/X//;
        my $totalcount = @{ $hash{ $key } };
        print "$key\t$totalcount\t$xcount\n";
    }
Re: Count values in a hash
by ack (Deacon) on Jan 11, 2011 at 21:53 UTC

    Heed the other responders' replies. As they noted, the failure to dereference your hash and array is the source of the difficulty.

    Here is a quick, short piece of code that does what I believe you're looking to do. I include it just so that you can see one way to do it. Of course, as always in Perl, TIMTOWTDI.

    Cheers.

    #!/usr/bin/perl
    use strict;
    use warnings;

    my $VAR1 = {
        'A2M'    => [ 'C972Y', 'A423W' ],
        'A4GALT' => [ 'W261X', 'P251L', 'P251S', 'A219V' ]
    };

    my $numKeys = scalar(keys %{$VAR1});

    print "gene total x\n";
    foreach my $key (keys %{$VAR1}){
        my @values = @{$VAR1->{$key}};
        my $numValues = scalar(@values);
        my $numXs = 0;
        $numXs += ($_ =~ /X/g)?1:0 foreach(@values);
        print "$key $numValues $numXs\n";
    }
    exit(0);

    The resulting output is:

    gene total x
    A2M 2 0
    A4GALT 4 1

    Which is what I believe you are after.

    The only difficulty is that the pattern match, $_ =~ /X/g, is not very general. It counts each value at most once, so it wouldn't see multiple occurrences of 'X' in a value (you, the OP, didn't specify what to do if such an occurrence presented itself), and because the match is case-sensitive it wouldn't count a lower-case 'x' (you didn't specify whether that mattered). But I figured either you'd already know how to handle those situations or they would be good explorations of Perl for you.
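
The two limitations mentioned above (a value is counted once however many X's it contains, and a lower-case x is not matched) could be handled along these lines. This is only a sketch: the values below are hypothetical, chosen to show the edge cases, and the variable names are not from the thread.

```perl
use strict;
use warnings;

# Hypothetical values chosen to show the edge cases; not from the original data.
my @values = ( 'W261X', 'X12X', 'p30x', 'A219V' );

# Number of values containing at least one upper-case X.
my $with_upper = grep { /X/ } @values;

# Number of values containing an X in either case.
my $with_any = grep { /X/i } @values;

# Total number of X or x characters across all values.
# tr/Xx// with an empty replacement list only counts; it does not modify $_.
my $occurrences = 0;
$occurrences += tr/Xx// for @values;

print "$with_upper $with_any $occurrences\n";   # 2 3 4
```

Which of the three counts is the right one depends on what the "x" column is meant to report, which the original question leaves open.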

    Also note that I just initialize the hash with the example that you gave rather than reading it in from a file because I felt that it would just obscure the main body.

    ack Albuquerque, NM
Re: Count values in a hash
by suhailck (Friar) on Jan 12, 2011 at 07:55 UTC
    use strict;
    use warnings;

    my %hash=(
        'A2M'    => [ 'C972Y', 'A423W' ],
        'A4GALT' => [ 'W261X', 'P251L', 'P251S', 'A219V' ]
    );

    print map {
        my $count=0;
        my $x=0;
        map { $count++;$x++ if /X/ } @{$hash{$_}};
        join "\t",$_,$count,$x,"\n";
    } keys %hash;