AWallBuilder has asked for the wisdom of the Perl Monks concerning the following question:

Hello all,

I have four hashes that count certain hits for many contigs, and I would like to print the combined counts as a table. For example, given $Hit1{$contigA}=4, $Hit2{$contigA}=5 and $Hit4{$contigA}=1, I would like the following output:

contigA 4 5 0 1
contigB 10 1 1 0
etc.

Looking for inspiration/help, I found this subroutine, zup. However, I am slightly confused by zup and can't see how to extend it to hashes. I would also have to add a check so that if a count doesn't exist for a contig/Hit combination, it is set to zero.

Any help is appreciated.

##zup
    #!/usr/bin/perl
    use strict;
    use warnings;
    no warnings qw /syntax/;

    sub zup {
        join "\n" => map {join " " => map {shift @$_} @_} @{$_ [0]}
    }

    my @array1 = qw /ab bc cd de/;
    my @array2 = qw /cc dd ee gg/;
    my @array3 = qw /12 34 56 78/;

    print zup \(@array1, @array2, @array3);
    print "\n";

##my script so far
    use strict;
    use warnings;

    my $infile=$ARGV[0];
    open (IN,$infile) or die "cannot open $infile";

    my %count_Bact;
    my %count_undef;
    my %count_ProphVir;
    my %count_Euk;

    while (my $line=<IN>) {
        chomp $line;
        next if ($line =~/^#/) ;
        my ($geneid,$cons_flag)=split(/\t/,$line);
        my @geneid_columns=split(/_/,$geneid);
        my $contig=join("_",$geneid_columns[0],$geneid_columns[1],$geneid_columns[2],$geneid_columns[3]);
        if ($cons_flag eq "Bact"){
            $count_Bact{$contig} ++;
        }
        if ($cons_flag eq "undef"){
            $count_undef{$contig} ++;
        }
        if ($cons_flag eq "Proph_Vir" || $cons_flag eq "Vir" || $cons_flag eq "Proph"){
            $count_ProphVir{$contig} ++;
        }
        if ($cons_flag eq "Euk"){
            $count_Euk{$Euk} ++ ;
        }
    }
    close (IN);

Re: print multiple hashes as multiple columns
by johngg (Canon) on Jun 29, 2012 at 22:54 UTC

    I think it would make things easier to have a single HoH keyed by your "flag" values with sub-hashes of "contig" and count key/value pairs. This would facilitate accessing everything via keys rather than having to pass separate hashes into a subroutine. It might also look a bit neater if you use printf to make columns line up. Something along these lines perhaps?

        knoppix@Microknoppix:~$ perl -Mstrict -Mwarnings -E '
        > my %counts = (
        >     Bact      => { A => 3, B => 7, C => 6 },
        >     Euk       => { B => 5, C => 9 },
        >     Proph_Vir => { A => 8, C => 4, D => 2 },
        > );
        > my @sortedKeys = do {
        >     my %seen;
        >     sort grep { not $seen{ $_ } ++ }
        >         map { keys %{ $counts{ $_ } } }
        >         keys %counts;
        > };
        >
        > printf qq{%-10s%4s%4s%4s%4s\n}, q{}, @sortedKeys;
        > foreach my $flag ( sort keys %counts )
        > {
        >     printf qq{%-10s%4s%4s%4s%4s\n},
        >         $flag,
        >         map { exists $counts{ $flag }->{ $_ }
        >                   ? $counts{ $flag }->{ $_ }
        >                   : 0
        >             } @sortedKeys;
        > }'
                     A   B   C   D
        Bact         3   7   6   0
        Euk          0   5   9   0
        Proph_Vir    8   0   4   2
        knoppix@Microknoppix:~$
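
The example above starts from a ready-made %counts. As a rough sketch of how the same hash of hashes might be filled while reading the input, assuming the tab-separated "geneid, flag" layout and the four-field contig name used in the original script: the names count_hits and %column_for are hypothetical and not part of either post.

```perl
use strict;
use warnings;

# Hypothetical lookup: which column each raw flag is counted under.
# Vir and Proph are folded into Proph_Vir, as in the original script.
my %column_for = (
    'Bact'      => 'Bact',
    'undef'     => 'undef',
    'Proph_Vir' => 'Proph_Vir',
    'Vir'       => 'Proph_Vir',
    'Proph'     => 'Proph_Vir',
    'Euk'       => 'Euk',
);

# Build $counts->{$column}{$contig} from "geneid<TAB>flag" lines.
sub count_hits {
    my @lines = @_;
    my %counts;
    for my $line (@lines) {
        next if $line =~ /^#/;                      # skip comment lines
        chomp $line;
        my ( $geneid, $flag ) = split /\t/, $line;
        next unless defined $flag;                  # skip malformed lines
        my $column = $column_for{$flag} or next;    # ignore unknown flags
        # Assumption: the contig name is the first four "_"-separated
        # fields of the gene id, as in the original script.
        my $contig = join '_', ( split /_/, $geneid )[ 0 .. 3 ];
        $counts{$column}{$contig}++;
    }
    return \%counts;
}

my $counts = count_hits(
    "# comment\n",
    "c_1_2_3_g1\tBact\n",
    "c_1_2_3_g2\tVir\n",
    "c_1_2_3_g3\tProph\n",
);
print "$_: $counts->{$_}{c_1_2_3}\n" for sort keys %$counts;
```

Keeping the flag-to-column mapping in one hash means a new flag only needs one new entry rather than another if block.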

    A minor niggle with your code; why chomp comment lines and then discard them? It would make more sense to swap the two lines so you only chomp data lines.
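
A small sketch of that reordering, using an in-memory filehandle so it runs without an input file. The helper name data_lines is made up for illustration.

```perl
use strict;
use warnings;

# Return only the data lines, chomped. Comment lines are discarded
# before any other work is done on them.
sub data_lines {
    my ($fh) = @_;
    my @data;
    while ( my $line = <$fh> ) {
        next if $line =~ /^#/;    # discard comments first
        chomp $line;              # only data lines get chomped
        push @data, $line;
    }
    return @data;
}

# Demonstration on an in-memory filehandle rather than a real file.
my $input = "# header\ngene_1\tBact\n# note\ngene_2\tEuk\n";
open my $fh, '<', \$input or die "cannot open in-memory file: $!";
print "$_\n" for data_lines($fh);
close $fh;
```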

    I hope this is helpful.

    Cheers,

    JohnGG

Re: print multiple hashes as multiple columns
by dulwar (Monk) on Jun 29, 2012 at 23:17 UTC

    $Euk is never declared or initialized in your code before being used in:

        $count_Euk{$Euk} ++ ;

    but I assume it was supposed to be:

        $count_Euk{$contig} ++ ;

    Then the following, slightly modified script should do roughly what you want:

        use strict;
        use warnings;

        my $infile=$ARGV[0];
        open (IN,$infile) or die "cannot open $infile";

        my %count_Bact;
        my %count_undef;
        my %count_ProphVir;
        my %count_Euk;
        my %processed;

        while (my $line=<IN>) {
            chomp $line;
            next if ($line =~/^#/) ;
            my ($geneid,$cons_flag)=split(/\t/,$line);
            my @geneid_columns=split(/_/,$geneid);
            my $contig=join("_",$geneid_columns[0],$geneid_columns[1],$geneid_columns[2],$geneid_columns[3]);
            if ($cons_flag eq "Bact"){
                $count_Bact{$contig} ++;
            }
            if ($cons_flag eq "undef"){
                $count_undef{$contig} ++;
            }
            if ($cons_flag eq "Proph_Vir" || $cons_flag eq "Vir" || $cons_flag eq "Proph"){
                $count_ProphVir{$contig} ++;
            }
            if ($cons_flag eq "Euk"){
                $count_Euk{$contig} ++ ;
            }
            $processed{$contig}++;
        }
        close (IN);

        my @counts = (\%count_Bact, \%count_undef, \%count_ProphVir, \%count_Euk);
        for my $contig (keys %processed) {
            print join (' ', $contig, map { $_->{$contig} || 0} @counts), "\n";
        }
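
The final loop of that script can be pulled out into a small helper, sketched here under a few assumptions: the name table_rows and the sample hashes are illustrative only, the contigs are sorted so the output order is stable (keys returns them in no particular order), and the defined-or operator // needs Perl 5.10 or later. For counts, || 0 from the script above behaves the same way.

```perl
use strict;
use warnings;

# Format one table row per contig; a missing count is printed as 0.
# $hashes is an array ref of per-flag count hashes, in column order.
sub table_rows {
    my ( $contigs, $hashes ) = @_;
    return map {
        my $contig = $_;
        join ' ', $contig, map { $_->{$contig} // 0 } @$hashes;
    } sort @$contigs;
}

# Sample data taken from the example in the question.
my %hit1 = ( contigA => 4, contigB => 10 );
my %hit2 = ( contigA => 5, contigB => 1 );
my %hit3 = ( contigB => 1 );
my %hit4 = ( contigA => 1 );

print "$_\n"
    for table_rows( [ 'contigB', 'contigA' ], [ \%hit1, \%hit2, \%hit3, \%hit4 ] );
```

Reading a missing key this way does not add it to the hash, so the count hashes are left unchanged.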
      Thank you, this (well, both posts) was very helpful.