I have an UPDATED data file which has only "W" and "M" entries. As in the example, I want to count the number of A rows that have "M" appearing at least once under each unique column name. I want to sum over rows, not columns: as long as "M" appears at least once in a row under a given column name, I count that row as 1 for that name.

Gname  G1  G1  G1  G1  G2  G2  G3
A      W   W   M   W   W   W   M
A      W   W   W   W   W   W   W
A      W   W   W   W   W   W   W
B      W   W   W   W   W   M   M
B      M   W   W   W   W   M   M
C      M   M   M   W   W   W   W
C      M   W   W   M   M   W   W

The output should be:

Gname  G1  G2  G3
A      1   0   1
B      1   2   2
C      2   1   0

I have written the following code to write the header row, but I am confused about how to start counting over blocks of columns the way I want. Can anyone help?

#!/usr/bin/perl -w
if (@ARGV != 1){
    print "USAGE: ./parse-counts.pl file\n";
    exit(-1);
}
$mutfile = $ARGV[0];
%hash = ();
open(INPUTR,"<$mutfile") || die "Can't open \$mutfile for reading. \n";
while($line=<INPUTR>){
    chomp $line;
    @toks = split(/\t/,$line);
    if ($toks[0] =~ /^Gname/){
        $k = 0;
        # loop over the header row to get the unique "Gname"s
        @header = split(/\t/,$line);
        for $j (1..@toks-2){
            $i = $j+1;
            if ($header[$i] ne $header[$j]){
                $k++;
                $name[$k] = $header[$j];
            }
        }
        for $i (0..$k){
            $hash{$toks[0]}{$name[$k]} = $name[$k];
        }
    } else {
        $k = 0;
        for $j (1..@toks-2){
            $i = $j+1;
            if ($header[$i] ne $header[$j]){
                $k++;
                $hash{$toks[0]}{$name[$k]} = 0;
                if ($toks[$j] =~ /M/){
                    $hash{$toks[0]}{$name[$k]} = 1;
                }
            }
        }
    }
}
close(INPUTR);
$outdata = $mutfile;
$outdata =~ /(.+).txt/;
$outdata = $1."-COUNTS.txt";
open(OUTD,">$outdata");
foreach $idname (sort keys %hash){
    if ($idname =~ /^Gname/){
        print OUTD $idname;
        foreach $gid (sort keys %{$hash{$idname}}){
            print OUTD "\t".$hash{$idname}{$gid};
        }
        print OUTD "\n";
    }
}
foreach $idname (sort keys %hash){
    if ($idname !~ /^Gname/){
        print OUTD $idname;
        foreach $gid (sort keys %{$hash{$idname}}){
            print OUTD "\t".$hash{$idname}{$gid};
        }
        print OUTD "\n";
    }
}
close(OUTD);
print "Printing $outdata file done.\n";
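One possible way to do the counting, offered as a sketch rather than a definitive fix: as far as I can tell, the loop above only tests the cell at each block boundary and assigns 1 instead of adding across rows. The version below instead records, for each row, which column names have at least one "M", then adds 1 per such name to that row label's total. The helper name `count_m_by_group` is made up for illustration, and the sketch assumes every data row has the same number of fields as the header and needs Perl 5.10 or later for `//`. It splits on any whitespace, so it accepts the tab-separated file as well as the space-separated sample.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: takes the lines of the data file (header first)
# and returns the lines of the summary table.
sub count_m_by_group {
    my @lines = @_;

    # Header row: keep each column's group name, plus the order in
    # which distinct group names first appear.
    my ($label, @groups) = split ' ', shift @lines;
    my (%seen_group, @group_order);
    for my $g (@groups) {
        push @group_order, $g unless $seen_group{$g}++;
    }

    my (%count, %seen_row, @row_order);
    for my $line (@lines) {
        next unless $line =~ /\S/;    # skip blank lines
        my ($name, @cells) = split ' ', $line;
        push @row_order, $name unless $seen_row{$name}++;

        # Which groups contain at least one "M" in this row?
        my %has_m;
        for my $i (0 .. $#cells) {
            next unless defined $groups[$i];    # ignore extra fields
            $has_m{ $groups[$i] } = 1 if $cells[$i] eq 'M';
        }

        # Each row adds at most 1 to a group's count.
        $count{$name}{$_}++ for keys %has_m;
    }

    # Build the output: one header line, then one line per row label.
    my @out = ( join("\t", $label, @group_order) );
    for my $name (@row_order) {
        my @counts = map { $count{$name}{$_} // 0 } @group_order;
        push @out, join("\t", $name, @counts);
    }
    return @out;
}

# Demonstration on the sample data from the post.
my @sample = split /\n/, <<'END';
Gname G1 G1 G1 G1 G2 G2 G3
A W W M W W W M
A W W W W W W W
A W W W W W W W
B W W W W W M M
B M W W W W M M
C M M M W W W W
C M W W M M W W
END

print "$_\n" for count_m_by_group(@sample);
```

On the sample this prints the header `Gname G1 G2 G3` followed by `A 1 0 1`, `B 1 2 2` and `C 2 1 0` (tab-separated), which matches the desired output. To run it on a real file, the sample lines could be replaced by the lines read from the file handle. Because the lookup goes through the header by column index, columns sharing a name do not have to be adjacent.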

In reply to how to sum over rows based on column headings in perl by angerusso
