Re^4: how to sum over rows based on column headings in perl

Actually, I think I replied too fast! Obviously I don't understand the map function the way you used it.

Could you explain your usage of map? For example, I changed "M" to "MUTS" everywhere you used it in the code, and also changed "M" to "MUTS" in the data file, and now the results are screwed up: I don't get the expected result.

map { $_ eq $label ? 'MUTS' : '_' } @headers[1..$#headers];

My modified code gives the wrong results:

#!/usr/bin/perl

use strict;
use warnings;
use autodie;

if (@ARGV != 1){
    print "USAGE: ./parse-counts.pl file\n";
    exit(1);
}

my $mutfile = $ARGV[0];
open(INPUTR,"<$mutfile") or die "Can't open \$mutfile for reading. \n";

my (%counts, %unique, %masks);
my ($headname, @unique) = grep !$unique{$_}++, my @headers = split /\t/, <INPUTR>;

# the basic syntax is @out = map { CODE } @in;

for my $label ( @unique )
{
    $masks{$label} = join "\t",
    map { $_ eq $label ? 'MUTS' : '_' } @headers[1..$#headers];
}

my $line;

while($line=<INPUTR>)
#while(<DATA>)
{
    chomp $line;
    $line =~ s/\t/\t/g; # for uniform spacing
    my ($name, $letters) = split /\t/, $line, 2;
    $counts{$name}{$_} += ($masks{$_} | $letters) =~ /MUTS/ for @unique;
    print $name."\n";
    print $letters."\n";
}

print "$headname @unique\n";
print "$_ @{ $counts{$_} }{@unique}\n" for sort keys %counts;


The output produced is:
Gname G1 G2 G3

A 3 2 0
B 2 1 1
C 2 0 0

The modified data file is:
Gname   G1      G1      G1      G1      G2      G2      G3
A       W       W       MUTS    W       W       W       MUTS
A       W       W       W       W       W       W       W
A       W       W       W       W       W       W       W
B       W       W       W       W       W       MUTS    MUTS
B       MUTS    W       W       W       W       MUTS    MUTS
C       MUTS    MUTS    MUTS    W       W       W       W
C       MUTS    W       W       MUTS    MUTS    W       W
Re^5: how to sum over rows based on column headings in perl by Anonymous Monk on Jul 31, 2015 at 19:29 UTC
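No, the heart of the algorithm is the or operator (the bitwise string |), which combines the mask and the data row character by character. Changing M to MUTS breaks the alignment that allows the or operator to work properly: W is one character wide and MUTS is four, so the columns of the mask and the row no longer line up. Print out both sides of the or operator to see the misalignment.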
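To make the alignment point concrete, here is a minimal sketch, not the original poster's script. The helper `make_mask` and the single hard-coded row are assumptions made up for illustration; the headings and markers are taken from the data file in the thread. It builds the mask for G3 with the same map expression and ors it with a row that does have the marker in the G3 column.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Column headings modelled on the thread's data file (first column dropped).
my @headers = qw(G1 G1 G1 G1 G2 G2 G3);

# Hypothetical helper. map yields one output element per input element:
# $token where the heading equals $label, '_' everywhere else.
sub make_mask {
    my ($label, $token) = @_;
    return join "\t", map { $_ eq $label ? $token : '_' } @headers;
}

# One-character marker: mask and row have the same layout, so the string
# "or" compares column with column. 'M' | 'M' is 'M', while 'M' | 'W'
# and '_' | anything used here give '_'.
my $mask_m = make_mask('G3', 'M');
my $row_m  = join "\t", qw(W W M W W W M);
my $hit_m  = ($mask_m | $row_m) =~ /M/ ? 1 : 0;

# Four-character marker: the row mixes 1- and 4-character fields, so the
# characters of mask and row no longer line up.
my $mask_muts = make_mask('G3', 'MUTS');
my $row_muts  = join "\t", qw(W W MUTS W W W MUTS);
my $hit_muts  = ($mask_muts | $row_muts) =~ /MUTS/ ? 1 : 0;

printf "M:    hit=%d, mask length %d, row length %d\n",
    $hit_m, length $mask_m, length $row_m;
printf "MUTS: hit=%d, mask length %d, row length %d\n",
    $hit_muts, length $mask_muts, length $row_muts;
# prints:
# M:    hit=1, mask length 13, row length 13
# MUTS: hit=0, mask length 16, row length 19
```

The MUTS row really does have the marker in the G3 column, yet the or misses it, because the mask's MUTS lands on different characters of the row.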
Re^6: how to sum over rows based on column headings in perl by angerusso (Novice) on Jul 31, 2015 at 19:58 UTC
Hmmm ... I am not very comfortable with interpreting $_. I don't know how to print $masks{$_} to understand what it is. That is my problem. I know I am dumb!

print $name."\n";
print $letters."\n";
print $masks{$_}."\n";

which gives the obvious error:

A
W W M W W W M
Use of uninitialized value $_ in hash element at ./parse-counts.pl line 37, <INPUTR> line 2.
Use of uninitialized value within %masks in concatenation (.) or string at ./parse-counts.pl line 37, <INPUTR> line 2.
Re^7: how to sum over rows based on column headings in perl by Anonymous Monk on Jul 31, 2015 at 20:09 UTC
Replace

$counts{$name}{$_} += ($masks{$_} | $letters) =~ /M/ for @unique;

with

for my $label (@unique) {
    print "$masks{$label} | $letters\n";
    $counts{$name}{$label} += ($masks{$label} | $letters) =~ /M/;
}

This will allow you to see what you are or'ing.
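If the marker has to be longer than one character, one option is to drop the mask trick and compare fields column by column, which does not depend on alignment at all. The following is only a sketch of that alternative, not the mask-based method from this thread. The inline sample rows, the `$token` variable and the `%hit` hash are assumptions for illustration. Like the mask version, it adds at most 1 per group per row.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $token = 'MUTS';    # assumed marker; its length does not matter here

# Inline sample modelled on the thread's data file, converted to tabs.
my @lines = map { join("\t", split ' ', $_) } (
    'Gname G1 G1 G1 G1 G2 G2 G3',
    'A W W MUTS W W W MUTS',
    'A W W W W W W W',
    'B W W W W W MUTS MUTS',
    'C MUTS MUTS MUTS W W W W',
);

my ($headname, @headers) = split /\t/, shift @lines;

# Unique group labels in first-seen order.
my %seen;
my @unique = grep { !$seen{$_}++ } @headers;

my %counts;
for my $line (@lines) {
    my ($name, @fields) = split /\t/, $line;

    # Record which groups contain the marker somewhere in this row.
    my %hit;
    for my $i (0 .. $#fields) {
        $hit{ $headers[$i] } = 1 if $fields[$i] eq $token;
    }

    # Add 1 for each group that was hit in this row, 0 otherwise.
    $counts{$name}{$_} += $hit{$_} ? 1 : 0 for @unique;
}

print "$headname @unique\n";
print "$_ @{ $counts{$_} }{@unique}\n" for sort keys %counts;
# prints:
# Gname G1 G2 G3
# A 1 0 1
# B 0 1 1
# C 1 0 0
```

When reading from a real file, chomp the header line before splitting it, otherwise the last heading keeps its trailing newline.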