in reply to how to sum over rows based on column headings in perl

#!/usr/bin/perl # http://perlmonks.org/?node_id=1136784 use strict; use warnings; my (%counts, %unique, %masks); my ($headname, @unique) = grep !$unique{$_}++, my @headers = split ' ' +, <DATA>; for my $label ( @unique ) { $masks{$label} = join ' ', map { $_ eq $label ? 'M' : '_' } @headers[1..$#headers]; } while(<DATA>) { s/\s+/ /g; # for uniform spacing my ($name, $letters) = split ' ', $_, 2; $counts{$name}{$_} += ($masks{$_} | $letters) =~ /M/ for @unique; } print "$headname @unique\n"; print "$_ @{ $counts{$_} }{@unique}\n" for sort keys %counts; # Gname G1 G2 G3 # A 1 0 1 # B 1 2 2 # C 2 1 0 __DATA__ Gname G1 G1 G1 G1 G2 G2 G3 A W W M W W W M A W W W W W W W A W W W W W W W B W W W W W M M B M W W W W M M C M M M W W W W C M W W M M W W

Replies are listed 'Best First'.
Re^2: how to sum over rows based on column headings in perl
by angerusso (Novice) on Jul 30, 2015 at 20:34 UTC
    Very grateful for your response! I will spend sometime understanding your elegant piece of code before I try to run it myself. Thanks so much! I need time to understand it.
Re^2: how to sum over rows based on column headings in perl
by angerusso (Novice) on Jul 31, 2015 at 16:46 UTC

    I am trying to understand your code by modifying it myself and trying to read the data through a file

    #!/usr/bin/perl -w use strict; use warnings; use autodie; if (@ARGV != 1){ print "USAGE: ./parse-counts.pl file\n"; exit(-1); } my $mutfile = $ARGV[0]; my (%counts, %unique, %masks); my ($headname, @unique) = grep !$unique{$_}++, my @headers = split ' ' +, <DATA>; for my $label ( @unique ) { $masks{$label} = join ' ', map { $_ eq $label ? 'M' : '_' } @headers[1..$#headers]; } my $line; open(INPUTR,"<$mutfile") || die "Can't open \$mutfile for reading. \n" +; while($line=<INPUTR>){ #while(<DATA>) { chomp $line; s/\s+/ /g; # for uniform spacing my ($name, $letters) = split ' ', $line, 2; $counts{$name}{$line} += ($masks{$line} | $letters) =~ /M/ for @un +ique; } print "$headname @unique\n"; print "$line @{ $counts{$_} }{@unique}\n" for sort keys %counts;

    I am sorry i don't understand what I did wrong as now I get error as:

    ./parse-counts.pl junk.txt Missing right curly or square bracket at ./parse-counts.pl line 36, at + end of line syntax error at ./parse-counts.pl line 36, at EOF Execution of ./parse-counts.pl aborted due to compilation errors.

      That error is pretty clear! You are missing the right curly bracket closing the while loop you opened on line 26. The end of line 36 is given as the location of the error because that's where Perl runs out of code in which to find the closing bracket.

      You have a stray opening bracket under the line you commented out at line 27. This is a good example of why you should have consistent syntax: your commented-out while statement has the bracket on the next line, while the new while statement has the bracket on the same line. Pick one and stick with it. In my view, putting the brackets on the same line reduces the chances of them going astray.

      Update: explain cause of error

      The way forward always starts with a minimal test.

        Actually, I think I did reply too fast! Obviously i don't understand map function the way you used it.

        Could you explain your usage of map? E.g. I changed "M" to "MUTS" and the results are screwed up. Wherever you used "M", I changed it to "MUTS" and changed "M" in datafile to "MUTS" and I don't get the expected result.

        map { $_ eq $label ? 'MUTS' : '_' } @headers[1..$#headers];

        My modified code gives wrong results

        #!/usr/bin/perl use strict; use warnings; use autodie; if (@ARGV != 1){ print "USAGE: ./parse-counts.pl file\n"; exit(1); } my $mutfile = $ARGV[0]; open(INPUTR,"<$mutfile") or die "Can't open \$mutfile for reading. \n" +; my (%counts, %unique, %masks); my ($headname, @unique) = grep !$unique{$_}++, my @headers = split /\t +/, <INPUTR>; # the basic syntax is @out = map { CODE } @in; for my $label ( @unique ) { $masks{$label} = join "\t", map { $_ eq $label ? 'MUTS' : '_' } @headers[1..$#headers]; } my $line; while($line=<INPUTR>) #while(<DATA>) { chomp $line; $line =~ s/\t/\t/g; # for uniform spacing my ($name, $letters) = split /\t/, $line, 2; $counts{$name}{$_} += ($masks{$_} | $letters) =~ /MUTS/ for @uniqu +e; print $name."\n"; print $letters."\n"; } print "$headname @unique\n"; print "$_ @{ $counts{$_} }{@unique}\n" for sort keys %counts; The output produced is: Gname G1 G2 G3 A 3 2 0 B 2 1 1 C 2 0 0 The modified datafiles is Gname G1 G1 G1 G1 G2 G2 G3 A W W MUTS W W W MUTS A W W W W W W W A W W W W W W W B W W W W W MUTS MUTS B MUTS W W W W MUTS MUTS C MUTS MUTS MUTS W W W W C MUTS W W MUTS MUTS W W
      #!/usr/bin/perl use strict; use warnings; use autodie; if (@ARGV != 1){ print "USAGE: ./parse-counts.pl file\n"; exit(1); } my $mutfile = $ARGV[0]; open(INPUTR,"<$mutfile") or die "Can't open \$mutfile for reading. \n" +; my (%counts, %unique, %masks); my ($headname, @unique) = grep !$unique{$_}++, my @headers = split ' ' +, <INPUTR>; for my $label ( @unique ) { $masks{$label} = join ' ', map { $_ eq $label ? 'M' : '_' } @headers[1..$#headers]; } my $line; while($line=<INPUTR>) #while(<DATA>) { chomp $line; $line =~ s/\s+/ /g; # for uniform spacing my ($name, $letters) = split ' ', $line, 2; $counts{$name}{$_} += ($masks{$_} | $letters) =~ /M/ for @unique; } print "$headname @unique\n"; print "$_ @{ $counts{$_} }{@unique}\n" for sort keys %counts;

        Thanks so much! It works beautifully and I understand it much better now. am also glad I learnt "map" function today. Also learnt how to use a 3rd parameter in the split function. Hoping to use them again for a new codes I plan to write in future. Will practice!