Dear Monks, I have a file like this:
chrX 2680092 2744539 XG 1
chrX 2680090 2744529 XG 2
chrX 2680080 2744519 XG 3
chrX 2680070 2744509 XG 4
chrX 2680070 2744509 DT 1
chrX 2680090 2744519 DT 2
I want to obtain as a result a file like this:
chrX 2680070 2744539 XG 4
chrX 2680070 2744519 DT 2
So basically I need to group by columns 1 and 4 and, for each group, get the minimum of column 2, the maximum of column 3 and the maximum of column 5. I've tried this code:
#!/usr/bin/perl -w
use strict;
use List::Util qw(max);
use List::Util qw(min);

my $input0 = $ARGV[0];
open (DATA,$input0) || die "cannot open input0";
my %gene_hash;
while(<DATA>) {
    chomp;
    my ($chr, $start, $end, $gene, $ex) = split(/\t/, $_);
    my $gene_key = $chr.":".$gene;
    push( @{ $gene_hash{$gene_key} }, $start );
    push( @{ $gene_hash{$gene_key} }, $end );
    push( @{ $gene_hash{$gene_key} }, $ex );
}
foreach my $key (keys %gene_hash) {
    my ($c, $g) = split(/\:/, $key );
    print "$c\t$g\t";
    my $Low=min( @ {$gene_hash{$key} } );
    my $High=max( @ {$gene_hash{$key} } );
    my $High_ex=max( @ {$gene_hash{$key} } );
    {
        print "$Low\t$High\t$High_ex";
    }
    print "\n";
}
__DATA__
but I don't know how to keep a separate array for each column inside the hash, so all the values for a key end up pushed into the same array. Obviously the result is a mess:
chrX XG 1 2744539 2744539
chrX DT 1 2744519 2744519
Can you help me?

Many thanks!


In reply to How to group by a column and calculate max/min on another by perl_paduan
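
One possible way to keep the three columns apart, as a rough sketch rather than a definitive answer: instead of pushing everything into one array per key, store a small hash per key with separate start, end and ex fields, and update each field as the rows come in. The names summarize, %stats and @order are made up for illustration. The sketch also assumes that splitting on any whitespace is acceptable (the original splits on tabs only) and takes its rows from a list in the script instead of from $ARGV[0].

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: group rows by (chr, gene) and keep, per group,
# the smallest start, the largest end and the largest exon number.
sub summarize {
    my @lines = @_;
    my (%stats, @order);    # @order remembers first-seen order of the keys
    for my $line (@lines) {
        chomp $line;
        next unless length $line;
        # Assumption: fields are separated by any whitespace (tabs or spaces).
        my ($chr, $start, $end, $gene, $ex) = split ' ', $line;
        my $key = "$chr:$gene";
        my $s   = $stats{$key};
        if (!$s) {
            # First row for this key: it is the current min and max.
            push @order, $key;
            $stats{$key} = {
                chr => $chr, gene => $gene,
                start => $start, end => $end, ex => $ex,
            };
            next;
        }
        $s->{start} = $start if $start < $s->{start};
        $s->{end}   = $end   if $end   > $s->{end};
        $s->{ex}    = $ex    if $ex    > $s->{ex};
    }
    # One tab-separated row per group, in the input column order.
    return map { join "\t", @{ $stats{$_} }{qw(chr start end gene ex)} } @order;
}

my @rows = (
    "chrX 2680092 2744539 XG 1",
    "chrX 2680090 2744529 XG 2",
    "chrX 2680080 2744519 XG 3",
    "chrX 2680070 2744509 XG 4",
    "chrX 2680070 2744509 DT 1",
    "chrX 2680090 2744519 DT 2",
);
print "$_\n" for summarize(@rows);
```

To read from a file as in the original script, the list could be replaced by the lines of the opened filehandle. Tracking the running min and max per field also avoids holding every value in memory, so List::Util is not needed here.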
