Your logic is not too far off. One way to fix your issue with only minor changes to the code is to add a layer of depth to your hash of arrays, so that each gene key holds separate start, end and ex arrays:

#!/usr/bin/perl -w
use strict;
use List::Util qw(max);
use List::Util qw(min);

#my $input0 = $ARGV[0];
#open (DATA,$input0) || die "cannot open input0";

my %gene_hash;

while (<DATA>) {
    chomp;
    my ($chr, $start, $end, $gene, $ex) = split(/\t/, $_);
    my $gene_key = $chr.":".$gene;
    push( @{ $gene_hash{$gene_key}{start} }, $start );
    push( @{ $gene_hash{$gene_key}{end} }, $end );
    push( @{ $gene_hash{$gene_key}{ex} }, $ex );
}

foreach my $key (keys %gene_hash) {
    my ($c, $g) = split(/\:/, $key );
    print "$c\t$g\t";
    my $Low     = min( @{ $gene_hash{$key}{start} } );
    my $High    = max( @{ $gene_hash{$key}{end} } );
    my $High_ex = max( @{ $gene_hash{$key}{ex} } );
    print "$Low\t$High\t$High_ex";
    print "\n";
}

__DATA__
chrX	2680092	2744539	XG	1
chrX	2680090	2744529	XG	2
chrX	2680080	2744519	XG	3
chrX	2680070	2744509	XG	4
chrX	2680070	2744509	DT	1
chrX	2680090	2744519	DT	2

If the modification is unclear, you can use Data::Dumper to output the resultant structure by adding the following to the end of your script:

use Data::Dumper;
print Dumper \%gene_hash;
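
To get a feel for what the dump shows, here is a minimal standalone sketch. The hash below is built by hand with made-up values purely to mirror the shape of %gene_hash, and the two Data::Dumper settings are optional choices of mine (stable key order, compact indent), not something your script needs:

```perl
use strict;
use warnings;
use Data::Dumper;

# Sort hash keys so the dump is stable from run to run.
$Data::Dumper::Sortkeys = 1;
# Use a compact, fixed indent instead of the default column-aligned layout.
$Data::Dumper::Indent = 1;

# Hand-built sample with the same shape as %gene_hash (values are illustrative).
my %gene_hash = (
    'chrX:XG' => {
        start => [ 2680092, 2680090 ],
        end   => [ 2744539, 2744529 ],
        ex    => [ 1, 2 ],
    },
);

# Dumper returns the dump as a string; print it like any other string.
my $dump = Dumper(\%gene_hash);
print $dump;
```

Each top-level key is a chr:gene pair, and under it sit three array references, which is exactly what min and max are dereferencing in the output loop.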

A couple of minor things you may consider in addition:

  1. You should probably get into the habit of using 3-argument open instead of 2-argument open; the difference is explained in perlopentut.
  2. You might also consider swapping to Indirect Filehandles. This can become important in larger projects.
  3. split acts on $_ if no string argument is given, so you could change that call on line 13 to = split(/\t/);
  4. You delimit your keys with ':'; if you are going to create an amalgam key, you should use a character that is guaranteed not to appear inside any of your fields - might I suggest "\t"? Since your fields are already tab-separated, a tab cannot occur within one, and the key can be printed as-is, so you don't have to split it again for output.
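
Putting those four suggestions together, a rough sketch might look like the following. The sub name summarise_genes and the in-memory $sample string are my own inventions for illustration (the string just stands in for your real input file); the rest uses only core Perl and List::Util:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use List::Util qw(min max);

# Summarise tab-separated records (chr, start, end, gene, exon) read from an
# already-opened filehandle. Returns one "chr\tgene\tlow\thigh\tmax_exon"
# string per chr/gene pair, sorted by key.
sub summarise_genes {
    my ($fh) = @_;
    my %gene_hash;
    while (<$fh>) {
        chomp;
        # With only a pattern given, split works on $_.
        my ($chr, $start, $end, $gene, $ex) = split /\t/;
        # A tab cannot occur inside a field, so it is a safe key delimiter.
        my $gene_key = "$chr\t$gene";
        push @{ $gene_hash{$gene_key}{start} }, $start;
        push @{ $gene_hash{$gene_key}{end} },   $end;
        push @{ $gene_hash{$gene_key}{ex} },    $ex;
    }
    # The key is already "chr\tgene", so it goes straight into the output.
    return map {
        join "\t", $_,
            min( @{ $gene_hash{$_}{start} } ),
            max( @{ $gene_hash{$_}{end} } ),
            max( @{ $gene_hash{$_}{ex} } );
    } sort keys %gene_hash;
}

# Hypothetical input: an in-memory string stands in for the real file.
my $sample = join( "\n",
    "chrX\t2680092\t2744539\tXG\t1",
    "chrX\t2680090\t2744529\tXG\t2",
    "chrX\t2680080\t2744519\tXG\t3",
    "chrX\t2680070\t2744509\tXG\t4",
    "chrX\t2680070\t2744509\tDT\t1",
    "chrX\t2680090\t2744519\tDT\t2",
) . "\n";

# Three-argument open with a lexical (indirect) filehandle. For a real file:
#   open my $fh, '<', $path or die "Cannot open $path: $!";
open my $fh, '<', \$sample or die "Cannot open in-memory sample: $!";
my @lines = summarise_genes($fh);
close $fh;

print "$_\n" for @lines;
```

With the sample data this prints one line for DT (2680070, 2744519, 2) and one for XG (2680070, 2744539, 4). Sorting the keys is an extra I added so the output order is repeatable; plain keys %gene_hash returns them in no particular order.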

In reply to Re: How to group by a column and calculate max/min on another by kennethk
in thread How to group by a column and calculate max/min on another by perl_paduan
