Your logic is not too far off. One way to fix your issue with only minor changes to the code is to add a level of depth to
your hash of arrays, so that each chr:gene key holds separate start, end and ex arrays:
#!/usr/bin/perl -w
use strict;
use List::Util qw(min max);
#my $input0 = $ARGV[0];
#open (DATA,$input0) || die "cannot open input0";
my %gene_hash;
while (<DATA>)
{
    chomp;
    my ($chr, $start, $end, $gene, $ex) = split(/\t/, $_);
    my $gene_key = $chr.":".$gene;
    # one array per field under each chr:gene key
    push( @{ $gene_hash{$gene_key}{start} }, $start );
    push( @{ $gene_hash{$gene_key}{end} }, $end );
    push( @{ $gene_hash{$gene_key}{ex} }, $ex );
}
foreach my $key (keys %gene_hash)
{
    my ($c, $g) = split(/\:/, $key );
    print "$c\t$g\t";
    # lowest start, highest end and highest exon number for this gene
    my $Low     = min( @{ $gene_hash{$key}{start} } );
    my $High    = max( @{ $gene_hash{$key}{end} } );
    my $High_ex = max( @{ $gene_hash{$key}{ex} } );
    print "$Low\t$High\t$High_ex";
    print "\n";
}
__DATA__
chrX	2680092	2744539	XG	1
chrX	2680090	2744529	XG	2
chrX	2680080	2744519	XG	3
chrX	2680070	2744509	XG	4
chrX	2680070	2744509	DT	1
chrX	2680090	2744519	DT	2
If the modification is unclear, you can use Data::Dumper to output the resulting structure by adding the following after the final loop of your script (before the __DATA__ marker, if you keep one):
use Data::Dumper;
print Dumper \%gene_hash;
A couple of minor things you may consider in addition:
- You should probably get into the habit of using 3-argument open instead of 2-argument open; the difference is explained in perlopentut.
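- You might also consider swapping to Indirect Filehandles (open my $fh, ...). Unlike bareword handles such as DATA, they are lexically scoped and are closed automatically when they go out of scope, which becomes important in larger projects.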
- split acts on $_ if no argument is given, so you could change that call on line 13 to = split(/\t/);
- You delimit your keys with ':'; if you are going to create an amalgam key, you should use a character that is guaranteed not to appear in your file - might I suggest "\t"? That way you don't have to split it again for output.
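To make those suggestions concrete, here is a minimal sketch that applies them together: a 3-argument open with a lexical filehandle, split acting on $_, and a tab-joined key that can be printed without splitting it again. The summarise_genes sub and the sorted output order are my own additions for illustration, not part of your script, and it assumes the input really is tab-separated.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use List::Util qw(min max);

# Hypothetical helper (not in the original script): reads tab-separated
# "chr start end gene exon" records from an open filehandle and returns
# one summary line per chr/gene pair.
sub summarise_genes {
    my ($fh) = @_;
    my %gene_hash;
    while (<$fh>) {
        chomp;
        next unless length;    # skip blank lines
        my ($chr, $start, $end, $gene, $ex) = split /\t/;    # split defaults to $_
        # Tab cannot occur inside a field, so it is a safe key delimiter
        my $gene_key = join "\t", $chr, $gene;
        push @{ $gene_hash{$gene_key}{start} }, $start;
        push @{ $gene_hash{$gene_key}{end} },   $end;
        push @{ $gene_hash{$gene_key}{ex} },    $ex;
    }
    my @lines;
    for my $key (sort keys %gene_hash) {    # sorted only to make the order repeatable
        my $rec = $gene_hash{$key};
        # The key already reads "chr<TAB>gene", so no second split is needed
        push @lines, join "\t", $key,
            min( @{ $rec->{start} } ),
            max( @{ $rec->{end} } ),
            max( @{ $rec->{ex} } );
    }
    return @lines;
}

# 3-argument open with a lexical (indirect) filehandle
if (@ARGV) {
    open my $in, '<', $ARGV[0] or die "Cannot open '$ARGV[0]': $!";
    print "$_\n" for summarise_genes($in);
    close $in;
}
```

Run it as perl script.pl yourfile.txt. Putting the work in a sub that takes a filehandle is just one way to arrange it; the same three changes drop straight into the loop-based script above.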