I have a file like this:

```
chrX 2680092 2744539 XG 1
chrX 2680090 2744529 XG 2
chrX 2680080 2744519 XG 3
chrX 2680070 2744509 XG 4
chrX 2680070 2744509 DT 1
chrX 2680090 2744519 DT 2
```

I want to obtain as a result a file like this:

```
chrX 2680070 2744539 XG 4
chrX 2680070 2744519 DT 2
```

So basically I need to group by columns 1 and 4, and obtain the min value of column 2, the max value of column 3 and the max value of column 5. I've tried with this code:

```perl
#!/usr/bin/perl -w
use strict;
use List::Util qw(max);
use List::Util qw(min);

my $input0 = $ARGV[0];
open (DATA,$input0) || die "cannot open input0";

my %gene_hash;
while(<DATA>) {
    chomp;
    my ($chr, $start, $end, $gene, $ex) = split(/\t/, $_);
    my $gene_key = $chr.":".$gene;
    push( @{ $gene_hash{$gene_key} }, $start );
    push( @{ $gene_hash{$gene_key} }, $end );
    push( @{ $gene_hash{$gene_key} }, $ex );
}

foreach my $key (keys %gene_hash) {
    my ($c, $g) = split(/\:/, $key );
    print "$c\t$g\t";
    my $Low=min( @ {$gene_hash{$key} } );
    my $High=max( @ {$gene_hash{$key} } );
    my $High_ex=max( @ {$gene_hash{$key} } );
    {
        print "$Low\t$High\t$High_ex";
    }
    print "\n";
}
__DATA__
```

but I don't know how to create different arrays for the hash, so basically I push all the values into the same array... Obviously the result is a mess:

```
chrX XG 1 2744539 2744539
chrX DT 1 2744519 2744519
```

Can you help me?
Many thanks!
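For reference, here is a minimal sketch of one way the grouping described above could be done, keeping a separate array per column under each `chr:gene` key (a hash of hashes of arrays). The sub name `summarize_genes` and the inner keys `start`, `end` and `ex` are made up for illustration, and the sample rows are copied from the question rather than read from a file. It splits on any whitespace, which is an assumption so that both tab- and space-separated rows work.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use List::Util qw(min max);

# Group rows by chromosome + gene and return one tab-separated summary line
# per group: chr, min start, max end, gene, max exon number.
# Groups are returned in the order they first appear in the input.
sub summarize_genes {
    my @rows = @_;
    my (%group, @order);
    for my $row (@rows) {
        next unless $row =~ /\S/;    # skip blank lines
        my ($chr, $start, $end, $gene, $ex) = split ' ', $row;
        my $key = "$chr:$gene";
        push @order, $key unless exists $group{$key};
        # one array per column, so the three kinds of value never get mixed
        push @{ $group{$key}{start} }, $start;
        push @{ $group{$key}{end} },   $end;
        push @{ $group{$key}{ex} },    $ex;
    }
    my @out;
    for my $key (@order) {
        my ($chr, $gene) = split /:/, $key;
        my $g = $group{$key};
        push @out, join("\t",
            $chr,
            min(@{ $g->{start} }),
            max(@{ $g->{end} }),
            $gene,
            max(@{ $g->{ex} }),
        );
    }
    return @out;
}

# Sample rows copied from the question
my @sample = (
    "chrX\t2680092\t2744539\tXG\t1",
    "chrX\t2680090\t2744529\tXG\t2",
    "chrX\t2680080\t2744519\tXG\t3",
    "chrX\t2680070\t2744509\tXG\t4",
    "chrX\t2680070\t2744509\tDT\t1",
    "chrX\t2680090\t2744519\tDT\t2",
);
print "$_\n" for summarize_genes(@sample);
```

To run it on a real file, the sample list could be replaced by lines read from the file named in `$ARGV[0]`. Tracking first-seen order in `@order` is a design choice here, because `keys %group` returns the groups in no particular order.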