But if you need the first column as well, and the columns are tab-separated, this script gives you, for each of chr1..chrN, the minimum of column 2 and the maximum of column 3 across all the given files. Do you need the file names as well? I hope I understood the problem correctly.
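#!/usr/bin/perl
use strict;
use warnings;

# %minmax maps each column 1 value (chr1..chrN) to { min => ..., max => ... }
my %minmax;
foreach my $file (@ARGV) {
    open(my $fh, '<', $file) or die "Cannot open $file: $!";
    while (my $line = <$fh>) {
        chomp($line);               # chop would eat a real character if the last line has no newline
        next unless $line =~ /\S/;  # skip blank lines
        my @p = split(/\t/, $line);
        # smallest column 2 value seen so far for this key
        if (!exists($minmax{$p[0]}{min}) || $p[1] < $minmax{$p[0]}{min}) {
            $minmax{$p[0]}{min} = $p[1];
        }
        # largest column 3 value seen so far for this key
        if (!exists($minmax{$p[0]}{max}) || $p[2] > $minmax{$p[0]}{max}) {
            $minmax{$p[0]}{max} = $p[2];
        }
    }
    close($fh);
}
# plain string sort: fine for chr1..chr9, but chr10 would sort before chr2
foreach my $chr (sort keys %minmax) {
    print "$chr: min: $minmax{$chr}{min} max: $minmax{$chr}{max}\n";
}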
For the given input (columns separated by tabs):
file1:
chr1 4 60
chr2 2 40
chr3 4 90
chr1 5 40
file2:
chr2 1 30
chr1 6 20
chr4 9 100
file3:
chr1 2 20
chr2 2 90
chr1 6 20
chr4 4 30
file4:
chr2 4 90
chr3 3 90
chr2 4 90
chr4 3 90
chr2 4 30
it would output:
chr1: min: 2 max: 60
chr2: min: 1 max: 90
chr3: min: 3 max: 90
chr4: min: 3 max: 100
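If the file names are needed as well, one way to do it is to store, next to each minimum and maximum, the name of the file it came from. The following is only a sketch under the same tab-separated assumption. The helper update_minmax and the keys min_file and max_file are names made up for this example, and on ties the first file seen is kept.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of the script above): fold one record into
# the table and remember which file supplied each extreme.
sub update_minmax {
    my ($minmax, $file, $chr, $start, $end) = @_;
    my $e = $minmax->{$chr} //= {};    # create the entry on first sight (Perl 5.10+)
    if (!defined $e->{min} || $start < $e->{min}) {
        @$e{qw(min min_file)} = ($start, $file);
    }
    if (!defined $e->{max} || $end > $e->{max}) {
        @$e{qw(max max_file)} = ($end, $file);
    }
    return $minmax;
}

my %minmax;
foreach my $file (@ARGV) {
    open(my $fh, '<', $file) or die "Cannot open $file: $!";
    while (my $line = <$fh>) {
        chomp $line;
        next unless $line =~ /\S/;     # skip blank lines
        my ($chr, $start, $end) = split /\t/, $line;
        update_minmax(\%minmax, $file, $chr, $start, $end);
    }
    close $fh;
}

foreach my $chr (sort keys %minmax) {
    my $e = $minmax{$chr};
    print "$chr: min: $e->{min} ($e->{min_file}) max: $e->{max} ($e->{max_file})\n";
}
```

It is called the same way as the script above, with the files as arguments, and prints the source file in parentheses after each value.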