There's a fair chunk of unconventional and sloppy code in there which I haven't time to comment on blow by blow, so instead here's the first chunk of the code cleaned up somewhat:
    use strict;
    use warnings;

    my $start_time = time;
    my ($input1, $input2) = @ARGV;

    open my $in, '<', $input1 or die "Can't read source file $input1: $!\n";
    my @lengths = grep {!m/>/} <$in>;
    close $in;
    chomp @lengths;

    open $in, '<', $input2 or die "Can't read source file $input2: $!\n";
    my @source = <$in>;
    close $in;
    chomp @source;

    #********************#
    # CALCULATE LENGTH DISTRIBUTION FROM INPUT FILE #1
    #********************#

    my @sorted = sort {$a <=> $b} @lengths;
    my %seen;
    my @uniques = grep {!$seen{$_}++} @sorted;

    # hash of predicted sORF length (key) and number of times (value) that
    # size is observed in the multifasta input file #1
    my %dstrbtn_hash;
    for my $len (@uniques) {
        $dstrbtn_hash{$len} = grep {$len == $_} @sorted;
    }
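As an aside, the sort/uniq/grep dance above can be collapsed into a single pass: incrementing a hash entry per length builds the distribution directly, avoiding the O(n^2) cost of grep-counting inside a loop. A minimal sketch (the sample @lengths data here is made up for illustration; %dstrbtn_hash matches the name used above):

    use strict;
    use warnings;

    # Stand-in for the lengths read from input file #1
    my @lengths = (30, 45, 30, 60, 45, 30);

    # One pass builds the distribution: no sort, no uniq, no nested grep
    my %dstrbtn_hash;
    $dstrbtn_hash{$_}++ for @lengths;

    # Sorting the keys recovers the ordered unique lengths when needed
    for my $len (sort {$a <=> $b} keys %dstrbtn_hash) {
        print "$len => $dstrbtn_hash{$len}\n";
    }

That turns the distribution step from quadratic to linear in the number of sequences, which matters for large multifasta files.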
which probably doesn't solve the problem, but maybe points you in the direction of better technique.
I suspect the real issue is in the EXTRACT and START "loops": depending on the input data, those loops could spend an indeterminately long time achieving very little. A small sample of your input data would help us understand what's supposed to be going on there and find a more deterministic way of calculating the values you need.
In reply to Re: Speeding up stalled script by GrandFather
in thread Speeding up stalled script by onlyIDleft