in reply to Code Optimization
G'day azheid,
Here's your code cut down to a skeleton, showing where I think your two biggest problems are:
    if(!($permute)){
        open(SEQ,'<',$sequence_fname)||die "Cannot open sequence file\n";
        while (my @line=split (/\t/,<SEQ>)){
            ...
        }
        ...
    }
    else{
        for (...) {
            ...
            open(SEQ,'<',$sequence_fname)||die "Cannot open sequence file\n";#open the sequence file
            while (my @line=split (/\t/,<SEQ>)){#for each sequence record the nmer information
                ...
            }
            close SEQ;
            open(OUT,">>$out")||die "Cannot open $out\n";
            foreach my $key(keys %ktc){
                ...
                print OUT ...
                ...
            }
            close OUT;
        }
    }
You open the SEQ file, read the data from disk, and parse it multiple times: you only need to do this once.
Also, you open and close the OUT file for appending multiple times: you only need to do this once.
Without making changes to your coding style, I suspect this, which opens each of those files only once, would be substantially faster:
    # Read and parse the sequence file once, keeping the fields in memory
    my @seq_data;
    open(SEQ,'<',$sequence_fname)||die "Cannot open sequence file\n";
    while (<SEQ>) {
        push @seq_data, [split /\t/];
    }
    close SEQ;

    if(!($permute)){
        for (@seq_data) {
            my @line = @$_;
            ...
        }
        ...
    }
    else{
        # Open the output file once, before the loop
        open(OUT,">>$out")||die "Cannot open $out\n";
        for (...) {
            ...
            for (@seq_data) {
                my @line = @$_;
                ...
            }
            foreach my $key(keys %ktc){
                ...
                print OUT ...
                ...
            }
        }
        close OUT;
    }
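If it helps to see the pattern in isolation, here is a minimal, self-contained sketch of the same idea. It is not your script: load_rows and write_passes are hypothetical names, the per-pass field count merely stands in for your real processing, and I've assumed lexical filehandles and a chomp on each line, neither of which your code uses.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: read a tab-separated file ONCE and return a
# reference to an array of rows, each row being an array ref of fields.
sub load_rows {
    my ($fname) = @_;
    open(my $in, '<', $fname) or die "Cannot open $fname: $!\n";
    my @rows;
    while (my $line = <$in>) {
        chomp $line;                       # assumption: drop the trailing newline
        push @rows, [split /\t/, $line];
    }
    close $in;
    return \@rows;
}

# Hypothetical stand-in for the permutation loop: the output file is
# opened ONCE for appending, and every pass reuses the in-memory rows
# instead of going back to disk.
sub write_passes {
    my ($rows, $out_fname, $passes) = @_;
    open(my $out, '>>', $out_fname) or die "Cannot open $out_fname: $!\n";
    for my $pass (1 .. $passes) {
        my $fields = 0;
        $fields += @$_ for @$rows;         # placeholder for the real work
        print {$out} "pass $pass: $fields fields\n";
    }
    close $out or die "Cannot close $out_fname: $!\n";
    return;
}
```

Calling load_rows once and passing its result to write_passes means the number of open, read and parse operations no longer grows with the number of passes. The cost is holding the whole file in memory, which I'm assuming is acceptable for your data.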
There may be other areas where substantial gains could be made, but I haven't looked beyond the two I/O ones at this point.
-- Ken