Just off the top of my head, I'd probably try replacing the 'pack' stuff by producing whole bytes at a time. Something like:
use Algorithm::Loops qw< NestedLoops >; my @bases= qw< A C G T >; my @quad= map { join '', @$_ } NestedLoops( [ ( \@bases ) x 4 ] ); my %byte; @byte{ @quad }= map pack("C",$_), 0 .. %#quad; my $carry= ''; while( <> ) { chomp; substr( $_, 0, 0, $carry ); my $pack= ''; s/(....)/ $pack .= $byte{$1}; '' /g; $carry= $_; print RAM $pack; } print RAM $byte{ substr( $carry . 'AAA', 0, 4 ) } if $carry;
But I haven't looked at the rest of this thread recently nor actually tried my suggestions. I can think of lots of different ways to pull out 4 bases at a time and some ways might have a noticeable impact on speed.
- tye
In reply to Re^3: Sorting Gigabytes of Strings Without Storing Them (bytes)
by tye
in thread Sorting Gigabytes of Strings Without Storing Them
by neversaint
| For: | Use: | ||
| & | & | ||
| < | < | ||
| > | > | ||
| [ | [ | ||
| ] | ] |