What have you tried so far? Have you looked at the gzip or bzip2 programs? You can compress your file with them and then read from your file by opening a pipe to them:
    my $packer = 'bzip2';
    my $file   = 'data.txt.bz2';
    # List-form open runs the packer directly, without passing the filename through a shell
    open my $fh, '-|', $packer, '-cd', $file
        or die "Couldn't decompress '$file': $!/$?";
Alternatively, you could encode each of the four characters (A, C, G, T) in two bits, storing four characters per byte. I doubt this approach is more space-efficient than gzip or bzip2, but it retains the ability to read from arbitrary positions in your file:
    use strict;

    my %charmap = (
        A => '00',
        C => '01',
        G => '10',
        T => '11',
    );

    my $string = 'GATTACA';
    $string =~ s/(.)/$charmap{$1}/ge;
    print "$string\n";

    my $compressed = pack 'b*', $string;
    print "$compressed\n";
    printf "%d bytes\n", length $compressed;

    # now use vec() to get at the single parts of $compressed
    my $decompressed = unpack 'b*', $compressed;
    print "$decompressed\n";
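The vec() access mentioned in the comment above could look roughly like the following. This is only a sketch: the helper names encode_bases and base_at are made up for illustration, and it uses its own numeric codes (A=0, C=1, G=2, T=3) written with vec() itself, so the byte layout is not guaranteed to match what pack 'b*' produces from %charmap.

```perl
use strict;
use warnings;

# Assumed 2-bit codes for this sketch (not taken from %charmap above)
my %code  = (A => 0, C => 1, G => 2, T => 3);
my @bases = qw(A C G T);

# Hypothetical helper: pack a string of A/C/G/T into 2 bits per base
sub encode_bases {
    my ($seq) = @_;
    my $packed = '';
    my $i = 0;
    vec($packed, $i++, 2) = $code{$_} for split //, $seq;
    return $packed;
}

# Hypothetical helper: read the base at a given position without unpacking the rest
sub base_at {
    my ($packed, $index) = @_;
    return $bases[ vec($packed, $index, 2) ];
}

my $packed = encode_bases('GATTACA');
printf "%d bytes\n", length $packed;   # 7 bases fit into 2 bytes
print base_at($packed, 2), "\n";       # third base of GATTACA: T
```

Because vec() addresses the 2-bit elements by index, you can seek to byte int($index / 4) in a file, read a small chunk, and decode only the bases you need.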
But have you looked at BioPerl? I'm pretty sure that they have support for that stuff.
In reply to Re: Compressing String for Printing by Corion, in thread Compressing String for Printing by neversaint