in reply to Compressing String for Printing
What have you tried so far? Have you looked at the gzip or bzip2 programs? You can compress your file with them and then read from your file by opening a pipe to them:
    my $packer = 'bzip2';
    my $file   = 'data.txt.bz2';
    open my $fh, "$packer -cd $file |"
        or die "Couldn't decompress '$file': $!/$?";
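If the file name might contain spaces or shell metacharacters, the list form of a piped open avoids the shell entirely. Here is a minimal sketch of that variant; the helper name `read_lines_from` is made up for illustration, and it assumes a Perl and platform where list-form pipe opens are available:

```perl
use strict;
use warnings;

# Hypothetical helper: run a command and return its output lines.
# The list form of open bypasses the shell, so odd characters in
# file names are passed to the command untouched.
sub read_lines_from {
    my (@command) = @_;
    open my $fh, '-|', @command
        or die "Couldn't start '@command': $!";
    chomp( my @lines = <$fh> );
    # close() on a pipe returns false if the command exited non-zero.
    close $fh
        or die "'@command' failed: "
            . ( $! ? $! : 'exit status ' . ( $? >> 8 ) );
    return @lines;
}

# Usage, assuming bzip2 is installed and data.txt.bz2 exists:
# my @lines = read_lines_from( 'bzip2', '-cd', 'data.txt.bz2' );
```

For a large file you would probably loop over the filehandle line by line instead of slurping everything into an array.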
Alternatively, you could encode each of the four characters into two bits, thus storing four characters per byte. I guess this approach won't be more efficient space-wise than the gzip or bzip2 approach, but it retains the ability to do random reading in your file:
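    use strict;

    # Two-bit code for each base
    my %charmap = (
        A => '00',
        C => '01',
        G => '10',
        T => '11',
    );

    my $string = 'GATTACA';
    # Turn the sequence into a string of '0' and '1' characters
    $string =~ s/(.)/$charmap{$1}/ge;
    print "$string\n";

    # Pack the bit string into bytes (this prints raw binary data)
    my $compressed = pack 'b*', $string;
    print "$compressed\n";
    printf "%d bytes\n", length $compressed;

    # now use vec() to get at the single parts of $compressed

    # Unpacking yields the bit string again, padded to a whole byte
    my $decompressed = unpack 'b*', $compressed;
    print "$decompressed\n";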
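The comment in the code above mentions vec() without showing it, so here is a rough sketch of what random access could look like. It is an illustration, not a drop-in continuation: it writes the two-bit codes with vec() directly, so the bit layout is not the same as the one produced by `pack 'b*'` above. The helper names `pack_bases` and `base_at` and the numeric codes are made up for this example.

```perl
use strict;
use warnings;

# Hypothetical 2-bit codes; any fixed assignment works as long as
# encoding and decoding agree.
my %code_for = ( A => 0, C => 1, G => 2, T => 3 );
my @base_for = qw(A C G T);

# Pack a string of A/C/G/T into 2 bits per base using vec().
sub pack_bases {
    my ($sequence) = @_;
    my $packed = '';
    my $i      = 0;
    for my $base ( split //, $sequence ) {
        die "Unexpected character '$base'" unless exists $code_for{$base};
        vec( $packed, $i++, 2 ) = $code_for{$base};
    }
    return $packed;
}

# Random access: fetch the base at a zero-based position
# without unpacking the rest of the string.
sub base_at {
    my ( $packed, $position ) = @_;
    return $base_for[ vec( $packed, $position, 2 ) ];
}

my $sequence = 'GATTACA';
my $packed   = pack_bases($sequence);

printf "%d bases stored in %d bytes\n", length($sequence), length($packed);
print "base at position 4: ", base_at( $packed, 4 ), "\n";

# The packed form does not record the sequence length,
# so it has to be kept separately.
my $restored = join '', map { base_at( $packed, $_ ) } 0 .. length($sequence) - 1;
print "$restored\n";
```

Note that the number of bases has to be stored somewhere alongside the packed data, because the last byte may be only partly used and the padding bits would otherwise decode as extra A's.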
But have you looked at BioPerl? I'm pretty sure that they have support for that stuff.
Re^2: Compressing String for Printing
by eye (Chaplain) on Dec 26, 2008 at 07:15 UTC