in reply to Byte allign compression in Perl..
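Here is an implementation of the MSB serialisation strategy, using fixed-width 4-byte ints for simplicity. Each doc_id is written with its most significant bit set and each position with that bit cleared, so the reader can tell the two apart. This task really asks to be implemented in C, but it can of course be done in Perl. In the code, << is the left bit-shift operator, | is bitwise OR and & is bitwise AND.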
use strict;
use warnings;
use Data::Dumper;

my $data = {
    1 => [ 1..10 ],
    2 => [ 11..20 ],
    3 => [ 21..30 ],
};

# The top bit of a 32 bit int flags a doc_id
my $MSB = 1<<31;
print "$MSB is MSB in decimal\n";
my $SET_MSB = pack "I", $MSB;
printf "%s SET MASK in binary\n", (unpack "b32", $SET_MSB);
my $UNSET_MSB = pack "I", $MSB-1;
printf "%s UNSET MASK in binary\n", (unpack "b32", $UNSET_MSB);

my $res = serialise($data);
my $dat = unserialise($res);
print Data::Dumper::Dumper($dat);

sub serialise {
    my $data = shift;
    my $str = '';
    for my $doc_id ( keys %$data ) {
        # doc_id: string-wise OR sets the MSB
        $str .= (pack "I", $doc_id) | $SET_MSB;
        for my $pos ( @{$data->{$doc_id}} ) {
            # position: string-wise AND clears the MSB
            $str .= (pack "I", $pos) & $UNSET_MSB;
        }
    }
    return $str;
}

sub unserialise {
    my $data = shift;
    my $dat = {};
    my $doc_id = undef;
    for ( my $i = 0; $i < length $data; $i += 4 ) {
        my $int = unpack "I", (substr $data, $i, 4);
        if ( $int & $MSB ) {
            # MSB set: this int is a doc_id (this test also handles doc_id 0)
            $doc_id = unpack "I", ((pack "I", $int) & $UNSET_MSB);
        }
        else {
            push @{$dat->{$doc_id}}, $int;
        }
    }
    return $dat;
}
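The same idea can also be sketched with plain integer arithmetic instead of bitwise operations on packed strings. What follows is only an illustration, not the code from the post: the names serialise_n and unserialise_n are made up, and it assumes the "N" pack template (32-bit big-endian unsigned) is acceptable in place of the native "I", which makes the output byte-order independent. Doc ids are sorted so the output is deterministic.

```perl
use strict;
use warnings;

use constant MSB => 1 << 31;    # flag bit that marks a doc_id

# Hypothetical helper: flatten { doc_id => [positions] } into packed ints.
sub serialise_n {
    my ($data) = @_;
    my @ints;
    for my $doc_id ( sort { $a <=> $b } keys %$data ) {
        push @ints, $doc_id | MSB;                                   # set the flag
        push @ints, map { $_ & ( MSB - 1 ) } @{ $data->{$doc_id} };  # clear it
    }
    return pack 'N*', @ints;
}

# Hypothetical helper: rebuild the hash from the packed string.
sub unserialise_n {
    my ($str) = @_;
    my ( %dat, $doc_id );
    for my $int ( unpack 'N*', $str ) {
        if ( $int & MSB ) {
            $doc_id = $int & ( MSB - 1 );    # strip the flag
            $dat{$doc_id} ||= [];
        }
        else {
            die "position before any doc_id\n" unless defined $doc_id;
            push @{ $dat{$doc_id} }, $int;
        }
    }
    return \%dat;
}

my $packed = serialise_n( { 0 => [ 1, 2 ], 7 => [3] } );
my $back   = unserialise_n($packed);
print length($packed), " bytes\n";    # 5 ints x 4 bytes = 20 bytes
```

Testing the flag with & rather than comparing against the MSB value means a doc_id of 0 round-trips correctly. As in the post, doc ids and positions must fit in 31 bits.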
Replies are listed 'Best First'.

Re^2: Byte allign compression in Perl..
by MimisIVI (Acolyte) on Apr 11, 2008 at 17:18 UTC
by tachyon-II (Chaplain) on Apr 11, 2008 at 17:34 UTC
by MimisIVI (Acolyte) on Apr 11, 2008 at 17:51 UTC
by tachyon-II (Chaplain) on Apr 11, 2008 at 19:11 UTC
by tachyon-II (Chaplain) on Apr 12, 2008 at 03:43 UTC
by BrowserUk (Patriarch) on Apr 12, 2008 at 04:37 UTC