Re^2: Run length encode a bit vector

by Anonymous Monk
on Jan 06, 2012 at 00:13 UTC ( [id://946506]=note )


in reply to Re: Run length encode a bit vector
in thread Run length encode a bit vector

-- It will depend on whether you have longish sequences of contiguous ones or zeros.

The one set of indexes (of 25 sets) that I've analyzed consists of 88 x 31MB vectors. They vary between 86% and 98% sparse (measured by zero bytes rather than zero bits).
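For reference, here is a minimal sketch of how the two sparsity measures differ, written in Python for illustration. The function name `sparsity` and the sample bytes are made up. Counting zero bytes can only understate bit-level sparsity, because a byte with a single set bit counts as fully non-zero.

```python
def sparsity(vector: bytes):
    """Return (zero-byte fraction, zero-bit fraction) for a bit vector."""
    total_bytes = len(vector)
    # Fraction of bytes that are entirely zero.
    zero_bytes = vector.count(0)
    # Fraction of individual bits that are zero.
    set_bits = sum(bin(b).count("1") for b in vector)
    zero_bits = total_bytes * 8 - set_bits
    return zero_bytes / total_bytes, zero_bits / (total_bytes * 8)


# Hypothetical 8-byte vector: six zero bytes, one byte with one bit set,
# one byte with all eight bits set.
sample = bytes([0, 0, 0, 0x01, 0, 0, 0xFF, 0])
by_bytes, by_bits = sparsity(sample)
print(by_bytes, by_bits)  # 0.75 0.859375
```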

The largest 0 runs range between 8 and 12 million bits. The largest 1 run is 67 bits. By packing the run counts as 0/1 pairs into 32-bit words, with 24 bits for the 0 run and 8 bits for the 1 run, I can reduce the size by more than two-thirds and am still able to perform boolean operations without decompressing first.
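A minimal sketch of this kind of packing, written in Python for illustration. It assumes the 0-run count sits in the high 24 bits of each word and the following 1-run count in the low 8 bits. The helper names (`pack_runs`, `unpack_runs`, `and_packed`) and the way oversize runs are split across extra words are illustrative assumptions. A 24-bit field holds runs up to 16,777,215, so 12-million-bit zero runs fit, and an 8-bit field holds up to 255, so a 67-bit one run fits.

```python
import struct
from itertools import groupby

ZERO_BITS, ONE_BITS = 24, 8
MAX_ZERO = (1 << ZERO_BITS) - 1  # 16,777,215
MAX_ONE = (1 << ONE_BITS) - 1    # 255


def bits_to_runs(bits):
    """Turn an iterable of 0/1 values into (bit, length) runs."""
    for bit, group in groupby(bits):
        yield bit, sum(1 for _ in group)


def runs_to_bits(runs):
    """Expand (bit, length) runs back into a list of 0/1 values."""
    return [bit for bit, length in runs for _ in range(length)]


def pack_runs(runs):
    """Pack runs into 32-bit words: (zero count << 8) | one count."""
    words, zeros = [], 0
    for bit, length in runs:
        if bit == 0:
            zeros += length
            # Assumed overflow rule: emit zero-only words until the rest fits.
            while zeros > MAX_ZERO:
                words.append(MAX_ZERO << ONE_BITS)
                zeros -= MAX_ZERO
        else:
            # Assumed overflow rule: split long one runs into chunks of 255.
            while length > 0:
                ones = min(length, MAX_ONE)
                words.append((zeros << ONE_BITS) | ones)
                zeros, length = 0, length - ones
    if zeros:
        words.append(zeros << ONE_BITS)  # trailing zeros with no ones after
    return words


def unpack_runs(words):
    """Yield non-empty (bit, length) runs from packed words."""
    for w in words:
        zeros, ones = w >> ONE_BITS, w & MAX_ONE
        if zeros:
            yield 0, zeros
        if ones:
            yield 1, ones


def and_packed(a, b):
    """AND two packed vectors of equal bit length, run by run."""
    ra, rb = unpack_runs(a), unpack_runs(b)
    out = []
    bit_a, len_a = next(ra, (0, 0))
    bit_b, len_b = next(rb, (0, 0))
    while len_a and len_b:
        step = min(len_a, len_b)  # overlap of the two current runs
        out.append((bit_a & bit_b, step))
        len_a -= step
        len_b -= step
        if not len_a:
            bit_a, len_a = next(ra, (0, 0))
        if not len_b:
            bit_b, len_b = next(rb, (0, 0))
    return pack_runs(out)


def to_bytes(words):
    """Serialize packed words as little-endian unsigned 32-bit integers."""
    return struct.pack("<%dI" % len(words), *words)


# Made-up example vectors.
a = [0, 0, 1, 1, 1, 0, 0, 0, 1, 0]
b = [0, 1, 1, 0, 1, 0, 0, 0, 1, 1]
pa, pb = pack_runs(bits_to_runs(a)), pack_runs(bits_to_runs(b))
print(pa)                                         # [515, 769, 256]
print(runs_to_bits(unpack_runs(and_packed(pa, pb))))
```

The AND walks both run streams in step and only ever handles run lengths, so its cost scales with the number of runs rather than the number of bits. OR and XOR follow the same loop with a different operator on the two run bits.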

For the underlying principles see http://crd.lbl.gov/~kewu/ps/LBNL-49626.pdf.
