in reply to string to more compact format
How can I convert, for example, 'AGTCACA' to a more compact string with fewer bits?
Build a hash of all 256 4-tuples over (A, G, T, C), mapping each tuple to one byte. Then iterate over the string four characters at a time and use the hash for lookup.
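A minimal sketch of that lookup table, in case it helps. The names `%pack`, `%unpack`, `compress` and `decompress` are made up for illustration, the byte assigned to each tuple is arbitrary, and padding short strings with 'A' (so the original length has to be stored separately) is my own assumption, not something the question specifies.

```perl
use strict;
use warnings;

# Build the lookup table: every 4-letter combination of A, G, T, C
# (4**4 = 256 keys) maps to one byte, chr(0) .. chr(255).
my @bases = qw(A G T C);
my (%pack, %unpack);
my $code = 0;
for my $w (@bases) {
    for my $x (@bases) {
        for my $y (@bases) {
            for my $z (@bases) {
                my $tuple = "$w$x$y$z";
                $pack{$tuple}       = chr $code;
                $unpack{ chr $code } = $tuple;
                $code++;
            }
        }
    }
}

# Hypothetical helper: pad with 'A' to a multiple of four (an assumption),
# then replace each 4-character block by its byte.
# Input is assumed to contain only the letters A, G, T, C.
sub compress {
    my ($seq) = @_;
    my $pad    = ( 4 - length($seq) % 4 ) % 4;
    my $padded = $seq . ( 'A' x $pad );
    return join '', map { $pack{$_} } $padded =~ /(.{4})/g;
}

# Hypothetical inverse: expand each byte back to its tuple and cut the
# result to the original length to drop the padding.
sub decompress {
    my ($packed, $length) = @_;
    my $seq = join '', map { $unpack{$_} } split //, $packed;
    return substr $seq, 0, $length;
}
```

With this, `compress('AGTCACA')` yields a 2-byte string, and `decompress` needs the original length (7) to strip the padding again.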
Basically, I want to store these strings in a hash and be able to compare them, and see whether substrings are available.
The compression that you described makes that much harder, unless all substrings are aligned to byte boundaries (i.e. blocks of four in the original string).
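To illustrate the alignment problem, here is a small sketch. The helper `pack4` and the 2-bit codes (A=0, G=1, T=2, C=3) are arbitrary choices of mine; any four-bases-per-byte packing behaves the same way.

```perl
use strict;
use warnings;

# Assumed encoding: 2 bits per base, four bases per byte.
my %bits = ( A => 0, G => 1, T => 2, C => 3 );

# Hypothetical helper; expects only A, G, T, C and a length that is
# a multiple of four.
sub pack4 {
    my ($seq) = @_;
    my $out = '';
    for my $block ( $seq =~ /(.{4})/g ) {
        my $byte = 0;
        $byte = $byte * 4 + $bits{$_} for split //, $block;
        $out .= chr $byte;
    }
    return $out;
}

my $seq = 'AGTCACAG';

# On the plain string, a substring test is a single index() call.
my $plain_hit = index( $seq, 'GTCA' ) >= 0;    # match at offset 1

# On the packed string the same substring is missed, because 'GTCA'
# straddles two bytes; the byte-aligned block 'ACAG' is still found.
my $packed        = pack4($seq);
my $unaligned_hit = index( $packed, pack4('GTCA') ) >= 0;
my $aligned_hit   = index( $packed, pack4('ACAG') ) >= 0;

print "plain: ",     ( $plain_hit     ? "found" : "missed" ), "\n";
print "unaligned: ", ( $unaligned_hit ? "found" : "missed" ), "\n";
print "aligned: ",   ( $aligned_hit   ? "found" : "missed" ), "\n";
```

So a search on the packed form only sees matches that start at offsets 0, 4, 8, ... of the original string; anything else would need the packed data shifted or unpacked first.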
Re^2: string to more compact format
by timray (Initiate) on May 17, 2010 at 17:30 UTC