I don't think such a module exists. To start reading data from an arbitrary position you have to decode the file from the start, so to seek backward you would either have to keep all of the decoded data up to the current position, which would require a lot of memory, or re-decode the file from the beginning for every backward seek request, which would require a lot of CPU cycles.
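For what it's worth, here is a minimal sketch (untested; the helper name and file name are made up) of the second option: every "backward" seek simply reopens the file and reads forward, discarding data, until the target offset in the uncompressed stream is reached:

    use strict;
    use warnings;
    use Compress::Zlib;

    # Emulate a backward seek: reopen the .gz file and read (and discard)
    # data up to $target_offset, measured in the uncompressed stream.
    sub gz_seek_from_start {
        my ($filename, $target_offset) = @_;

        my $gz = gzopen($filename, "rb")
            or die "Cannot open $filename: $gzerrno\n";

        my $remaining = $target_offset;
        my $buffer;
        while ($remaining > 0) {
            my $chunk = $remaining > 4096 ? 4096 : $remaining;
            my $read  = $gz->gzread($buffer, $chunk);
            die "Read error: " . $gz->gzerror . "\n" if $read < 0;
            last if $read == 0;    # hit EOF before reaching the offset
            $remaining -= $read;
        }
        return $gz;    # handle positioned at (or near) $target_offset
    }

    # Example (hypothetical file name):
    my $gz = gz_seek_from_start("some_big_file.gz", 1_000_000);

Cheap to write, but as noted above it burns CPU re-decompressing everything before the target on every call.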
Since the documentation says that the module doesn't support backward seeks, it seems that you have provided the answer yourself.
Assuming that you can only seek forward, it's an interesting problem how to search in an efficient manner. For starters, it's probably a good idea to check whether Compress::Zlib does in fact really seek, or whether it is just providing an abstraction that pretends to seek while it decompresses everything under the hood. In the latter case, you're most likely better off doing a linear search.
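One crude way to check that (a sketch only; the file name, offset, and buffer size are assumptions, and it relies on your version of Compress::Zlib providing gzseek at all) is to time a forward gzseek against an explicit read-and-discard loop over the same distance. If the two take comparable time, the "seek" is almost certainly decompressing everything underneath:

    use strict;
    use warnings;
    use Compress::Zlib;
    use Fcntl qw(SEEK_SET);
    use Time::HiRes qw(gettimeofday tv_interval);

    my $file   = "big_file.gz";    # assumed test file
    my $offset = 50_000_000;       # uncompressed offset to "seek" to

    # Time a forward gzseek.
    my $gz = gzopen($file, "rb") or die "Cannot open $file: $gzerrno\n";
    my $t0 = [gettimeofday];
    $gz->gzseek($offset, SEEK_SET);
    printf "gzseek:           %.3fs\n", tv_interval($t0);
    $gz->gzclose;

    # Time an explicit read-and-discard loop over the same distance.
    $gz = gzopen($file, "rb") or die "Cannot open $file: $gzerrno\n";
    $t0 = [gettimeofday];
    my ($buffer, $left) = (undef, $offset);
    while ($left > 0) {
        my $read = $gz->gzread($buffer, $left > 65536 ? 65536 : $left);
        last if $read <= 0;
        $left -= $read;
    }
    printf "read-and-discard: %.3fs\n", tv_interval($t0);
    $gz->gzclose;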
maybe you should:
1. ask the author
2. search for another module
AFAIK, zlib itself only supports backward seeks in read-only mode (see the zlib manual), so I don't know why Compress::Zlib doesn't support backward seeks at all.
Assuming you are only searching on a small part of each record, you could create a key file and store the keys and their offsets, much as is done in a database.
If not, then maybe you could keep a subset of keys, say 1 out of every 100; then you could binary search to the key equal to or less than the one you need and seek forward through the 100 records in between... a reasonable trade-off.
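Here is a rough sketch of that second idea, with plenty of assumptions: the records are sorted, one tab-separated key/value pair per line, the file is called data.gz, the sparse index lives in memory rather than in a key file, and your Compress::Zlib provides a (forward-only) gzseek:

    use strict;
    use warnings;
    use Compress::Zlib;
    use Fcntl qw(SEEK_SET);

    my $file  = "data.gz";   # assumed: sorted, one "key<TAB>value" per line
    my $every = 100;         # keep one key out of every 100 records

    # Pass 1: build a sparse index of [key, uncompressed offset] pairs.
    my @index;
    {
        my $gz = gzopen($file, "rb") or die "Cannot open $file: $gzerrno\n";
        my ($line, $n, $offset) = ("", 0, 0);
        while ($gz->gzreadline($line) > 0) {
            if ($n % $every == 0) {
                chomp(my $rec = $line);
                my ($key) = split /\t/, $rec;
                push @index, [$key, $offset];
            }
            $offset = $gz->gztell;   # start of the next record
            $n++;
        }
        $gz->gzclose;
    }

    # Lookup: binary search the index for the last key <= target,
    # then seek forward and scan at most $every records.
    sub lookup {
        my ($target) = @_;
        return undef unless @index;

        my ($lo, $hi) = (0, $#index);
        while ($lo < $hi) {
            my $mid = int(($lo + $hi + 1) / 2);
            if ($index[$mid][0] le $target) { $lo = $mid } else { $hi = $mid - 1 }
        }

        my $gz = gzopen($file, "rb") or die "Cannot open $file: $gzerrno\n";
        $gz->gzseek($index[$lo][1], SEEK_SET);   # forward seek only
        my $line;
        while ($gz->gzreadline($line) > 0) {
            chomp $line;
            my ($key, $value) = split /\t/, $line, 2;
            if ($key eq $target) { $gz->gzclose; return $value }
            last if $key gt $target;   # keys are sorted, we have passed it
        }
        $gz->gzclose;
        return undef;
    }

If gzseek isn't available in your version, the read-and-discard loop from the earlier reply does the same job for the forward part.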
- Ant
- Some of my best work - (1 2 3)
I think that there is a problem with "seeking" in general on compressed files. Very few compression formats have the property that you can just "land" in a random place in the file and start reading. One problem is usually the ability to "re-sync". A .WAV file can't do this: if you have a huge .WAV file and somehow get "lost" in the middle of it, there is no way to recover and find the next segment... or at least none that I know of. Some types of linear data formats have "garbage" that you have to read through, but that "garbage" is actually a sync point.