in reply to Re: Selecting one of two implementations
in thread Selecting one of two implementations
The underlying data are really files. Groups of files. The files do not contain the interesting data; the files are the data. Some could be hundreds of MB, others a couple of KB, and a single group can hold both.
To use Cache::FileCache, we would likely also need Archive::Tar: once to bundle each group into the single file that gets cached, and again to pull the files back out after retrieving it. We have lots of memory - but not that much. Since Archive::Tar loads data into memory, it's a bit cost-prohibitive.
Would you use Cache::*Cache modules for, say, CPAN? Perhaps - because there are only ever two files per item of interest (module distribution): the tarball and the checksum, meaning you always have to query the cache for exactly two files, and one's name is dependent on the other. I have an unknown number of files associated together, some of which are themselves tarballs (but, again, I'm not interested in the contents). Some of these files are related in such a way that a mostly simple regex could derive one name from the other. Others are not related by name, so in the first scenario every insert and retrieval must hardcode each name, while the second scenario keeps them together as a group.
In the first scenario, Cache::FileCache works nearly identically to the existing code (except that the existing code doesn't need to load 250MB tarballs into memory before writing them back out). The second scenario tries to handle this natural grouping inside the module, keeping the complexity there. The module takes filehandles as its data input and writes each one directly to a data store, again without loading the entire file into memory.
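To make the filehandle idea concrete, here is a minimal sketch of what such an insert could look like. It is not the actual module: `store_group`, the one-directory-per-group layout, and the 64 KB buffer size are all assumptions made for illustration. The point is only that each file is copied in fixed-size chunks, so a 250MB tarball never sits in memory in one piece.

```perl
use strict;
use warnings;
use File::Path qw(make_path);
use File::Spec;

# Hypothetical helper, not part of any real module: copy every filehandle in
# %$handles into a per-group directory under $store_dir, reading in
# fixed-size chunks so that no file is ever held in memory as a whole.
sub store_group {
    my ($store_dir, $group, $handles) = @_;

    # Assumed layout: one directory per group of related files.
    my $dir = File::Spec->catdir($store_dir, $group);
    make_path($dir);

    for my $name (sort keys %$handles) {
        my $in   = $handles->{$name};
        my $path = File::Spec->catfile($dir, $name);

        open my $out, '>', $path or die "Cannot write $path: $!";
        binmode($in);
        binmode($out);

        my $buf;
        while (1) {
            # Read at most 64 KB at a time (arbitrary size for this sketch).
            my $n = read($in, $buf, 64 * 1024);
            die "Read failed for $name: $!" unless defined $n;
            last if $n == 0;    # end of file
            print {$out} $buf or die "Write failed for $path: $!";
        }
        close $out or die "Cannot close $path: $!";
    }
    return $dir;
}
```

A caller would open each file in the group and pass the handles in one call, something like `store_group($store, $group_id, { 'foo.tar.gz' => $fh1, 'notes.txt' => $fh2 })`, so the names only have to be listed once at insert time.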
Re^3: Selecting one of two implementations
by merlyn (Sage) on Apr 25, 2005 at 15:29 UTC
by Tanktalus (Canon) on Apr 25, 2005 at 15:43 UTC