quickest way to access cached data?

by Anonymous Monk
on May 13, 2004 at 14:26 UTC ( [id://353071] )

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi all,

I have a quick question about the best way to cache a bunch of data for a program I'm making. The data to be cached is roughly 15 KB per entry (a web page). I am not sure yet how long it will need to be cached. Maybe one day, maybe one month. There will be at least 1000 entries per day.

What I am wondering is whether it will be faster in the long run to store the data in a MySQL table and query it with DBI each time I need it, or to keep it in regular files on the server and slurp them in each time. Which takes longer for Perl? And what about checking for the existence of a cached entry? Is it quicker to check for a file with -e or to query the database only to find it empty? What if there are x number of users searching at once?

I have no idea why I think this, but for some reason it seems that MySQL would be faster unless I find I have to cache the data longer than expected and the table just gets huge... in that case, would it be faster to store it all in files, or would proper indexing on the table be fine?
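
For concreteness, here is a minimal sketch of the two approaches being weighed; the cache directory, table name, column names and connection details are all invented for illustration:

    use strict;
    use DBI;

    my $key = 'search-results-p1';                 # hypothetical cache key

    # --- file-based cache: existence check with -e, then slurp ---
    my $cache_dir = '/var/cache/myapp';            # assumed location
    my $file      = "$cache_dir/$key.html";
    my $page;
    if ( -e $file ) {
        open my $fh, '<', $file or die "open $file: $!";
        local $/;                                  # slurp mode
        $page = <$fh>;
        close $fh;
    }

    # --- MySQL-based cache: one indexed SELECT through DBI ---
    my $dbh = DBI->connect( 'dbi:mysql:mydb', 'user', 'pass',
                            { RaiseError => 1 } );
    my ($content) = $dbh->selectrow_array(
        'SELECT content FROM page_cache WHERE cache_key = ?',
        undef, $key );
    $page = $content if defined $content;          # undef means "not cached yet"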

Replies are listed 'Best First'.
Re: quickest way to access cached data?
by Corion (Patriarch) on May 13, 2004 at 14:37 UTC

    In short, there is nothing faster than RAM, so you're best off writing a small C http server that loads all HTML pages into RAM and serves them from there.

    If you don't want to write a small HTTP server yourself, you can use Apache and a ramdisk to serve the files from.

    If that still is not possible, maybe because not enough RAM is available on your machine (which is unlikely, as even the x86 architecture can easily address 2 GB of RAM for storage), you can leave the caching at the file level to the OS and simply serve plain files.

    Only at this point does MySQL possibly get a foot in the door, and even then MySQL has to do exactly the same things the OS has to do to serve pages. Dynamically creating a page will almost always be slower than piping the data from RAM to the network card, and slower than piping it from disk as well.

    If you think that you need to recreate data more dynamically than nightly in a cron job, you can consider Apache and an ErrorDocument directive to create "missing" (that is, uncached) pages, and weed out "old" pages with find or File::Find every hour.
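
    A rough sketch of that hourly weeding with File::Find; the cache directory and the one-hour cutoff are assumptions:

    use strict;
    use File::Find;

    my $cache_dir = '/var/www/cache';   # assumed location of the cached pages
    my $max_age   = 1 / 24;             # -M is measured in days, so 1/24 is one hour

    # delete cached files older than the cutoff; run this from cron every hour
    find( sub { unlink $_ if -f $_ && -M $_ > $max_age }, $cache_dir );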

    A fully dynamic database driven solution will most likely be the slowest solution possible, as it has the drawback of needing to go through the DB and the filesystem on every page served.

    Of course, until we know the exact usage patterns and possibly the page sequences, all of this has no meaning. You need to benchmark all solutions to see whether your actual access patterns favour one of the solutions over another.

    Personally, I like serving static HTML, as it has the fewest security risks and backups, failover and bringing online a new version of the site are all easily done with the standard shell toolset. Site updates can be made atomic by accessing the document root via a symlink, so a site update means simply moving the symlink.
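
    For example, the atomic document-root switch could be as small as this; the paths are invented, and rename() over an existing symlink is atomic on POSIX filesystems:

    use strict;

    # build the new site in a versioned directory, then swap the symlink
    my $new_release = '/var/www/site-2004-05-14';   # hypothetical release directory
    my $docroot     = '/var/www/htdocs';            # Apache's DocumentRoot points here

    symlink $new_release, "$docroot.tmp" or die "symlink: $!";
    rename "$docroot.tmp", $docroot      or die "rename: $!";   # atomic switch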

      Thanks, I never even thought about looking to cache it in RAM. It would be interesting to look into (and a first for me).

      Basically, I am caching the results of search queries. Once the person sees page 1, it can be cached because it won't change that quickly. The first visit to page 2 will of course have to be dynamic, but then it can be cached, etc... so if they navigate backwards it should come from the cache.

      People do navigate back quite often as this requires a fair amount of browsing and comparing between pages.

        Have a look at memcached:

        memcached is a high-performance, distributed memory object caching system, generic in nature, but intended for use in speeding up dynamic web applications by alleviating database load.
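
        A minimal sketch with the Cache::Memcached client module; the server address, key and build_page() helper are made up for illustration:

        use strict;
        use Cache::Memcached;

        my $memd = Cache::Memcached->new( { servers => ['127.0.0.1:11211'] } );

        my $key  = 'search:widgets:page1';      # hypothetical cache key
        my $page = $memd->get($key);
        unless ( defined $page ) {
            $page = build_page();               # stand-in for your page generation
            $memd->set( $key, $page, 3600 );    # expire after one hour
        }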

        Ciao, Valerio

Re: quickest way to access cached data?
by eXile (Priest) on May 13, 2004 at 15:22 UTC
    Hi,

    The previously mentioned Cache::Cache seems a good solution to me, especially if you don't yet know precisely what and how you want to cache. Cache::Cache has several backends (FileCache, MemoryCache, SharedMemoryCache), and expiry of cached objects can be set globally or on a per-object basis. With these features you can experiment until you find the right solution for your caching. MemoryCache is normally the fastest of these backends, but at the expense of a lot of memory (duh).
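
    For example, swapping backends is just a matter of changing the constructor. A minimal sketch with Cache::FileCache, where the namespace, expiry values, key and page content are all arbitrary placeholders:

    use strict;
    use Cache::FileCache;    # or Cache::MemoryCache / Cache::SharedMemoryCache

    my $key  = 'search:widgets:page1';   # hypothetical cache key
    my $html = '<html>...</html>';       # the rendered page you want to cache

    my $cache = Cache::FileCache->new( {
        namespace          => 'search_results',   # hypothetical namespace
        default_expires_in => '1 day',            # global default expiry
    } );

    $cache->set( $key, $html, '10 minutes' );     # per-object expiry overrides the default
    my $cached = $cache->get($key);               # undef if missing or expired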

Re: quickest way to access cached data?
by perrin (Chancellor) on May 13, 2004 at 15:24 UTC
Re: quickest way to access cached data?
by valdez (Monsignor) on May 13, 2004 at 14:42 UTC

    MySQL uses files in its backend, so accessing files directly will always be faster; please note that access performance to files may degrade on some file systems in the presence of a large number of files stored in the same directory (see the approach used by Cache::Cache); MySQL will help you centralize your cache and make it available to many servers. So now, what is your Perl question? :)
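
    A sketch of that approach: two levels of subdirectories derived from a digest of the key, similar in spirit to what Cache::Cache does. The paths and key are invented:

    use strict;
    use Digest::MD5 qw(md5_hex);
    use File::Path  qw(mkpath);

    my $cache_root = '/var/cache/pages';       # assumed cache root
    my $key        = 'search:widgets:page1';   # hypothetical cache key

    my $digest = md5_hex($key);
    my $dir    = join '/', $cache_root, substr( $digest, 0, 2 ), substr( $digest, 2, 2 );
    mkpath($dir) unless -d $dir;               # e.g. /var/cache/pages/3a/7f
    my $file   = "$dir/$digest.html";          # at most 256 subdirectories per level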

    Ciao, Valerio

Re: quickest way to access cached data?
by ambrus (Abbot) on May 13, 2004 at 17:53 UTC

    A file system-based solution is good. You just have to take care that there aren't too many files in the same directory, in which case lookups would get slow. (You don't have to worry about that if you use a newer filesystem like ReiserFS, but that has disadvantages too.) The operating system will also cache some of the disk data in memory, but only for a short time.

    A database has more advantages if the records (in this case the web pages) are smaller (it costs less disk space), or if you have to perform more complicated operations on them than just finding one by name, which you cannot do with a filesystem.

Re: quickest way to access cached data?
by Ryszard (Priest) on May 14, 2004 at 14:33 UTC
    At the risk of coming in a little late, I'm pretty much doing the exact same thing you want, with the addition of also using a relational backend.

    It goes something like this:

    1. Check the cache for the information
    2. If it doesn't exist in the cache, get it from the database and put it in the cache
    3. If it does, get it from the cache and serve it

    The only thing you have to worry about is tuning the expiry time of your cache for optimal performance. I like the previously mentioned idea of serving everything up from a RAM disk; this will also increase your performance.

    If you've built your site to generate the pages dynamically, it really is only about 6 extra lines of code to cache it all:

    use Cache::FileCache;

    my $cache    = Cache::FileCache->new();
    my $retcache = $cache->get('tvgid');
    if ( !defined $retcache ) {
        # Build your page
        # cache the content, with an expiry time
        $cache->set( 'tvgid', $page, "5 minutes" );
    }
    else {
        # Return your page (the cached copy in $retcache)
    }

    Too easy.. :-)

    I infer from the OP that performance is a concern for you; keep in mind there are many, many ways to optimise your code, from regex fiddling to algorithm design to OS tuning to building your own webserver to application design.

    Make sure you benchmark your code as well as real-world response time, so you can quantify your "optimisations".
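
    One rough way to do that with the core Benchmark module; file_cache() and mysql_cache() are empty placeholders you would fill in with your real slurp and DBI code paths:

    use strict;
    use Benchmark qw(cmpthese);

    sub file_cache  { }   # fill in: -e check plus slurp of the cached file
    sub mysql_cache { }   # fill in: DBI SELECT of the same entry

    # run each variant 1000 times and print a comparison table
    cmpthese( 1000, {
        file_cache => \&file_cache,
        mysql      => \&mysql_cache,
    } );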
