I am working on a project which is running under a threaded mod_perl. All hits to the site go through the same script, much like PerlMonks with its index.pl. Each hit to this script needs to extract a certain amount of data from a file specified by the 'node' (to use the PerlMonks analogy) in order to display a completed page to the browser.
The way I had coded it at first was as follows (I just faked a regex for locating the needed extraction; the real one is quite a bit more complex):
    # <SNIP> - handle some pre-processing
    # $FILE contains a safe path to an existing file on
    # the filesystem, based on browser input.
    open( my $fh, '<', $FILE ) or die "open failed: $!";
    my $file = do { local $/; <$fh> };
    close( $fh );
    my ($needed) = $file =~ /\A<!--(.*?)-->/s;
    # now output the document, using the $needed info.
Then it struck me that this will be running under mod_perl and I thought about caching the needed extractions from the files into memory. So I rewrote it something like this:
    BEGIN { use vars qw(%NEEDED); }
    # <SNIP> as in first example.
    unless ( exists $NEEDED{$FILE} ) {
        open( my $fh, '<', $FILE ) or die "open failed: $!";
        my $file = do { local $/; <$fh> };
        close( $fh );
        ($NEEDED{$FILE}) = $file =~ /\A<!--(.*?)-->/s;
    }
    # now output the document, using the $NEEDED{$FILE} info.
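One wrinkle in the cached version is that it never invalidates: once a node is cached, edits to the file on disk are never picked up. A minimal sketch of one way around that, keying the cache on the file's mtime as well (the `needed_for` routine and the two-element cache entry layout are my own invention, not from the original code):

```perl
use strict;
use warnings;

our %NEEDED;    # cache: path => [ mtime when read, extracted text ]

# Return the extracted text for a node file, re-reading it only
# when the file's modification time has changed since the last read.
sub needed_for {
    my ($file) = @_;
    my $mtime = ( stat $file )[9];    # field 9 of stat is mtime
    my $hit   = $NEEDED{$file};
    unless ( $hit && $hit->[0] == $mtime ) {
        open( my $fh, '<', $file ) or die "open failed: $!";
        my $contents = do { local $/; <$fh> };
        close($fh);
        my ($needed) = $contents =~ /\A<!--(.*?)-->/s;
        $NEEDED{$file} = $hit = [ $mtime, $needed ];
    }
    return $hit->[1];
}
```

The extra `stat` per hit is far cheaper than the slurp, and it means an edited node shows up on the next request instead of whenever the process restarts.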
The files from which the required information is being extracted aren't all that large, so it seems to me that slurping the entire file contents on each hit wouldn't be too much of a burden. My basic question is whether such simple file I/O on not-too-large files would add up to many CPU cycles under heavy load. By caching, I am also hitting the HDD far less often. Again, is that enough to justify a caching mechanism? Am I worrying too much by caching the extracted pieces, or am I being smart?
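One way to answer the "is it worth it?" question empirically is to time both approaches with the core Benchmark module. A self-contained sketch (the file name `node.html` and its contents are made up for illustration):

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Create a small sample node file so the comparison is self-contained.
my $FILE = 'node.html';
open( my $out, '>', $FILE ) or die "open failed: $!";
print $out "<!--title: demo-->\n<p>some node body</p>\n";
close($out);

my %NEEDED;

cmpthese( 10_000, {
    # first approach: slurp and re-extract on every hit
    slurp_every_hit => sub {
        open( my $fh, '<', $FILE ) or die "open failed: $!";
        my $file = do { local $/; <$fh> };
        close($fh);
        my ($needed) = $file =~ /\A<!--(.*?)-->/s;
    },
    # second approach: extract once, then serve from the hash
    cached_in_hash => sub {
        unless ( exists $NEEDED{$FILE} ) {
            open( my $fh, '<', $FILE ) or die "open failed: $!";
            my $file = do { local $/; <$fh> };
            close($fh);
            ($NEEDED{$FILE}) = $file =~ /\A<!--(.*?)-->/s;
        }
        my $needed = $NEEDED{$FILE};
    },
} );

unlink $FILE;
```

With the OS buffer cache keeping a hot, small file in memory, the gap may be smaller than expected, but running this against your real node files and regex will give actual numbers instead of guesses.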
NB: I figure that a plain hash is good enough for caching in this case because it is a threaded mod_perl: there is only one Apache process, so the cache won't be duplicated per process. That said, I wonder whether using Cache::SharedMemoryCache would be wise, so there would be no additional burden should the site ever be moved to a non-threaded mod_perl.
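For reference, a sketch of what the same lookup might look like routed through Cache::SharedMemoryCache (from the CPAN Cache::Cache distribution), so the cache is shared across processes under a prefork mod_perl as well. The namespace, expiry time, and `needed_for` wrapper are made-up illustration, not anything from the original code:

```perl
use strict;
use warnings;
use Cache::SharedMemoryCache;    # CPAN; needs IPC::ShareLite underneath

# One shared cache for all processes; entries expire so stale
# extractions eventually get refreshed (600s is an arbitrary choice).
my $cache = Cache::SharedMemoryCache->new(
    {   namespace          => 'node_extracts',    # hypothetical namespace
        default_expires_in => 600,
    }
);

sub needed_for {
    my ($file) = @_;
    my $needed = $cache->get($file);
    unless ( defined $needed ) {
        open( my $fh, '<', $file ) or die "open failed: $!";
        my $contents = do { local $/; <$fh> };
        close($fh);
        ($needed) = $contents =~ /\A<!--(.*?)-->/s;
        $cache->set( $file, $needed );
    }
    return $needed;
}
```

Since the Cache::Cache modules share one get/set interface, swapping between Cache::MemoryCache and Cache::SharedMemoryCache later would only touch the constructor line.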
In reply to Repetitive File I/O vs. Memory Caching by Anonymous Monk