souravdas has asked for the wisdom of the Perl Monks concerning the following question:

my %cache; sub first_line { my $filename = shift; return $cache{$filename} if exists $cache{$filename} open FILE,$filename or return ""; my $line=<FILE>; $cache{$filename}=$line; return $line; }
Is this code correct? If so, how do we execute this program and how would we know that this code is doing caching activity?

janitored by ybiC: Retitle from less-than-descriptive "Hello Monks!!!!" to improve site search results

Replies are listed 'Best First'.
Re: Cache Subroutine Return Value
by dragonchild (Archbishop) on Aug 11, 2004 at 14:46 UTC
    The code looks correct. On a stylistic note, I would write it slightly differently.
    my %cache; sub first_line { my ($filename) = @_; unless (exists $cache{$filename}) { open( my $fh, $filename ) or die "Cannot open '$filename': $!\ +n"; $cache{$filename} = <$fh>; } return $cache{$filename}; }

    Instead of checking for the existence of the cached value, I check for the non-existence of the cached value. If it doesn't exist, I populate it. That way, I always return the cached value and the logic is slightly easier to work with.

    On a sidenote, I would look at Memoize. It's a module that's built to do exactly what you're trying to do.

    Update: Changed return to die as per jdporter's comment that reading an empty file gives back undef.

    ------
    We are the carpenters and bricklayers of the Information Age.

    Then there are Damian modules.... *sigh* ... that's not about being less-lazy -- that's about being on some really good drugs -- you know, there is no spoon. - flyingmoose

    I shouldn't have to say this, but any code, unless otherwise stated, is untested

      I would make the following additional change: If the open() fails, throw an exception (die), rather than returning a value which is indistinguishable from a valid first line. (An empty string is what you'd get if the file contains 0 bytes.)
        Huh. Never knew that. It actually returns undef, not the empty string, which is what return; will send back to the caller. Updated.

        ------
        We are the carpenters and bricklayers of the Information Age.

        Then there are Damian modules.... *sigh* ... that's not about being less-lazy -- that's about being on some really good drugs -- you know, there is no spoon. - flyingmoose

        I shouldn't have to say this, but any code, unless otherwise stated, is untested

Re: Cache Subroutine Return Value
by Roger (Parson) on Aug 11, 2004 at 16:02 UTC
    I'd say that the code is not correct (missing a ; on the if exists line) and not clean enough because it leaves a lot of open file handles. Although that might be correct if you intend to keep the file open. When you have too many files in your cache, you will eventually run out of file handles and your script will fail. Throw in a close FILE before the return to free up the file handle.

    I would recommend using the IO::File module whenever doing file IO.

    my %cache; sub first_line { my $filename = shift; return $cache{$filename} if exists $cache{$filename}; my $f = new IO::File $filename, "r" or die "file not found"; my $line = <$f>; $cache{$filename} = $line; return $line; }

    Note that I did not explicitly close the file handle, because when the sub finishes, the object $f will get out of scope and automatically disposed of, which invokes the destructor of the file object, which closes the file implicitly. This could be quite handy when working with files, because you don't have to track the files that you have openned and you don't have to close them explicitly.

Re: Cache Subroutine Return Value
by graff (Chancellor) on Aug 12, 2004 at 01:53 UTC
    ... how would we know that this code is doing caching activity?

    I've seen some cases where people take the trouble to provide a report on caching effectiveness -- basically, you want to know how many times "first_line" was called, and of those times, how often you hit on a cached value.

    That sort of thing is easy to add -- here's how it might look, adding to dragonchild's proposal, which is very clear and clean:

    my %cache; my ($ncalls, $nhits); sub first_line { my ($filename) = @_; $ncalls++; if (exists $cache{$filename}) { $nhits++; } else { open( my $fh, $filename ) or die "Cannot open '$filename': $!\ +n"; $cache{$filename} = <$fh>; } return $cache{$filename}; } END { print "first_line returned cached value $nhits times in $ncalls +calls\n" }
    Usually you'd want the report to be conditional on a run-time option -- e.g. the user includes a "-v" or "-d" flag when running the script from a command line to select "verbose" or "debug" mode.
Re: Cache Subroutine Return Value
by adrianh (Chancellor) on Aug 12, 2004 at 09:35 UTC