in reply to Creating a CGI Caching Proxy

I wrote an app similar (actually nearly identical) to this a while back. The approach I took was to have my script check whether the item was in the cache and, if not, fork off a process to download the file. The main script then reads the cache file and feeds it to the Web browser as it comes in. I needed some kind of marker to tell whether the file was fully or partially downloaded; I think I used the executable bit and file locks for this.

This solves several problems. First, there's very little latency, since the browser starts getting data as soon as the first bytes land in the cache file. Second, it handles two browsers requesting the same file at once: the second request simply streams from the partially written cache file instead of kicking off a duplicate download. Third, it handles the user pressing stop, because the forked download process keeps running after the CGI script dies, so the cache entry still gets completed.

Here's some pseudocode, which is probably clearer:

use FileHandle;
use Fcntl qw(:flock);

my $url  = $cgi->param('url');
my $file = url2file($url);
my $fh   = FileHandle->new("< $file");

if ($fh) {
    # No executable bit yet means it's a partial download
    # (the download process sets +x when it finishes)
    unless (-x $fh) {
        # If we can grab the lock, the download process has died,
        # so restart the download ourselves
        if (flock($fh, LOCK_EX | LOCK_NB)) {
            $fh->close;    # release the probe lock before re-locking in get()
            $fh = get($file, $url) or die "Couldn't get URL!\n";
        }
    }
    stream($fh);
}
else {
    $fh = get($file, $url) or die "Couldn't get URL!\n";
    stream($fh);
}

sub get {
    # Open the file for read and write,
    # take an exclusive lock on the filehandle,
    # and fork off a process to do the download.
    # Child process: download the URL,
    # then set the +x bit when done.
    # Return a dup of the filehandle.
}

sub stream {
    # Keep streaming data from the filehandle until
    # the executable bit is set. Works pretty much like
    # tail -f.
}
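
For the curious, here's a rough sketch of how get() and stream() might actually be written. This isn't my original code; the LWP::Simple getstore() call, the 0755 mode, and the one-second poll are stand-ins for whatever you'd really use, and error handling is mostly omitted.

use FileHandle;
use Fcntl qw(:flock :seek);
use LWP::Simple qw(getstore);

sub get {
    my ($file, $url) = @_;

    # Create the cache file and take the exclusive lock *before*
    # forking, so nobody can mistake us for a dead downloader.
    my $wfh = FileHandle->new("+> $file") or return;
    flock($wfh, LOCK_EX) or return;

    my $pid = fork();
    return unless defined $pid;

    if ($pid == 0) {
        # Child: fetch the URL into the cache file, then set the
        # +x bit to say "done". The +x bit is already visible by
        # the time our exit releases the lock.
        getstore($url, $file);
        chmod 0755, $file;
        exit 0;
    }

    # Parent: hand back a fresh read-only handle on the same file.
    # The child still holds the flock through its inherited copy of
    # $wfh, so the lock drops exactly when the downloader exits.
    return FileHandle->new("< $file");
}

sub stream {
    my ($fh) = @_;
    local $| = 1;    # don't buffer output to the browser
    my ($buf, $n);

    while (1) {
        # Check the "done" marker *before* draining, so we never
        # quit with unread bytes still sitting in the file.
        my $done = -x $fh;
        while ($n = read($fh, $buf, 4096)) {
            print $buf;
        }
        last if $done;
        sleep 1;                  # partial download; wait for more data
        seek($fh, 0, SEEK_CUR);   # clear the EOF flag, a la tail -f
    }
}

The nice property here is that the flock probe in the main script distinguishes "download in progress" (lock held, no +x) from "downloader died" (no lock, no +x) without any extra bookkeeping files.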