PerlMonks  

CGI Caching question

by chorg (Monk)
on Apr 22, 2001 at 06:00 UTC

chorg has asked for the wisdom of the Perl Monks concerning the following question:

I'm writing some dynamic CGI content. It thus follows that I don't want the pages cached by anything: not the user, not proxy servers, not anything. So I need to add some headers to my script. Of course I have seen this time and time again, but of course, my brain did not cache the data. :) I do remember that there is more than one, though...

What would be the complete set of headers needed to totally stop caching?
_______________________________________________
"Intelligence is a tool used achieve goals, however goals are not always chosen wisely..."

Replies are listed 'Best First'.
Re: CGI Caching question
by Masem (Monsignor) on Apr 22, 2001 at 06:13 UTC
    According to the standard, your HTTP headers should include an Expires line that points to a time in the past to prevent any caching. Fortunately, if you are using CGI.pm, adding -expires=>'-1m' in the header() function will do this without problems. Be warned, of course, that the browser must respect these headers for them to work, and a homegrown browser or someone using something like LWP doesn't have to pay attention to said Expires. You may want to use session ids to track the user and prevent them from revisiting any part of your script that you are trying to keep from being cached.


    Dr. Michael K. Neylon - mneylon-pm@masemware.com || "You've left the lens cap of your mind on again, Pinky" - The Brain
      To provide redundancy, use the <META> tag in your HTML. Check here for info:
      http://www.htmlhelp.com/reference/wilbur/head/meta.html
      Will that apply to proxy servers as well? I want to prevent caching on all levels...
      _______________________________________________
      "Intelligence is a tool used achieve goals, however goals are not always chosen wisely..."
        Unfortunately, you can't guarantee that at all; the Expires header is defined in the standard, and therefore any proxy should follow it, but as with users and homegrown browsers, they don't have to. I know from recent discussions on my ISP's newsgroups about the possibility of installing a proxy that most proxies used at large-scale sites are home grown, and they typically have had much trouble with the 80% of web sites out there that *don't* follow the standard.

        Again, falling back on a sessionid and other tracer should help prevent problems from cached page use.


        Dr. Michael K. Neylon - mneylon-pm@masemware.com || "You've left the lens cap of your mind on again, Pinky" - The Brain
        I want to prevent caching on all levels...

        Remember, you can't prevent anything. This is the #1 thing that web creators seem unable to understand: Once that page leaves your server, it's outta your hands. You don't know who's going to render it, or cache it, or index it, or whatever.

        Basically, my point is, the cache things are suggestions at best. Nobody enforces them, and Lord only knows what mutant and/or broken browsers might be out there.

        xoxo,
        Andy

        # Andy Lester  http://www.petdance.com  AIM:petdance
        %_=split';','.; Perl ;@;st a;m;ker;p;not;o;hac;t;her;y;ju';
        print map $_{$_}, split //,
        'andy@petdance.com'
        
Re: CGI Caching question
by stephen (Priest) on Apr 22, 2001 at 06:55 UTC
    I think that one of the other headers you were thinking of was "Pragma: no-cache". You can generate that one with
    use CGI qw(:standard);
    print header(-pragma => 'no-cache');
    For more information, you can check out the HTTP specs for 'pragma'.

    stephen

      I've had better luck with "Cache-control: no-cache". It's the HTTP/1.1 version of "Pragma: no-cache". All of the caching proxies I know of support HTTP/1.1.

      If you're truly paranoid, you can use both.
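
      A minimal sketch of "using both" together with the Expires suggestion from earlier in the thread. It assumes CGI.pm is installed (it has not shipped with core Perl since 5.22). The helper name no_cache_header is made up for illustration, and the -cache_control spelling relies on CGI.pm turning unrecognized named parameters into header lines, with underscores converted to hyphens.

```perl
use strict;
use warnings;
use CGI ();

# Hypothetical helper: build one header block carrying all three hints.
sub no_cache_header {
    return CGI::header(
        -type          => 'text/html',
        -expires       => '-1m',        # one minute in the past
        -pragma        => 'no-cache',   # HTTP/1.0 clients and proxies
        -cache_control => 'no-cache',   # HTTP/1.1 clients and proxies
    );
}

# The header block must be printed before any page content.
print no_cache_header(), "<p>Fresh content goes here.</p>\n";
```

      As the replies above point out, these headers are still only requests; a client or proxy that ignores them will cache anyway.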

Re: CGI Caching question
by koolade (Pilgrim) on Apr 22, 2001 at 06:51 UTC

    In addition to using the correct HTTP headers, I've had some success in using unique IDs appended to the script URL as the path info. e.g.: http://domain/script.pl/UNIQUE_ID

      koolade's suggestion echoes an approach that has worked well for me. Here's a bit of (slightly verbose) code to illustrate:
      sub make_rand_str {
          my $num_chars = shift;
          my @possible_chr_array = @_;
          my $num_possibles = @possible_chr_array;
          my $rnd_str = "";
          for (my $i = $num_chars; $i--;) {
              $rnd_str .= $possible_chr_array[rand($num_possibles)];
          }
          return $rnd_str;
      }

      my $rnd_str = make_rand_str(4, "A".."Z", "0".."9");
      my $url = "http://my/host/cgi/script.pl/$rnd_str";
        Aside from the no-cache pragma, which I find works for 98%+ of folks out there, I find the simplest and easiest way to create a unique string is just a call to time(); e.g.,
        $url = '/cgi-bin/script.pl?time=' . time();
        
        Using both of these things in combination results in 99.9% effectiveness in preventing caching. (Estimates based on a scientific survey created by top scientists which was just recently pulled out of my *.)
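
        The two ideas in this sub-thread (a timestamp and a short random string) can be combined so that two links generated within the same second still differ. This is only a sketch: the function name cache_busting_url and the query parameter name r are invented here, and the script path is the placeholder used above.

```perl
use strict;
use warnings;

# Hypothetical helper: append the current time plus a short random token
# so that each generated link is, in practice, a URL no cache has seen.
sub cache_busting_url {
    my ($base) = @_;
    my @chars = ('A' .. 'Z', '0' .. '9');
    my $token = join '', map { $chars[rand @chars] } 1 .. 4;
    return sprintf('%s?time=%d&r=%s', $base, time(), $token);
}

print cache_busting_url('/cgi-bin/script.pl'), "\n";
```

        The token is only meant to make the URL unique; rand() is not suitable for anything security related such as session ids.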
(dws) Re: CGI Caching question
by dws (Chancellor) on Apr 22, 2001 at 21:33 UTC
    Just so this gets said...

    If you use a "force content expiration" scheme that relies only on using a unique ID in the URL, you run a risk: If there is a caching proxy between the web server and some browser, that proxy's cache is going to hang on to otherwise expired content (i.e., prior unique IDs and their responses), and will spill valid content sooner, thus reducing the effectiveness of the cache.

    To be friendly to caching proxy servers, use "Cache-control: no-cache" as part of your scheme.

    By the way, caching proxy servers are starting to crop up all over. I'm seeing a lot of them near corporate firewalls, as a means for the corporation to better utilize bandwidth.

Re: CGI Caching question
by RatArsed (Monk) on Jun 27, 2001 at 13:17 UTC
    I've read through the entire thread, and I don't think anyone mentioned the "Cache-Control" header. The most bulletproof solution you'll get would be the following chunk of headers:
    Cache-control: no-cache
    Pragma: no-cache
    Expires: (time)
    (The time should be expressed in one of the formats mentioned in section 3.3.1 of RFC 2068.)

    This should catch all variants of agents and intermediaries, however, I expect there are still a few old/badly behaved agents out there...

    --
    RatArsed
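
    For scripts that print their headers by hand instead of going through CGI.pm, the three headers above might be emitted as in the following sketch. The helper names http_date and no_cache_headers are invented for illustration. The date is built in the fixed-length RFC 1123 style, which is the first (preferred) format listed in section 3.3.1 of RFC 2068, and the Expires value is simply set one minute in the past.

```perl
use strict;
use warnings;

# Format an epoch time in the RFC 1123 style,
# for example "Sun, 06 Nov 1994 08:49:37 GMT".
sub http_date {
    my ($epoch) = @_;
    my @days = qw(Sun Mon Tue Wed Thu Fri Sat);
    my @mons = qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec);
    my ($sec, $min, $hour, $mday, $mon, $year, $wday) = gmtime($epoch);
    return sprintf('%s, %02d %s %04d %02d:%02d:%02d GMT',
        $days[$wday], $mday, $mons[$mon], $year + 1900, $hour, $min, $sec);
}

# Build the header block: content type plus the three anti-caching
# headers, terminated by the blank line that ends the headers.
sub no_cache_headers {
    my ($now) = @_;
    return join("\r\n",
        'Content-Type: text/html',
        'Cache-Control: no-cache',
        'Pragma: no-cache',
        'Expires: ' . http_date($now - 60),   # already expired
    ) . "\r\n\r\n";
}

print no_cache_headers(time()), "<p>Fresh content goes here.</p>\n";
```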
