
That all three machines are affected in the same way leads me to think that it's my ISP that's throwing the spanner into the works. (My next step will probably be to see what my ISP thinks of that hypothesis :-)

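That sounds like the obvious conclusion to me, too. You could always check whether it is a caching issue (either at the ISP or elsewhere) by appending a query string to the URL. A cache keyed on the full URL will treat that as a different resource and fetch it afresh. Or indeed by using an https URL if one is available for that resource, as an intermediate cache cannot normally see inside encrypted traffic.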
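A minimal sketch of the query-string check, in Python for illustration. The helper names `cache_busted` and `sha256_of` and the parameter name `nocache` are made up here; any otherwise unused parameter name should do.

```python
import hashlib
import secrets
import urllib.request
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def cache_busted(url, param="nocache"):
    """Return url with a random query parameter appended.

    The parameter name is arbitrary (an assumption for this sketch).
    Existing query parameters are kept.
    """
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append((param, secrets.token_hex(8)))
    return urlunsplit(parts._replace(query=urlencode(query)))


def sha256_of(url):
    """Download url and return the SHA-256 hex digest of the body."""
    with urllib.request.urlopen(url) as response:
        return hashlib.sha256(response.read()).hexdigest()
```

If `sha256_of(url)` and `sha256_of(cache_busted(url))` differ for a static file, something between you and the origin server is probably serving a stale copy of the plain URL. Identical digests prove little, since a cache is free to ignore the query string.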

Good luck with your investigations.

Re^4: [OT] HTTP downloads and caching
by syphilis (Archbishop) on Oct 08, 2015 at 03:29 UTC
    You could always check whether it is a caching issue (either at the ISP or elsewhere) by appending a query string to the URL

    I don't think I would *ever* have thought of that. What good thinking !!

    So ... when I append a (random) query string to the URL I get the "new" file, but when I omit the query string I get the "old" file.
    This surely demonstrates that it's a caching issue, but is there a way for me to pinpoint the location of this cache ?

    Afterthought: If wget is accessing a cache on the local linux machine, where would that cache be located ?

    Cheers,
    Rob

      Caching is a good possibility worth investigating, but the query string may or may not affect a server-side or middleware/service cache; their heuristics for what constitutes a new or different request are up to them.

      is there a way for me to pinpoint the location of this cache ?

      Not if it's outside your control. If you run your favourite packet sniffer at the border of your network (i.e. as far up the chain as you have access) that should show whether the cache is outside or not.
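Short of a packet sniffer, the response headers sometimes carry hints about an intermediary. This is only a sketch in Python: `Age` and `Via` are standard HTTP headers, `X-Cache` and `X-Cache-Lookup` are non-standard ones that some proxies add, and a cache is under no obligation to send any of them. The names `CACHE_HINT_HEADERS` and `cache_hints` are invented for the example.

```python
# Headers that may (or may not) reveal a cache on the path.
CACHE_HINT_HEADERS = (
    "age",             # seconds the response has sat in a cache
    "via",             # proxies the response passed through
    "x-cache",         # non-standard HIT/MISS marker used by some proxies
    "x-cache-lookup",  # non-standard, likewise
    "cache-control",
    "expires",
    "last-modified",
    "etag",
)


def cache_hints(headers):
    """Pick the cache-related entries out of a mapping of response headers.

    Header names are matched case-insensitively; the result uses
    lower-case names. An empty result does not rule out a cache.
    """
    lowered = {name.lower(): value for name, value in headers.items()}
    return {name: lowered[name] for name in CACHE_HINT_HEADERS if name in lowered}
```

With `urllib.request.urlopen(url) as response`, passing `response.headers` to `cache_hints` gives the same view that `wget -S` prints on the command line.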

      If wget is accessing a cache on the local linux machine, where would that cache be located ?

      I'm not familiar enough with wget to comment, but I think that's highly unlikely anyway given you've been trying with different user agents on different client machines. A quick google shows that wget has a --cache option which can be set to "off" to bypass upstream proxies.

        A quick google shows that wget has a --cache option which can be set to "off" to bypass upstream proxies.

        Yes, the "--no-cache" option *does* allow me to wget the current (correct) version of the file.
        Not only that, but it apparently also clears (or updates) the upstream cache, after which the "--no-cache" option is no longer needed.
        Furthermore, having run "wget --no-cache ..." on the Ubuntu machine, the other machines immediately started downloading the correct file.
        I consider that to be proof that the cache was definitely upstream of me.
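For anyone without wget to hand, roughly the same request can be made from Python. As I understand the wget manual, "--no-cache" sends the `Cache-Control: no-cache` and `Pragma: no-cache` directives; this sketch sets the same two headers. The helper names `no_cache_request` and `fetch_fresh` are hypothetical.

```python
import urllib.request


def no_cache_request(url):
    """Build a request asking intermediate caches for a fresh copy.

    Cache-Control is the HTTP/1.1 directive; Pragma is the older
    HTTP/1.0 equivalent, sent as well for the benefit of old proxies.
    """
    return urllib.request.Request(
        url,
        headers={"Cache-Control": "no-cache", "Pragma": "no-cache"},
    )


def fetch_fresh(url):
    """Download url with the no-cache headers and return the body bytes."""
    with urllib.request.urlopen(no_cache_request(url)) as response:
        return response.read()
```

A well-behaved cache should revalidate with the origin server on seeing these headers, which would also explain why the other machines started getting the correct file afterwards.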

        I still don't know precisely where that cache is, but I've sent off an email to my ISP where I accuse them of sending me outdated files that I don't want (and still charging me for the download).
        No response from the turds, yet.

        I thought I had sent a post about this last night ... but I don't see it now. Presumably I didn't get around to hitting "send".
        I apologise for that.

        Thanks to all who responded !!

        Cheers,
        Rob