http://qs1969.pair.com?node_id=283977

swkronenfeld has asked for the wisdom of the Perl Monks concerning the following question:

Monks,

Problem description: My company keeps some revision-controlled documents under a file-naming scheme with an increasing revision letter (beginning with a dash), e.g.

2222_-.doc 2222_A.doc 2222_B.doc 2222_C.doc etc.

These documents are linked from the web, but the company doesn't want to have to edit the HTML every time a document's revision changes. So I wrote a CGI script that takes the path to the base file (e.g. 2222_.doc) as input, determines the correct revision, and redirects the user. I originally wrote this using Net::FTP, and it was working, but then the web server crashed. The sysadmin had some problems with the Apache .netrc file (which stored the login/password for the documents server), and he doesn't have the time to fix this.
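To make the naming scheme concrete, here is a minimal sketch of how the candidate file names can be generated from the base path. The sub name candidate_paths and the example path are made up for illustration; the split on the last slash and the first dot mirrors what my script does.

```perl
use strict;
use warnings;

# Split a base path such as ".../2222_.doc" into directory, stem and
# extension, then build one candidate per revision: "-", then "A" to "Z".
# Returns an empty list if the path does not look like dir/stem.ext.
sub candidate_paths {
    my ($base) = @_;
    my ($dir, $stem, $ext) = $base =~ m{^(.*)/([^/.]+)\.([^/]+)$}
        or return;
    return map { "$dir/$stem$_.$ext" } '-', 'A' .. 'Z';
}

# Hypothetical base path, only to show the shape of the result.
my @candidates = candidate_paths('/QUALITY/DocConSys/2222_.doc');
print "$_\n" for @candidates[0 .. 2];
```

Each candidate is then tried in order until the documents server accepts one.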

So I have to rewrite it without FTP access. I went to CPAN, looked up HTTP::Request, and went from there. My code is working at the moment, but the document has to be downloaded twice before the user can view it: once by the server to make sure the link is valid, and again by the user. Some of these files are rather large, so this seems like a waste of time. Even sending the user the direct output from the $ua->request() call wouldn't save much time.

End result: I'm looking for a way to see if a link is valid without downloading the content at that link.

I found this script on CPAN which will print just the returned headers, but it still downloads the whole page before printing them. This leads me to believe that what I want may not be possible. I guess this is more of an HTTP question than a strictly Perl one, but since there are so many modules out there, I thought someone could shove me in the right direction if I'm missing something.
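One avenue I have not verified against our documents server: HTTP defines a HEAD method that asks for the headers only, and LWP::UserAgent provides a head() method for it. A minimal sketch, assuming the server answers HEAD requests the same way it answers GET (the sub name first_existing_url and the example URL are hypothetical):

```perl
use strict;
use warnings;

# Return the first URL in @candidates whose HEAD request succeeds,
# or nothing if none does. $ua is expected to be an LWP::UserAgent;
# any object with a compatible head() method works as well.
sub first_existing_url {
    my ($ua, @candidates) = @_;
    for my $url (@candidates) {
        my $response = $ua->head($url);    # asks for headers only
        return $url if $response->is_success;
    }
    return;
}

# Usage (needs LWP and network access, so left commented out):
# require LWP::UserAgent;
# my $ua  = LWP::UserAgent->new(timeout => 10);
# my $url = first_existing_url($ua,
#     map { "http://docs.example.com/QUALITY/DocConSys/2222_$_.doc" } '-', 'A' .. 'Z');
```

Whether this actually avoids the transfer depends on the server honouring HEAD, which is the assumption I cannot confirm yet.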

Here is my code:

#!/sw/local/bin/perl -Tw
use strict;
use HTTP::Request;
use LWP::UserAgent;

print "Content-type: text/html\n\n";
print "<html>\n";

my $path = $ENV{'QUERY_STRING'};
if (!$path) { dienice("Must pass in at least 1 argument.") }

# Split the last path component (the base file name) off the path.
my $file;
if ($path =~ s:/([^/]+)$::) { $file = $1 }
else { dienice("Incorrectly formatted path : $path") }

my $ext;    # file extension
if ($file =~ s/\.(.+)$//) { $ext = $1 }
else { dienice("Incorrectly formatted filename: $file") }

my $ua = LWP::UserAgent->new;
my $tmpfile;

# Try each revision in turn and stop at the first one the server accepts.
foreach my $rev ("-", ('A'..'Z')) {
    $tmpfile = "$file$rev.$ext";
    my $request  = HTTP::Request->new(GET => "$path/$tmpfile");
    my $response = $ua->request($request);
    # Use the public accessors rather than reaching into $response->{_msg}.
    print $response->message, "<br>";
    last if $response->is_success;
}

print "<head><meta http-equiv=Refresh content=\"10; URL=$path/$tmpfile\"></head><body>";
print "<a href=\"$path/$tmpfile\">Please click here if you are not automatically redirected</a><br>\n";
print "<p>Due to security measures, you will only be able to access files in /QUALITY/DocConSys in an account that has access to this folder.<br>";
print "</body></html>";
exit;

sub dienice {
    print "<body><h2>Error:</h2>$_[0]</body></html>";
    exit;
}

btw, any other comments on my code or method are welcome. Thanks for your help.