capa has asked for the wisdom of the Perl Monks concerning the following question:

I need the modification time of a web page, so I tried the head() function from the LWP::Simple interface. I wrote the following test script:
#!/usr/bin/perl
use LWP::Simple;

($content_type, $document_length, $modified_time, $expires, $server) = head $ARGV[0];
printf "$content_type <> $document_length <> $modified_time <> $expires <> $server\n";
and then I tried it:
bash$ ./test_head.pl http://localhost/tmp
text/html <> <> Apache/1.3.9 (Unix) Debian/GNU PHP/3.0.14 <> <>
bash$ ./test_head.pl http://localhost/
text/html <> 274 <> 953031425 <> Apache/1.3.9 (Unix) Debian/GNU PHP/3.0.14 <>
bash$ ./test_head.pl http://localhost/index.html
text/html <> 274 <> 953031425 <> Apache/1.3.9 (Unix) Debian/GNU PHP/3.0.14 <>
As you can see, the server is not the last item of the list returned by head(); in fact, it has no fixed position in that list.
What's wrong?
Thanks.

Replies are listed 'Best First'.
Re: LWP::Simple
by chromatic (Archbishop) on Mar 15, 2000 at 02:44 UTC
    Odd. Doing this:
    #!/usr/bin/perl -w
    use strict;
    use LWP::Simple;

    my @list;
    foreach (head $ARGV[0]) {
        push @list, $_;
        print "Undefined\n" unless defined $_;
    }

    print scalar @list, "\n";
    my ($content_type, $document_length, $modified_time, $expires, $server) = @list;
    print ">>$content_type<< ";
    print ">>$document_length<< ";
    print ">>$modified_time<< ";
    print ">>$expires<< ";
    print ">>$server<<\n";
    doesn't give me any undefineds, and it shows the length of @list as 4 items. There's no expiration set, at least for the web pages I tried. I would call this a bug in the documentation or the behavior of the module (as I tried it for IIS and Apache).
      well... look at this:
      bash:~$ ./1.pl http://localhost/tmp/
      Undefined
      3
      Use of uninitialized value at ./1.pl line 17.
      Use of uninitialized value at ./1.pl line 19.
      Use of uninitialized value at ./1.pl line 20.
      >>text/html<< >><< >>Apache/1.3.9 (Unix) Debian/GNU PHP/3.0.14<< >><< >><<
      
      Note that this URL produces an index of the tmp/ directory.

      And after running your script (modified):
      #!/usr/bin/perl -w
      use strict;
      use LWP::Simple;
      
      my @list;
      my $i = 0;
      foreach (head $ARGV[0]) {
          $i++;
          push @list, $_;
          print "$i: Undefined\n" unless defined $_;
      }
      
      print scalar @list, "\n";
      my ($content_type, $document_length, $modified_time, $expires, $server) = @list;
      print ">>$content_type<< ";
      print ">>$document_length<< ";
      print ">>$modified_time<< ";
      print ">>$expires<< ";
      print ">>$server<<\n";
      


      I get this output:
      bash:~$ ./1.pl http://localhost/tmp/
      2: Undefined
      3
      Use of uninitialized value at ./1.pl line 19.
      Use of uninitialized value at ./1.pl line 21.
      Use of uninitialized value at ./1.pl line 22.
      >>text/html<< >><< >>Apache/1.3.9 (Unix) Debian/GNU PHP/3.0.14<< >><< >><<
      
      So... is it a bug?
      Thanks.
Re: LWP::Simple
by Anonymous Monk on Mar 14, 2000 at 23:49 UTC
    The results were similar in my tests, so I did a little digging and found that the problem seemed to occur when there was no expires time. The expires time gets returned in the form
    HTTP::Date::str2time($response->header('Expires'))
    Taking a quick look in Date.pm, near the top of the str2time function is the line
    return unless defined $str;
    Changing this line to instead read
    return undef unless defined $str;
    seems to produce the desired result, but I don't know enough offhand about this module to know whether that would affect any other desired behavior.
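The difference between the two spellings can be seen without LWP at all. In list context a bare `return` yields an empty list, while `return undef` yields a one-element list holding undef. A minimal sketch, where `bare_return` and `undef_return` are hypothetical stand-ins that only copy the guard line from str2time:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-ins: same guard as str2time, spelled two ways.
sub bare_return  { my ($str) = @_; return       unless defined $str; return $str }
sub undef_return { my ($str) = @_; return undef unless defined $str; return $str }

# Both lists are built in list context, just like the list head() returns.
my @bare  = ('text/html', bare_return(undef),  'Apache');
my @undef = ('text/html', undef_return(undef), 'Apache');

print scalar(@bare),  "\n";   # 2: the missing value vanished, 'Apache' moved up
print scalar(@undef), "\n";   # 3: the slot is kept and holds undef
```

This matches the symptom in the original post: every missing value removes one slot, so the server string slides toward the front of the list.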
      I seem to keep losing my cookie somehow. Anyways, I posted the above, so direct any responses to me.
        It is not just the expires field; the result is also "corrupted" when content_length and modified_time are missing.
        Please take a more careful look at the run output that I posted.
        The idea is to obtain something like:
        
        content_type <> content_length <> last_modified <> expires <> server
        
        and I never obtained server as the last field (after the last <> separator).
        The position of server was 3 and 4 respectively, never 5 :( (indexing
        from 1).
        
        None of the three requested URLs has an expires time, and the first
        was a directory listing (meaning no content_length and no modified_time).
        

        Thanks.
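One way to sidestep the shifting positions is to stop relying on list position: ask for each header by name and force scalar context, so a missing header stays undef in its own slot. A minimal sketch, assuming HTTP::Response and HTTP::Date are installed (they normally come along with LWP). `head_fields` is a hypothetical helper, not part of LWP::Simple, and the response is built by hand rather than fetched so the example runs offline:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTTP::Response;
use HTTP::Date qw(str2time);

my @names = qw(Content-Type Content-Length Last-Modified Expires Server);

# Hypothetical helper: returns a hash ref with one key per header name.
sub head_fields {
    my ($res) = @_;
    my %h;
    for my $name (@names) {
        # scalar() keeps a missing header as undef instead of an empty list
        $h{$name} = scalar $res->header($name);
    }
    for my $name (qw(Last-Modified Expires)) {
        # convert date strings to epoch seconds only when they are present
        $h{$name} = str2time($h{$name}) if defined $h{$name};
    }
    return \%h;
}

# Stand-in for the directory listing case: no length, no dates.
my $res = HTTP::Response->new(200, 'OK', [
    'Content-Type' => 'text/html',
    'Server'       => 'Apache/1.3.9 (Unix) Debian/GNU PHP/3.0.14',
]);

my $f = head_fields($res);
print join(' <> ', map { defined $f->{$_} ? $f->{$_} : '(none)' } @names), "\n";
```

With a live server, the same helper could be fed the response from `LWP::UserAgent->new->head($url)`; missing fields print as "(none)" and the server string stays in the fifth position.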