Hi,

I have an interesting issue. The code that follows has been in testing for a while and originally used read() to read data from a socket connection to a web server. The URL noted below exposed a problem: the read hangs at 97% of the download. Changing to sysread() solved the problem. This is the first URL to display this behaviour.

Questions?

  1. Why is the read broken? It fails on both Win32 and Linux for this particular download (the only known test case so far, though no doubt there are many others). It appears likely that the missing data is simply being buffered somewhere.
  2. The server on the other end of the socket claims to be IIS/5.0.
  3. In the docs there is a note that sysread() bypasses stdio, so mixing this with other kinds of reads, print, write, seek, tell, or eof can cause confusion because stdio usually buffers data.
  4. Does this mean I should change print $fh to syswrite()? Internally LWP::Simple happily intermixes print and sysread....
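
To make the buffering difference in point 3 concrete, here is a minimal sketch that uses a local pipe instead of the real socket. The pipe, the helper name short_read_demo and the 5-byte payload are assumptions for illustration only. sysread() returns as soon as some data is available, whereas read() goes through the buffered layer and keeps waiting until it has the full requested length or sees EOF.

```perl
use strict;
use warnings;

# Hypothetical helper: write $payload to a pipe, keep the write end open
# (much as a keep-alive server keeps its socket open), then do one sysread.
sub short_read_demo {
    my ($payload) = @_;
    pipe( my $reader, my $writer ) or die "pipe failed: $!";
    binmode $_ for $reader, $writer;

    # syswrite bypasses buffering, so the bytes are in the pipe immediately.
    my $wrote = syswrite( $writer, $payload );
    die "syswrite failed: $!" unless defined $wrote;

    # Asks for up to 8192 bytes but returns as soon as some data is available.
    my $got = sysread( $reader, my $buffer, 8192 );
    die "sysread failed: $!" unless defined $got;

    # A read( $reader, $buffer, 8192 ) at this point would block instead,
    # because the writer is still open and 8192 bytes will never arrive.
    close $writer;
    close $reader;
    return ( $got, $buffer );
}

my ( $got, $buffer ) = short_read_demo("hello");
print "sysread returned $got bytes: $buffer\n";
```

If the same thing happens on the socket, the final partial chunk of the file would never satisfy an 8192-byte read() while the server holds the connection open.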

Why might this be important?

Both CGI and CGI::Simple use read() to get POST data. There is a transient issue, difficult to prove with a reliable test case, with both modules and some browsers in certain circumstances: with large POSTs (not multipart forms), the read call sometimes fails to return all the expected data. read() is blocking and should get all the data you asked for (if it was sent). The threshold for large appears to be ~20K ? 16384. You can kill read with signals due to the non-reentrant behaviour of the old C libs, but getting it to return a short read any other way in a test case has proved problematic. Wisdom appreciated.
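
For the POST case, one defensive pattern is to loop until the expected number of bytes has arrived rather than trusting a single call. This is only a sketch, not what CGI or CGI::Simple actually do: read_exactly is a hypothetical helper, and the in-memory filehandle in the usage lines stands in for STDIN.

```perl
use strict;
use warnings;

# Hypothetical helper: keep calling read() until $want bytes have arrived,
# EOF is reached, or an error occurs. Returns whatever was collected.
sub read_exactly {
    my ( $fh, $want ) = @_;
    my $data = '';
    while ( length($data) < $want ) {
        # The 4th argument is an offset, so each call appends to $data.
        my $got = read( $fh, $data, $want - length($data), length($data) );
        die "read failed: $!" unless defined $got;
        last if $got == 0;    # EOF: the sender stopped before sending it all
    }
    return $data;
}

# Usage: an in-memory filehandle stands in for STDIN here; in a CGI script
# the wanted length would come from $ENV{CONTENT_LENGTH}.
my $body = "a=1&b=2";
open my $in, '<', \$body or die "open failed: $!";
my $post = read_exactly( $in, length $body );
print "Got ", length($post), " of ", length($body), " bytes\n";
```

Comparing the returned length with the expected length afterwards would at least let a script report a short POST instead of silently processing truncated data.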

#!/usr/bin/perl -w
use strict;
use IO::Socket::INET;
$|++;

my $url      = "http://ftp.blizzard.com/pub/war3/maps/(4)iceforge.zip";
my $filename = 'c:/tmp.zip';    # declared so the script compiles under strict
my $DEBUG    = 1;
my $CRLF     = "\015\012\015\012";    # blank line that ends the HTTP header

my ( $code, $type, $length, $sock, $data_buffer, $location ) = init_download( $url );
open my $fh, '>', $filename or die $!;
binmode $fh;
# init_download may already have read part of the body along with the header
print $fh $data_buffer;
download( $fh, $sock, $filename, length($data_buffer), $length );

sub download {
    my ( $fh, $sock, $filename, $got_so_far, $length ) = @_;
    my $buffer;
    print "Got: $got_so_far\n" if $DEBUG;
    #
    # This will hang on a read() works with sysread()
    #
    while ( ($got_so_far < $length) and sysread( $sock, $buffer, 8192 ) ) {
        print $fh $buffer;
        $got_so_far += length $buffer;
        print "Got: $got_so_far\n" if $DEBUG;
        #write_lockfile( $filename, $got_so_far );
    }
    close $fh;
    $sock->close;
    print "Wanted: $length\nGot $got_so_far\n";
    unless ( $length == $got_so_far ) {
        die "Expected $length bytes but only got $got_so_far";
    }
}

sub init_download {
    my ( $url ) = @_;
    ui_network_error( "Invalid URL $url\n" )
        unless $url =~ m!^http://([^/:\@]+)(?::(\d+))?(/\S*)?$!;
    my $host = $1;
    my $port = $2 || 80;
    my $path = $3;
    $path = "/" unless defined $path;
    my $sock = IO::Socket::INET->new( PeerAddr => $host, Proto => 'tcp', PeerPort => $port )
        or ui_network_error( 'Could not connect socket', $url );
    $sock->autoflush;
    print $sock "GET $url HTTP/1.0
Host: localhost
Accept: */*
Connection: Keep-Alive
User-Agent: Mozilla/4.0 (compatible; MSIE 4.5; Windows 98; )
$CRLF";
    my ( $header, $content, $buffer );
    # read until the blank line that separates header from body
    while ( sysread( $sock, $buffer, 8192 ) ) {
        $content .= $buffer;
        if ( ( my $index = ( index $content, $CRLF ) ) > 0 ) {
            $header  = substr $content, 0, $index;
            $content = substr $content, $index + 4;
            last;
        }
    }
    $header =~ s/\015\012/\n/g;
    # unfold the header
    $header =~ s/\n\s+/ /g;
    my ($length) = $header =~ m/^Content-Length:\s*(\d+)/im;
    my ($type)   = $header =~ m/^Content-Type:\s*([^\r\n]+)/im;
    my ($loc)    = $header =~ m/^Location:\s*([^\r\n]+)/im;
    my ($code)   = $header =~ m!^HTTP/\d\.\d[^\d]+(\d+)!i;
    print "$header\n----\nWant: $length\n";
    return ( $code, $type, $length, $sock, $content, $loc );
}

sub ui_network_error { die shift }

cheers

tachyon

s&&rsenoyhcatreve&&&s&n.+t&"$'$`$\"$\&"&ee&&y&srve&&d&&print


In reply to read aka fread(3) broken, sysread aka read(2) works IIS socket by tachyon
