iic has asked for the wisdom of the Perl Monks concerning the following question:

Hi,
here is my problem. I wrote a server that listens on a port on a
Red Hat 6.1 machine, receives a 400k file, and saves it to local
disk.
Every time, once it has read 13824 bytes from the socket, the job
stops with no error report. It looks like it is suspended forever;
nothing more can be read from the receive buffer.
I checked several SOL_SOCKET level option values:
SO_RCVBUF 65535
SO_SNDBUF 65535
SO_RCVLOWAT 1
SO_SNDLOWAT 1
SO_RCVTIMEO 0
SO_SNDTIMEO 0
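
For reference, values like these can be read from Perl roughly as follows. This is a minimal sketch, assuming a Linux host; the loopback listener on an ephemeral port is only there to have a TCP socket to inspect, and it is not part of server.pl. The timeout options are left out because they come back as a packed struct timeval rather than a plain integer.

```perl
use strict;
use warnings;
use IO::Socket::INET;
use Socket qw(SO_RCVBUF SO_SNDBUF SO_RCVLOWAT);

# Assumption: a throwaway listener on 127.0.0.1, port 0 (the OS picks a free port).
my $sock = IO::Socket::INET->new(
    LocalAddr => '127.0.0.1',
    LocalPort => 0,
    Proto     => 'tcp',
    Listen    => 1,
) or die "cannot create socket: $!";

# sockopt() with a single argument reads an integer-valued SOL_SOCKET option.
my %const = (
    SO_RCVBUF   => SO_RCVBUF,
    SO_SNDBUF   => SO_SNDBUF,
    SO_RCVLOWAT => SO_RCVLOWAT,
);
my %opt;
for my $name (sort keys %const) {
    $opt{$name} = $sock->sockopt($const{$name});
    printf "%-12s %s\n", $name,
        defined $opt{$name} ? $opt{$name} : "unavailable ($!)";
}
```

The numbers printed depend on the kernel and its configuration, so they will not necessarily match the ones listed above.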

My code is below:


########################server.pl
#!/usr/bin/perl

$args=@ARGV;
if($args!=2)
{
    print "usage:server.pl [localport] [localfile]\n";
    exit;
}
$port=$ARGV[0];
$file=$ARGV[1];

use IO::Socket;
$skt=IO::Socket::INET->new(LocalAddr=>'0.0.0.0',LocalPort=>$port,Proto=>'tcp',Listen=>5)||do{
    print $!."\n";
    exit;
};
$skt->autoflush(1);
print "Ready on port $port ...\n";
$newskt=$skt->accept;
$newskt->sockopt(SO_SNDLOWAT,2048);
$newskt->autoflush(1);
print "client is coming ...\n";
open (FILE,">$file")||die $!;
while($n=sysread($newskt,$buf,2))
{
    $total+=$n;
    print FILE $buf;
    print "received:".$total."\r";
}
close FILE;
$newskt->close();
$skt->close();
print "\nDone.($file)\n";

########################client.pl
#!/usr/bin/perl

$args=@ARGV;
if($args!=3)
{
    print "usage:client.pl [peerhost] [peerport] [localfile]\n";
    exit;
}

$host=$ARGV[0];
$port=$ARGV[1];
$file=$ARGV[2];

use IO::Socket;
$skt=IO::Socket::INET->new(PeerAddr=>$host,PeerPort=>$port,Proto=>'tcp')||do{
    print $!."\n";
    exit;
};
print "Begin ...\n";
open (FILE,"$file")||die $!;
binmode FILE;

while($n=read(FILE,$buf,512))
{
    $total+=$n;
    print $skt $buf;
    print "sent:".$total."\r";
}
close FILE;
$skt->close();
print "\nDone.(local $file to remote $host)\n";

server.pl's output:
Ready on port 8000 ...
client is coming ...
received:13710
(OK, it waits forever here!)

tcpdump results:
<line 1:> a.1077 > b.8000: S 1072025221:1072025221(0) win 16384 <mss 1460,nop,nop,sackOK> (DF)
<line 2:> b.8000 > a.1077: S 2930713495:2930713495(0) ack 1072025222 win 32120 <mss 1460,nop,nop,sackOK> (DF)
<line 3:> a.1077 > b.8000: . 1:1(0) ack 1 win 17520 (DF)
<line 4:> a.1077 > b.8000: P 1:513(512) ack 1 win 17520 (DF)
<line 5:> b.8000 > a.1077: . 1:1(0) ack 513 win 32120 (DF)
<line 6:> a.1077 > b.8000: P 513:1973(1460) ack 1 win 17520 (DF)
<line 7:> a.1077 > b.8000: P 1973:2049(76) ack 1 win 17520 (DF)
<line 8:> b.8000 > a.1077: . 1:1(0) ack 2049 win 32120 (DF)
<line 9:> a.1077 > b.8000: P 2049:3509(1460) ack 1 win 17520 (DF)
<line 10:> a.1077 > b.8000: P 3509:4969(1460) ack 1 win 17520 (DF)
<line 11:> b.8000 > a.1077: . 1:1(0) ack 4969 win 30660 (DF)
<line 12:> a.1077 > b.8000: P 4969:6429(1460) ack 1 win 17520 (DF)
<line 13:> a.1077 > b.8000: P 6429:7889(1460) ack 1 win 17520 (DF)
<line 14:> b.8000 > a.1077: . 1:1(0) ack 7889 win 30660 (DF)
<line 15:> a.1077 > b.8000: P 7889:9349(1460) ack 1 win 17520 (DF)
<line 16:> a.1077 > b.8000: P 9349:10753(1404) ack 1 win 17520 (DF)
<line 17:> b.8000 > a.1077: . 1:1(0) ack 10753 win 29200 (DF)
<line 18:> a.1077 > b.8000: P 10753:12213(1460) ack 1 win 17520 (DF)
<line 19:> a.1077 > b.8000: P 12213:13313(1100) ack 1 win 17520 (DF)
<line 20:> b.8000 > a.1077: . 1:1(0) ack 13313 win 27740 (DF)
<line 21:> a.1077 > b.8000: P 14773:14849(76) ack 1 win 17520 (DF)
<line 22:> b.8000 > a.1077: . 1:1(0) ack 13313 win 27740 <nop,nop, sack 1 {14773:14849} > (DF)
<line 23:> a.1077 > b.8000: P 14849:16309(1460) ack 1 win 17520 (DF)
<line 24:> b.8000 > a.1077: . 1:1(0) ack 13313 win 27740 <nop,nop, sack 1 {14773:16309} > (DF)
<line 25:> a.1077 > b.8000: P 16309:16385(76) ack 1 win 17520 (DF)
<line 26:> b.8000 > a.1077: . 1:1(0) ack 13313 win 27740 <nop,nop, sack 1 {14773:16385} > (DF)
<line 27:> a.1077 > b.8000: P 16385:17845(1460) ack 1 win 17520 (DF)
<line 28:> b.8000 > a.1077: . 1:1(0) ack 13313 win 27740 <nop,nop, sack 1 {14773:17845} > (DF)
<line 29:> a.1077 > b.8000: P 17845:17921(76) ack 1 win 17520 (DF)
<line 30:> b.8000 > a.1077: . 1:1(0) ack 13313 win 27740 <nop,nop, sack 1 {14773:17921} > (DF)
<line 31:> a.1077 > b.8000: P 19381:19457(76) ack 1 win 17520 (DF)
<line 32:> b.8000 > a.1077: . 1:1(0) ack 13313 win 27740 <nop,nop, sack 2 {19381:19457}{14773:17921} > (DF)
<line 33:> a.1077 > b.8000: P 19457:20917(1460) ack 1 win 17520 (DF)
<line 34:> b.8000 > a.1077: . 1:1(0) ack 13313 win 27740 <nop,nop, sack 2 {19381:20917}{14773:17921} > (DF)
<line 35:> a.1077 > b.8000: P 20917:20993(76) ack 1 win 17520 (DF)
<line 36:> b.8000 > a.1077: . 1:1(0) ack 13313 win 27740 <nop,nop, sack 2 {19381:20993}{14773:17921} > (DF)
<line 37:> a.1077 > b.8000: P 20993:22017(1024) ack 1 win 17520 (DF)
<line 38:> b.8000 > a.1077: . 1:1(0) ack 13313 win 27740 <nop,nop, sack 2 {19381:22017}{14773:17921} > (DF)
<line 39:> a.1077 > b.8000: P 13313:13825(512) ack 1 win 17520 (DF)
<line 40:> b.8000 > a.1077: . 1:1(0) ack 13825 win 27228 <nop,nop, sack 2 {19381:22017}{14773:17921} > (DF)

Please look at line 21: it should have started at 13313,
but it starts at 14773. Is this host a's mistake? So I tried using host c
to send the file to b, but got the same results as from host a to b!

What is the possible reason for that? A kernel restriction?
I know this is not a pure Perl question, but I hope ...
It has puzzled me for a long time.
Thanks in advance for any help or hints!

(tye)Re: network debug
by tye (Sage) on Jun 14, 2001 at 19:46 UTC

    Sorry, I don't see what the solution is. The TCP dump stops just when it starts to get interesting.

    Line 21 that you mention makes it look like several packets were lost and the sender kept sending "the next" packet hoping that the lost packets were just delayed and would eventually show up. After a while, the sender decides that it has been too long (more precisely, that the receiver's "window" for buffering out-of-order packets can't hold the next packet) and starts sending the lost packets. Right after the sender sends the first lost packet and the receiver acknowledges it, the trace stops. But this is probably because you don't get any more packets for a while since this also looks like the place where the script hangs.
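
    The holes can be read straight off the trace. As a rough sketch (the sub name `find_holes` is made up for illustration, and the byte ranges are copied by hand from lines 4 to 37 of the first trace, before the retransmission on line 39), this lists the stretches of sequence space that never show up in the capture:

```perl
use strict;
use warnings;

# [start, end) ranges of the data segments sent by host a,
# in capture order, taken from lines 4-37 of the first trace.
my @segments = (
    [1, 513],       [513, 1973],    [1973, 2049],   [2049, 3509],
    [3509, 4969],   [4969, 6429],   [6429, 7889],   [7889, 9349],
    [9349, 10753],  [10753, 12213], [12213, 13313], [14773, 14849],
    [14849, 16309], [16309, 16385], [16385, 17845], [17845, 17921],
    [19381, 19457], [19457, 20917], [20917, 20993], [20993, 22017],
);

# Return the gaps between consecutive ranges as [start, end) pairs.
sub find_holes {
    return unless @_;
    my @sorted = sort { $a->[0] <=> $b->[0] } @_;
    my $next   = $sorted[0][0];    # next byte we expect to see
    my @holes;
    for my $seg (@sorted) {
        my ($start, $end) = @$seg;
        push @holes, [$next, $start] if $start > $next;
        $next = $end if $end > $next;
    }
    return @holes;
}

for my $hole (find_holes(@segments)) {
    printf "missing %d:%d (%d bytes)\n", @$hole, $hole->[1] - $hole->[0];
}
# missing 13313:14773 (1460 bytes)
# missing 17921:19381 (1460 bytes)
```

    This only says what is absent from this one capture. Whether those segments were never sent or were dropped along the way cannot be decided from a single vantage point.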

    Other than the trace stopping, this is all stuff that I expect from TCP. I'm curious about the nature of the network connection between the two machines. If it is always stopping in the same place, then something is consistently preventing those "lost" packets from getting through.

    You may want to use TCP dump on both ends of the connection and see how they differ to narrow down where the packets are being lost. You'll probably find some packets that are not displayed in the trace above (and vice versa).
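
    Comparing the two captures can be scripted. The following is only a sketch: the sub names are hypothetical, the regex assumes the tcpdump text format shown in this thread, and the two sample captures are invented for illustration (they are not real traces from either host).

```perl
use strict;
use warnings;

# Collect "start:end" for every data-carrying segment sent by $src.
sub data_segments {
    my ($src, @lines) = @_;
    my %seen;
    for my $line (@lines) {
        next unless $line =~ /\Q$src\E \s+ > \s+ \S+: \s+ \S+ \s+ (\d+):(\d+)\((\d+)\)/x;
        $seen{"$1:$2"} = 1 if $3 > 0;    # skip pure ACKs (length 0)
    }
    return \%seen;
}

# Segments present in the first capture but absent from the second.
sub only_in_first {
    my ($first, $second) = @_;
    return sort { (split /:/, $a)[0] <=> (split /:/, $b)[0] }
           grep { !$second->{$_} } keys %$first;
}

# Hypothetical sample data, NOT taken from the real hosts.
my @at_sender = (
    'a.1077 > b.8000: P 12213:13313(1100) ack 1 win 17520 (DF)',
    'a.1077 > b.8000: P 13313:14773(1460) ack 1 win 17520 (DF)',
    'a.1077 > b.8000: P 14773:14849(76) ack 1 win 17520 (DF)',
);
my @at_receiver = (
    'a.1077 > b.8000: P 12213:13313(1100) ack 1 win 17520 (DF)',
    'b.8000 > a.1077: . 1:1(0) ack 13313 win 27740 (DF)',
    'a.1077 > b.8000: P 14773:14849(76) ack 1 win 17520 (DF)',
);

my @lost = only_in_first(
    data_segments('a.1077', @at_sender),
    data_segments('a.1077', @at_receiver),
);
print "sent but never seen at the receiver: @lost\n";
```

    A segment that appears at the sender but not at the receiver was dropped somewhere in between; one that appears in neither capture was never put on the wire.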

    Also realize that TCP will usually wait 2 minutes for a lost packet so it can take 4 to 8 minutes before everything times out. You obviously haven't waited "forever" ;), but have you waited at least 8 minutes?
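
    While waiting, it helps if the receiver reports a stall instead of sitting silently. Here is a minimal sketch of a receive loop that does so, assuming IO::Select is acceptable; the sub name `receive_all`, the 4096-byte read size, and the timeout value are my own choices, not something from server.pl. It also separates a read error (sysread returns undef) from a normal end of stream (sysread returns 0), which a plain `while($n=sysread(...))` loop cannot tell apart.

```perl
use strict;
use warnings;
use IO::Select;

# Copy everything from $sock to $out.
# Returns (status, bytes_copied); status is 'eof', 'timeout' or 'error: ...'.
sub receive_all {
    my ($sock, $out, $timeout) = @_;
    my $select = IO::Select->new($sock);
    my $total  = 0;
    while (1) {
        # can_read() returns an empty list if nothing arrives in $timeout seconds.
        my @ready = $select->can_read($timeout);
        return ('timeout', $total) unless @ready;

        my $n = sysread($sock, my $buf, 4096);
        return ("error: $!", $total) unless defined $n;    # read failed
        return ('eof', $total) if $n == 0;                 # peer closed

        print {$out} $buf or return ("error: $!", $total);
        $total += $n;
    }
}

# Possible use in the server, after accept() and open():
#   binmode FILE;
#   my ($status, $bytes) = receive_all($newskt, \*FILE, 60);
#   print "\nstopped after $bytes bytes: $status\n";
```

    A 'timeout' result with a byte count well short of the file size would point at the network path rather than at the script.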

            - tye (but my friends call me "Tye")
      First of all, thanks for your reply!
      This time I waited for a long time, and something changed.
      server.pl was run on Linux; client.pl was run on Win2k.
      Both hung quickly.

      status now:
      E:\>perl client.pl 202.39.48.62 8000 socks5-v1.0r11.tar.gz
      Begin ...
      sent:21504

      #perl server.pl 8000 tmpfile
      Ready on port 8000 ...
      client is coming ...
      received:13300

      About 10 minutes later, the sender suddenly lost patience
      and ran straight through to "Done" (the file's size is 404020).

      status now:
      E:\>perl client.pl 202.39.48.62 8000 socks5-v1.0r11.tar.gz
      Begin ...
      sent:404020
      Done.(local socks5-v1.0r11.tar.gz to remote 202.39.48.62)

      #perl server.pl 8000 tmpfile
      Ready on port 8000 ...
      client is coming ...
      received:13300

      The receiver stayed in this state for at least 30 minutes,
      so I had to press Ctrl+C.

      tcpdump result:
      <line 1:> a.2726 > b.8000: S 3628356093:3628356093(0) win 16384 <mss 1460,nop,nop,sackOK> (DF)
      <line 2:> b.8000 > a.2726: S 2575239810:2575239810(0) ack 3628356094 win 32120 <mss 1460,nop,nop,sackOK> (DF)
      <line 3:> a.2726 > b.8000: . 1:1(0) ack 1 win 17520 (DF)
      <line 4:> a.2726 > b.8000: P 1:513(512) ack 1 win 17520 (DF)
      <line 5:> b.8000 > a.2726: . 1:1(0) ack 513 win 32120 (DF)
      <line 6:> a.2726 > b.8000: P 513:1973(1460) ack 1 win 17520 (DF)
      <line 7:> a.2726 > b.8000: P 1973:2049(76) ack 1 win 17520 (DF)
      <line 8:> b.8000 > a.2726: . 1:1(0) ack 2049 win 32120 (DF)
      <line 9:> a.2726 > b.8000: P 2049:3509(1460) ack 1 win 17520 (DF)
      <line 10:> b.8000 > a.2726: . 1:1(0) ack 3509 win 32120 (DF)
      <line 11:> a.2726 > b.8000: P 3509:4969(1460) ack 1 win 17520 (DF)
      <line 12:> b.8000 > a.2726: . 1:1(0) ack 4969 win 32120 (DF)
      <line 13:> a.2726 > b.8000: P 4969:6429(1460) ack 1 win 17520 (DF)
      <line 14:> a.2726 > b.8000: P 6429:7889(1460) ack 1 win 17520 (DF)
      <line 15:> b.8000 > a.2726: . 1:1(0) ack 7889 win 30660 (DF)
      <line 16:> a.2726 > b.8000: P 7889:9349(1460) ack 1 win 17520 (DF)
      <line 17:> b.8000 > a.2726: . 1:1(0) ack 9349 win 30660 (DF)
      <line 18:> a.2726 > b.8000: P 10753:11777(1024) ack 1 win 17520 (DF)
      <line 19:> b.8000 > a.2726: . 1:1(0) ack 9349 win 30660 <nop,nop, sack 1 {10753:11777} > (DF)
      <line 20:> a.2726 > b.8000: P 11777:13237(1460) ack 1 win 17520 (DF)
      <line 21:> b.8000 > a.2726: . 1:1(0) ack 9349 win 30660 <nop,nop, sack 1 {10753:13237} > (DF)
      <line 22:> a.2726 > b.8000: P 13237:13313(76) ack 1 win 17520 (DF)
      <line 23:> b.8000 > a.2726: . 1:1(0) ack 9349 win 30660 <nop,nop, sack 1 {10753:13313} > (DF)
      <line 24:> a.2726 > b.8000: P 14773:14849(76) ack 1 win 17520 (DF)
      <line 25:> b.8000 > a.2726: . 1:1(0) ack 9349 win 30660 <nop,nop, sack 2 {14773:14849}{10753:13313} > (DF)
      <line 26:> a.2726 > b.8000: P 14849:16309(1460) ack 1 win 17520 (DF)
      <line 27:> b.8000 > a.2726: . 1:1(0) ack 9349 win 30660 <nop,nop, sack 2 {14773:16309}{10753:13313} > (DF)
      <line 28:> a.2726 > b.8000: P 16309:16385(76) ack 1 win 17520 (DF)
      <line 29:> b.8000 > a.2726: . 1:1(0) ack 9349 win 30660 <nop,nop, sack 2 {14773:16385}{10753:13313} > (DF)
      <line 30:> a.2726 > b.8000: P 17845:17921(76) ack 1 win 17520 (DF)
      <line 31:> b.8000 > a.2726: . 1:1(0) ack 9349 win 30660 <nop,nop, sack 3 {17845:17921}{14773:16385}{10753:13313} > (DF)
      <line 32:> a.2726 > b.8000: P 9349:9729(380) ack 1 win 17520 (DF)
      <line 33:> b.8000 > a.2726: . 1:1(0) ack 9729 win 30280 <nop,nop, sack 3 {17845:17921}{14773:16385}{10753:13313} > (DF)
      <line 34:> a.2726 > b.8000: P 9729:11189(1460) ack 1 win 17520 (DF)
      <line 35:> b.8000 > a.2726: . 1:1(0) ack 13313 win 29200 <nop,nop, sack 2 {17845:17921}{14773:16385} > (DF)
      <line 36:> a.2726 > b.8000: P 14773:16233(1460) ack 1 win 17520 (DF)
      <line 37:> b.8000 > a.2726: . 1:1(0) ack 13313 win 30660 <nop,nop, sack 2 {17845:17921}{14773:16385} > (DF)
      <line 38:> b.8000 > a.2726: F 1:1(0) ack 13313 win 30660 <nop,nop, sack 2 {17845:17921}{14773:16385} > (DF)
      <line 39:> a.2726 > b.8000: R 3628369406:3628369406(0) win 0


      It is 2:20 am here now, so I cannot go through the results
      carefully. May I send you email? See you tomorrow :)