Guildencrantz has asked for the wisdom of the Perl Monks concerning the following question:

I have a script which has been working wonderfully for about a week now while I have been developing around it. Unfortunately it is now hanging. The script is as follows:

#!/usr/bin/perl -wT

use strict;
use IO::Socket;
use HTML::Entities;

require './libraryCommon.pl';

unless (@ARGV) {
    die "Exiting without parameter. (HINT: You need to pass the ISBN)\n";
}

unless (testISBN($ARGV[0])) {
    my $host="xml.amazon.com";
    my $port=80;
    my $buff="";
    my $line;
    my $title="";
    my $date="";
    my $manufacturer="";
    my $author="";
    my $count=0;
    my $isbn=$ARGV[0];
    my $getBook="GET http://$host/onca/xml2?t=webservices-20&dev-t=D3N1ICFCFE4DHV&AsinSearch=$isbn&type=lite&f=xml HTTP/1.0\n\n";
    my $sock=new IO::Socket::INET(PeerAddr => $host, PeerPort => $port, Proto => 'tcp')
        or die "Couldn't connect to $host";
    $sock->autoflush(1);
    print $sock $getBook;
    $count=0;
    print <$sock>;
    while (<$sock>) {
        print "looping. \n";
        decode_entities($_);
        print $_, "\n";
        if (m%<ProductName>(.*?)</ProductName>%) { $title = $1 }
        elsif (m%<Author>(.*?)</Author>%) { $author[$count]=$1; $count++; }
        elsif (m%<ReleaseDate>.*, (\d\d\d\d)</ReleaseDate>%) { $date=$1; }
        elsif (m%<Manufacturer>(.*?)<\/Manufacturer>%) { $manufacturer=$1; }
    }
    close($sock);
    print $title, "\n";
    foreach $author (@author) { print $author, "\n" };
    print $date, "\n";
    print $manufacturer, "\n";
} else {
    die "Exiting due to the fact that you have supplied an invalid ISBN.";
}

I have tested the code through and everything is fine except the "while(<$sock>){}" loop. If I remove this loop, the script simply runs to completion. With the loop in place, the script hangs, and any print statements that I put inside the loop are never displayed.

I have tried directly connecting to: here, which does seem to produce a lovely page of XML information about the book requested.

Why would my script hang while trying to read this page? Any suggestions?

~~Guildencrantz

Re: Reading an XML page
by BrowserUk (Patriarch) on May 28, 2003 at 04:33 UTC

    I haven't tried your script (so I'm guessing), but what happens if you comment out the print <$sock>; line just above the while loop?

    I suspect that the print is 'draining' the socket: in list context, <$sock> reads every remaining line. By the time you reach the while loop there is nothing left to read, so it hangs waiting for input that never arrives.
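
The draining effect described above can be reproduced without a network connection. The following is a minimal sketch, assuming an in-memory filehandle as a stand-in for the socket (the variable names are illustrative, not from the posted script). It shows only that a list-context read consumes everything before the loop starts. On a live socket, the list-context read would itself block until the peer closes the connection.

```perl
use strict;
use warnings;

# Stand-in for the socket: an in-memory filehandle holding three lines.
my $data = "line 1\nline 2\nline 3\n";
open my $fh, '<', \$data or die "Cannot open in-memory handle: $!";

# List context (as in `print <$sock>;`): readline returns every remaining line.
my @drained = <$fh>;

# Scalar context (as in `while (<$sock>)`): nothing is left to read.
my $looped = 0;
while (my $line = <$fh>) {
    $looped++;
}

printf "list context read %d lines; the loop then ran %d times\n",
    scalar(@drained), $looped;
```

With the in-memory handle, the loop body never runs because the handle is already at end-of-file when the loop begins.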


    Examine what is said, not who speaks.
    "Efficiency is intelligent laziness." -David Dunham
    "When I'm working on a problem, I never think about beauty. I think only how to solve the problem. But when I have finished, if the solution is not beautiful, I know it is wrong." -Richard Buckminster Fuller


      If only it were that simple. I actually posted an incorrect version of this script; my apologies. The script that I should have posted follows. (I had placed a number of print statements throughout the previous version while trying to find the problem.)

      #!/usr/bin/perl -wT

      use strict;
      use IO::Socket;
      use HTML::Entities;

      require './libraryCommon.pl';

      unless (@ARGV) {
          die "Exiting without parameter. (HINT: You need to pass the ISBN)\n";
      }

      unless (testISBN($ARGV[0])) {
          my $host="xml.amazon.com";
          my $port=80;
          my $buff="";
          my $line;
          my $title="";
          my $date="";
          my $manufacturer="";
          my $author="";
          my @author;
          my $count=0;
          my $isbn=$ARGV[0];
          my $getBook="GET http://$host/onca/xml2?t=webservices-20&dev-t=D3N1ICFCFE4DHV&AsinSearch=$isbn&type=lite&f=xml HTTP/1.0\n\n";
          my $sock=new IO::Socket::INET(PeerAddr => $host, PeerPort => $port, Proto => 'tcp')
              or die "Couldn't connect to $host";
          $sock->autoflush(1);
          print $sock $getBook;
          $count=0;
          while (<$sock>) {
              decode_entities($_);
              if (m%<ProductName>(.*?)</ProductName>%) { $title = $1 }
              elsif (m%<Author>(.*?)</Author>%) { $author[$count]=$1; $count++; }
              elsif (m%<ReleaseDate>.*, (\d\d\d\d)</ReleaseDate>%) { $date=$1; }
              elsif (m%<Manufacturer>(.*?)<\/Manufacturer>%) { $manufacturer=$1; }
          }
          close($sock);
          print $title, "\n";
          foreach $author (@author) { print $author, "\n" };
          print $date, "\n";
          print $manufacturer, "\n";
      } else {
          die "Exiting due to the fact that you have supplied an invalid ISBN.";
      }

        Not sure if it will help you, but LWP::Simple seems to work ok.

        D:\Perl\test>perl58 -mLWP::Simple=getprint -e" getprint 'http://xml.amazon.com/onca/xml2?t=webservices-20&dev-t=D3N1ICFCFE4DHV&AsinSearch=1565924193&type=lite&f=xml'"
        <?xml version="1.0" encoding="UTF-8"?><ProductInfo xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="http://xml.amazon.com/schemas2/dev-lite.xsd">
           <Details url="http://www.amazon.com/exec/obidos/redirect?tag=webservices-20%26creative=D3N1ICFCFE4DHV%26camp=2025%26link_code=xm2%26path=ASIN/1565924193">
              <Asin>1565924193</Asin>
              <ProductName>CGI Programming with Perl</ProductName>
              <Catalog>Book</Catalog>
              <Authors>
                 <Author>Gunther Birznieks</Author>
                 <Author>Scott Guelich</Author>
                 <Author>Shishir Gundavaram</Author>
              </Authors>
              <ReleaseDate>15 January, 2000</ReleaseDate>
              <Manufacturer>O'Reilly &amp; Associates</Manufacturer>
              <ImageUrlSmall>http://images.amazon.com/images/P/1565924193.01.THUMBZZZ.jpg</ImageUrlSmall>
              <ImageUrlMedium>http://images.amazon.com/images/P/1565924193.01.MZZZZZZZ.jpg</ImageUrlMedium>
              <ImageUrlLarge>http://images.amazon.com/images/P/1565924193.01.LZZZZZZZ.jpg</ImageUrlLarge>
              <ListPrice>$34.95</ListPrice>
              <OurPrice>$24.47</OurPrice>
              <UsedPrice>$13.99</UsedPrice>
           </Details>
        </ProductInfo>

        I tried the latest version of the script you posted, and it too hung on the read from the socket. I can't see why either.
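
Since LWP::Simple returns the whole document, the field extraction from the posted script can be applied to a single string rather than line by line from a socket. The following is a rough sketch under stated assumptions, not a tested replacement for the script: `extract_book_info` and `decode_xml_entities` are hypothetical helper names, the latter is a core-only stand-in for `decode_entities` from HTML::Entities, and the sample text is a trimmed copy of the output shown above. The regular expressions are the ones from the posted script.

```perl
use strict;
use warnings;

# Hypothetical helper: decode the five predefined XML entities.
# A core-only stand-in for HTML::Entities::decode_entities.
sub decode_xml_entities {
    my ($text) = @_;
    my %entity = (lt => '<', gt => '>', quot => '"', apos => "'", amp => '&');
    $text =~ s/&(lt|gt|quot|apos|amp);/$entity{$1}/g;
    return $text;
}

# Hypothetical helper: pull the four fields the posted script looks for,
# using the same regular expressions, from a complete XML string.
sub extract_book_info {
    my ($xml) = @_;
    my %book = (title => '', date => '', manufacturer => '', authors => []);

    $book{title} = decode_xml_entities($1)
        if $xml =~ m%<ProductName>(.*?)</ProductName>%;
    $book{date} = $1
        if $xml =~ m%<ReleaseDate>.*, (\d\d\d\d)</ReleaseDate>%;
    $book{manufacturer} = decode_xml_entities($1)
        if $xml =~ m%<Manufacturer>(.*?)</Manufacturer>%;

    # /g in scalar context steps through every <Author> element in turn.
    push @{ $book{authors} }, decode_xml_entities($1)
        while $xml =~ m%<Author>(.*?)</Author>%g;

    return \%book;
}

# Trimmed copy of the response shown in the thread.
my $sample = <<'XML';
<?xml version="1.0" encoding="UTF-8"?><ProductInfo>
   <Details>
      <Asin>1565924193</Asin>
      <ProductName>CGI Programming with Perl</ProductName>
      <Authors>
         <Author>Gunther Birznieks</Author>
         <Author>Scott Guelich</Author>
         <Author>Shishir Gundavaram</Author>
      </Authors>
      <ReleaseDate>15 January, 2000</ReleaseDate>
      <Manufacturer>O'Reilly &amp; Associates</Manufacturer>
   </Details>
</ProductInfo>
XML

# With LWP::Simple installed, the text could come from the network instead
# ($url is a placeholder for the request URL):
#   use LWP::Simple qw(get);
#   my $xml = get($url);
#   die "Could not fetch $url" unless defined $xml;

my $book = extract_book_info($sample);
print "$book->{title}\n";
print "$_\n" for @{ $book->{authors} };
print "$book->{date}\n";
print "$book->{manufacturer}\n";
```

This sidesteps the raw socket read entirely, so it does not explain the hang; it only shows one way the same parsing could run on text fetched by other means.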