I've had a go at restructuring your code and simplifying some of the logic.

Whilst I have tested most of this in individual parts and it compiles clean with strict and warnings, I don't have a set of proxies available, nor could I find a convenient server doing push. So, this is effectively untested code.

I believe that by using a buffer of 5000 chars, you could well be receiving more than one page at a time. This will depend upon the size of the pages and the rate at which the server chooses to push them. Without sight of the content you are receiving, it is not possible to do much more, as both the size and the rate could vary between servers or even between pushes from the same server.
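One way to check whether a 5000-char buffer is holding more than one page is to count the delimiters in it. This is only a sketch: count_complete_pages is a name I made up, and the sample buffer is invented data standing in for $Response->content().

```perl
use strict;
use warnings;

# Count how many complete pages (text enclosed by two consecutive
# delimiters) are present in a buffer.
sub count_complete_pages {
    my ($buffer, $delim) = @_;
    my $delimiters = 0;
    my $pos        = 0;
    while (($pos = index($buffer, $delim, $pos)) >= 0) {
        $delimiters++;
        $pos += length $delim;
    }
    # N delimiters enclose at most N-1 complete pages.
    return $delimiters > 1 ? $delimiters - 1 : 0;
}

# Invented sample standing in for $Response->content().
my $delim  = '--THIS_STRING_NEVER_HAPPENS';
my $buffer = join '',
    $delim, "\npage one\n",
    $delim, "\npage two\n",
    $delim, "\npartial pa";

my $pages = count_complete_pages($buffer, $delim);
print "complete pages in buffer: $pages\n";
```

If this regularly reports more than one page per request, the buffer is larger than a single push needs.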

With better information, probably easily obtainable by just watching the pages in a browser with a stopwatch, you could adjust the $UA->timeout and $UA->max_size parameters, in concert with the sleep controlling the main loop, to ensure that you don't miss changes in the pages. That is only worthwhile if catching every change is an important objective.
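As a rough illustration of tailoring max_size once you have some measurements, something like the following might do. The helper name, the 25% headroom and the observed sizes are all my own assumptions, not anything LWP provides; the idea is simply to allow for one page plus the delimiters either side of it.

```perl
use strict;
use warnings;
use List::Util qw(max);

# Hypothetical helper: suggest a max_size from observed page sizes.
# $headroom is a multiplier (e.g. 1.25 for 25% slack).
sub suggest_max_size {
    my ($delim_length, $headroom, @page_sizes) = @_;
    return unless @page_sizes;
    my $largest = max @page_sizes;
    # One page plus its two enclosing delimiters, scaled by the headroom.
    return int( ($largest + 2 * $delim_length) * $headroom );
}

my $delim    = '--THIS_STRING_NEVER_HAPPENS';
my @observed = (1800, 2100, 1950);    # invented values of $second - $first

my $size = suggest_max_size(length $delim, 1.25, @observed);
print "suggested max_size: $size\n";
```

The result would then be passed to $UA->max_size() in place of the fixed 5000.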

As mentioned in the comments, checking the response for an X-Content-Range header would allow you to detect buffer overruns should they occur, but this possibility is probably better avoided than fixed.
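A sketch of that check follows. It works with anything that has a header() method, as HTTP::Response does. FakeResponse is a stand-in I invented so the snippet runs without a server. As far as I know, newer LWP releases flag the same condition with a Client-Aborted header instead, so the sketch looks for both; treat the header names as something to verify against your installed version.

```perl
use strict;
use warnings;

# Return a short reason string if the response looks truncated, else undef.
# Accepts any object offering header($name), as HTTP::Response does.
sub truncation_reason {
    my ($response) = @_;
    for my $name ('X-Content-Range', 'Client-Aborted') {
        my $value = $response->header($name);
        return "$name: $value" if defined $value;
    }
    return;
}

# Hypothetical stand-in for HTTP::Response, for demonstration only.
package FakeResponse;
sub new    { my ($class, %headers) = @_; return bless { %headers }, $class }
sub header { my ($self, $name) = @_; return $self->{$name} }

package main;

my $truncated = FakeResponse->new('X-Content-Range' => 'bytes 0-4999/12000');
my $complete  = FakeResponse->new();

for my $response ($truncated, $complete) {
    my $reason = truncation_reason($response);
    print( defined($reason) ? "truncated ($reason)\n" : "complete\n" );
}
```

In the real loop you would call truncation_reason($Response) right after the is_error() check.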

Hope this is of some benefit to you.

use strict;
use warnings;
use diagnostics;

use LWP::UserAgent;
use HTTP::Request;
use HTTP::Response;
use IO::Handle;

my $Proxy_Username = 'admin';
my $Proxy_Password = 'adminpw';

my @Proxies = (
    "http://proxy1:23651/proxy-proxy1-proxy/bin/sitemon?doit HTTP/1.0",
    "http://proxy2:23651/proxy-proxy2-proxy/bin/sitemon?doit HTTP/1.0",
    "http://proxy3:23651/proxy-proxy3-proxy/bin/sitemon?doit HTTP/1.0",
    "http://proxy4:23651/proxy-proxy4-proxy/bin/sitemon?doit HTTP/1.0",
);

# Better outside the loop: build the requests once.
my @Requests;
for my $URL (@Proxies) {
    my $Request = HTTP::Request->new(GET => $URL);
    $Request->referer("http://wizard.yellowbrick.oz");
    $Request->authorization_basic($Proxy_Username, $Proxy_Password);
    push @Requests, $Request;
}

STDERR->autoflush(1);
STDOUT->autoflush(1);

# No point re-creating $UA each time in the loop.
# The parameters don't change. Keep them outside the loop.
my $UA = LWP::UserAgent->new();
$UA->agent("Mozilla/4.7 [en] (WinNT; I)");
$UA->timeout(15);
$UA->max_size(5000);

my $delim = '--THIS_STRING_NEVER_HAPPENS';

# Why 1==1? (Just 1 would do, but this is better.)
while (sleep 60) {
    # No need to call time(), it's the default.
    # No need to name it if you're only going to print it.
    # As I recently learnt, the scalar is important.
    print scalar localtime(), " --> ";

    # Note: named loop variable; not my $Request = $_ inside.
    foreach my $Request (@Requests) {
        my $Response = $UA->request($Request);

        # Not everyone agrees with this syntax.
        print "something wrong happened contacting ", $Request->uri, "....\n"
            and next if $Response->is_error();

        # Everything went ok.
        my $Content = $Response->content();

        # Rather than break the data into lines and then loop over the lines,
        # break out the piece you want.
        # Rather than use a regex for a static string, use index.
        my $first  = index($Content, $delim, 0) + length $delim;
        my $second = index($Content, $delim, $first);

        # Skip this response if it doesn't hold two delimiters.
        next if $second < 0;

        # I'd be tempted to print out the value of $second - $first, so that
        # you may more closely tailor the max_size parameter. You could also
        # look for a header of X-Content-Range, which the user agent adds if
        # the size of the buffer requested was exceeded
        # (see the LWP::UserAgent docs).
        my $NewContent = substr($Content, $first, $second - $first);

        # Now we have a whole page in $NewContent, we can break just this bit up.
        my @NewContentArray = split /\n/, $NewContent;

        # I'll have to take your word that this does what you need it to do.
        # The value 14 will possibly need adjusting.
        if ($NewContentArray[14] =~ /(>\s)(\d*)(\s<)/) {
            print $2, " ";    # get number of active processes
        }
    }
    print "\n";
}
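Since I couldn't test against a pushing server, the index/substr extraction can at least be exercised on an in-memory string. This is a sketch only: first_page is a name I made up and the sample content is invented. The main thing it shows is that the third argument to substr is a length, so it has to be $second - $first rather than $second.

```perl
use strict;
use warnings;

# Return the text between the first two occurrences of $delim,
# or undef if the buffer does not hold two delimiters.
sub first_page {
    my ($content, $delim) = @_;
    my $start = index($content, $delim);
    return if $start < 0;
    my $first  = $start + length $delim;
    my $second = index($content, $delim, $first);
    return if $second < 0;
    # substr takes a length, not an end offset.
    return substr($content, $first, $second - $first);
}

# Invented sample standing in for $Response->content().
my $delim   = '--THIS_STRING_NEVER_HAPPENS';
my $content = "junk${delim}\nline 0\nline 1\n${delim}\nnext page";

my $page  = first_page($content, $delim);
my @lines = split /\n/, $page;
print "page length: ", length($page), ", lines: ", scalar(@lines), "\n";
```

Note that the leading newline after the delimiter produces an empty first element in @lines, which matters when picking a fixed index such as 14.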

Well It's better than the Abottoire, but Yorkshire!

In reply to Re: Question concerning HTTP::Request and LWP::UserAgent by BrowserUk
in thread Question concerning HTTP::Request and LWP::UserAgent by Bjoern
