Perl threads to open 200 http connections

by robrt (Novice)
on Aug 02, 2010 at 06:48 UTC

robrt has asked for the wisdom of the Perl Monks concerning the following question:

Hey, I am using Perl 5.8.8 on Win XP. I'm fairly new to the concept of Perl threads. As part of some network bandwidth testing, I am trying to open 200 HTTP connections to a server and download 200 files (10 MB each) from the server to the client over HTTP. Here is the code:

use threads;

for (my $i = 0; $i < 200; $i++) {
    my $filename = "File-10M-" . $i . ".txt";
    $thread = threads->new(\&sub1, $filename);
}

sub sub1 {
    my $filename = shift;
    system("wget --output-document D:\\kshare\\payload\\$filename http://10.2.1.23/http-path/payload/$filename >nul");
}

__END__

The problem is that it's unable to open more than 120 connections. I have tried $thread->join and $thread->detach as well, but the problem isn't solved. Could anyone please suggest where I'm going wrong, or a better way of doing this in Perl? I have also tried forking, but forking breaks at the 64th connection. Thanks in advance!

 

 


Replies are listed 'Best First'.
Re: Perl threads to open 200 http connections
by BrowserUk (Patriarch) on Aug 02, 2010 at 07:24 UTC

    Change use threads; to use threads ( stack_size => 4096 ); see "Use more threads." for the explanation.

    You might also consider using LWP::Simple::getstore( $url, $file ).
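    To make that concrete, here is a minimal sketch combining the two suggestions (untested; the URL, file names and target directory are simply those from the original post):

    use strict;
    use warnings;
    use threads ( stack_size => 4096 );   # small per-thread stack, as above
    use LWP::Simple;

    my @threads;
    for my $i ( 0 .. 199 ) {
        my $filename = "File-10M-$i.txt";
        push @threads, threads->create( sub {
            # getstore() writes the response body straight to the file
            my $rc = getstore(
                "http://10.2.1.23/http-path/payload/$filename",
                "D:\\kshare\\payload\\$filename",
            );
            warn "$filename: HTTP status $rc\n" unless is_success($rc);
        } );
    }
    $_->join for @threads;   # wait for every download to finish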

      Thanks! I changed the stack size to 4096, and now around 185 connections are established. I tried lowering the stack size to 1024, hoping to see the connection count rise, but it didn't happen.

      I tried LWP::Simple, but I noticed that it copies only the website content (what is visible on the site), not the actual 10 MB files.

      So, when I tried to copy the files using the code below, the script fails after only about 30 threads.
      use strict;
      use LWP::Simple;

      my $path = "http://10.2.1.23/http-path/payload/10MB/";

      for (my $i = 0; $i < 200; $i++) {
          my $filename = "File-10M-" . $i . ".txt";
          my $url  = $path . $filename;
          my $file = "D:\\kshare\\payload\\$filename";
          $thread = threads->new(\&httpcon, $url, $file);
      }
      $thread->join;

      sub httpcon {
          my $url  = shift;
          my $file = shift;
          is_success(getstore($url, $file)) or die "$!\n";
      }
      My aim is to fill the network pipe to almost 400 MB, but opening around 180 files fills only around 70 MB. Can anyone suggest a better way to fill the network bandwidth, please?

        The first thing I notice is that there are things wrong with the code you've posted.

        You have use strict; but $thread is never declared?

        Also, you are attempting to start 200 threads, but only waiting for one to complete.

        Try this:

        #! perl -slw
        use strict;
        use threads ( stack_size => 4096 );
        use threads::shared;
        use LWP::Simple;
        use Time::HiRes qw[ time sleep ];

        our $T ||= 200;

        my $url = ### your url (of the actual file!) here ###;

        my $running :shared = 0;
        my $start = time;

        for ( 1 .. $T ) {
            async( sub {
                { lock $running; ++$running };
                sleep 0.001 while $running < $T;
                my $id = shift;
                getstore( $url, qq[c:/test/dl.t.$id] );
                --$running;
            }, $_ )->detach;
        }

        sleep 1 while $running;

        printf "Took %.3f seconds\n", time() - $start;
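        (Because of the -s switch on the #! line, the thread count can, as far as I can tell, be overridden from the command line, e.g. perl script.pl -T=150; the our $T ||= 200 only supplies a default.)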

        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        "Science is about questioning the status quo. Questioning authority".
        In the absence of evidence, opinion is indistinguishable from prejudice.
Re: Perl threads to open 200 http connections
by sundialsvc4 (Abbot) on Aug 02, 2010 at 16:19 UTC

    I would consider this “stress testing” scenario to be ill-advised and misleading.   If you attempt to launch hundreds of parallel threads, and unless each of them is really doing exactly what you are doing in production, the only thing that you are going to be “measuring” is the poor design of the test.

    First of all, you already know how big The Pipe is.   You know how many megabits or gigabits per second it can take.   Ballpark overhead to be in the area of 25% and figure that you can probably move the rest through the pipe as data ... assuming zero “hops.”

    Next, you can determine how many simultaneous transfers the computer can handle, by working systematically upward in small increments until you see the times begin to degrade exponentially.   This is the “elbow-shaped curve” that always exists; the so-called “thrash point.”   Again as a rule of thumb, step back 25% from that and call it good.

    The next part of your exploration should involve stochastic (statistical) modeling, which may or may not involve Perl.   (There are packages for the open-source analytics system, “R,” which are specifically designed for this.   See http://cran.r-project.org/ and search for “stochastic” or “simulation.”)

    (Heh... if you thought Perl was “engaging, addicting and fun” ...)     :-D

    You know that the request volume may at times exceed the number of worker threads that are processing requests.   (An inbound request queue is, or should be, a basic part of the design.)   Therefore, you are interested in the completion times of the requests, given that this time will include processing time, I/O time, and time spent in the queue(s).

    It is most useful to approach this by establishing goals, then measuring the system’s sustained ability to meet those goals.   For instance, you might stipulate that “95% of all requests must be serviced and returned to the client within 1.0 seconds.”   And you might stipulate that “the standard deviation of request times, which exceed the 1.0 second rule, must not exceed 2.00.”   Then you model the system and see if it passes or fails.   If it consistently fails, then you start looking for bottlenecks.

    Finally, always remember that what you are seeking to do here is “a thing that has already been done, countless times before.”   Take the time to thoroughly study prior art, and documented methods, before you start writing Perl (or any other) code.   I would predict with some certainty that you can, in fact, model the behavior of this system without writing any Perl code at all!

Re: Perl threads to open 200 http connections
by bluescreen (Friar) on Aug 02, 2010 at 13:10 UTC

    You're missing the $thread->join right after the for statement. That's mandatory; otherwise the program exits without waiting for the threads to finish.

    In this case you're creating one thread plus one process for each wget session. One approach I'd try is Coro::LWP or AnyEvent::HTTP for non-blocking I/O; that should scale better than the threaded approach.

      One approach I'd try is to use Coro::LWP

      I'd like to try that. Perhaps you could post a sample?

      That's what I thought. I can't get the examples to work either.
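      For what it's worth, a minimal non-blocking sketch along the AnyEvent::HTTP lines might look like this (untested; the URL, file names and target directory are just the ones from the original post, and $AnyEvent::HTTP::MAX_PER_HOST is raised because the module otherwise keeps only a handful of connections per host):

      use strict;
      use warnings;
      use AnyEvent;
      use AnyEvent::HTTP;

      # allow many parallel connections to the same host (the default is small)
      $AnyEvent::HTTP::MAX_PER_HOST = 200;

      my $cv = AnyEvent->condvar;

      for my $i ( 0 .. 199 ) {
          my $filename = "File-10M-$i.txt";
          my $url      = "http://10.2.1.23/http-path/payload/10MB/$filename";

          open my $fh, '>:raw', "D:/kshare/payload/$filename"
              or die "open $filename: $!";

          $cv->begin;
          http_get $url,
              # stream each chunk straight to disk instead of holding 10 MB in RAM
              on_body => sub { my ($data) = @_; print {$fh} $data; 1 },
              sub {
                  my ( undef, $hdr ) = @_;
                  close $fh;
                  warn "$url: $hdr->{Status} $hdr->{Reason}\n"
                      unless $hdr->{Status} =~ /^2/;
                  $cv->end;
              };
      }

      $cv->recv;   # wait until all 200 downloads have completed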

Re: Perl threads to open 200 http connections
by afoken (Chancellor) on Aug 02, 2010 at 15:49 UTC

    You are creating 200 new instances of cmd.exe, and they in turn create 200 instances of wget. I think you run out of memory or other resources. Also, hammering 200 connections against a single server is not very friendly. Unless you have a very good network connection, this will most likely saturate it.

    forking breaks

    No, forking is not implemented. Perl on Windows has a pseudo-fork that gives you a new interpreter thread for each fork. I think your process runs out of (interpreter) threads. I don't know exactly how exec() is implemented on Windows, but since the Windows API doesn't have exec(), it must be emulated using CreateProcess() and some code that waits for the spawned process to exit.

    I would like to see how "forking breaks". Show the code and the error from $!.

    The funny thing here is that you don't need more than a single process with a single thread to start 200 instances of wget. Just create all of those processes in a loop, making sure not to wait for them until you need to. On Unix, you would fork them and remember the PIDs, then handle SIGCHLD until you have seen all child processes exit. On Windows, you would use system(1, ...) if you don't care about exiting before your children do, or Win32::Process::Create() (from Win32::Process) instead of fork, and poll Win32::Process::Wait() instead of handling SIGCHLD.
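    A rough sketch of that single-process approach on Windows (untested; the wget options, URL and target directory are copied from the original post, and the assumption is that the process ID returned by system(1, ...) can later be blocked on with waitpid, as described in perlport):

    use strict;
    use warnings;

    my @pids;
    for my $i ( 0 .. 199 ) {
        my $filename = "File-10M-$i.txt";
        # system(1, ...) on Win32 starts the command asynchronously and
        # returns its process ID without waiting for it to finish
        my $pid = system( 1, 'wget',
            '--output-document', "D:\\kshare\\payload\\$filename",
            "http://10.2.1.23/http-path/payload/10MB/$filename" );
        push @pids, $pid;
    }

    # block on each child in turn until all downloads have finished
    waitpid( $_, 0 ) for @pids;
    print "all 200 wget processes have exited\n";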

    Alexander

    --
    Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)

      Hi Alex, after 63 connections forking stops, and the following is the error: "Can't fork: Resource temporarily unavailable."

      Below is the code -
      for (my $i = 0; $i < 200; $i++) {
          my $filename = "File-10M-" . $i . ".txt";
          FORK: {
              if ($pid = fork) {
                  next;
              }
              elsif (defined $pid) {
                  # system("wget --output-document D:\\kshare\\payload\\$filename http://10.2.1.23/http-path/payload/10MB/$filename >nul");
                  print "$filename\n";
                  exit;
              }
              elsif ($! eq "EAGAIN") {
                  redo FORK;
              }
              else {
                  print "Cant Fork: $!";
                  exit 2;
              }
          }
      }

        The problem is the way wait is emulated. It uses a Windows API that has a limit of 64 semaphores. Hence you cannot start another process until one of the existing 63 has completed.
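        A hypothetical workaround sketch (untested): keep fewer than 64 children alive at any time, reaping the oldest one before forking the next, so the emulated wait never hits that limit. The file names and URL are those from the original post.

        use strict;
        use warnings;

        my $MAX_KIDS = 60;          # stay safely below the 64 limit
        my @kids;

        for my $i ( 0 .. 199 ) {
            my $filename = "File-10M-$i.txt";

            # at the limit? block on the oldest child before forking again
            waitpid( shift @kids, 0 ) if @kids >= $MAX_KIDS;

            my $pid = fork;
            die "Can't fork: $!" unless defined $pid;
            if ( $pid == 0 ) {      # child: fetch one file, then exit
                system("wget --output-document D:\\kshare\\payload\\$filename "
                     . "http://10.2.1.23/http-path/payload/10MB/$filename >nul");
                exit 0;
            }
            push @kids, $pid;       # parent: remember the child
        }

        # reap whatever is still running
        waitpid( $_, 0 ) for @kids;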


        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        "Science is about questioning the status quo. Questioning authority".
        In the absence of evidence, opinion is indistinguishable from prejudice.
Re: Perl threads to open 200 http connections
by merlyn (Sage) on Aug 03, 2010 at 18:34 UTC
    Why are you trying to reinvent ab?
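    (For reference: ab is ApacheBench, the load-testing tool shipped with Apache httpd. Something along the lines of ab -n 200 -c 200 http://10.2.1.23/http-path/payload/10MB/File-10M-0.txt would fire 200 concurrent requests, though against a single URL rather than 200 different files; the exact invocation here is only a hypothetical illustration.)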

    -- Randal L. Schwartz, Perl hacker

    The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.
