checker has asked for the wisdom of the Perl Monks concerning the following question:


This has been solved! I updated the post at the bottom!

Hi, for basic http load testing, I usually use Siege, which is a small and simple C program that will easily saturate the 100mbps link from my test machine to my server. I need to script some more stateful custom load tests now, so I started exploring how to do this in perl. I'm currently just trying to reproduce the simple "fetch a bunch of urls" load test to see how it performs relative to Siege, before adding the more complex stateful stuff.

However, I'm having trouble getting the performance where I want it. Like I said, Siege will saturate the 100mbps link with 25% cpu utilization, but I can't get a perl script to do any better than 34mbps, and I've tried a bunch of variations. I don't care if it uses 100% cpu, I just need it to be able to saturate the link.

Here are some of the perl techniques that I've tried:

# AE + EV + AnyEvent::HTTP
# this is the best one so far, it'll do 34mbps
# it's "only" using 80% cpu
use strict;
use warnings;
use EV;
use AnyEvent;
use AnyEvent::HTTP;

sub response {
    my ($body, $headers) = @_;
    my $size = length $body;
    my $url  = $headers->{URL};
    print "finished $url, $size bytes\n";
}

my @urls = <DATA>;
while (1) {
    my $url = $urls[int(rand(int(@urls)))];
    chomp $url;
    print "start: $url\n";
    http_get $url, \&response;
    EV::loop EV::LOOP_NONBLOCK;
}
__DATA__
some urls...the same ones that the Siege test used
# HTTP::Async
# this one will do 20mbps, fastest invocation for me was
#   perl -w httpasync.pl 0 15
use strict;
use warnings;
use threads;
use threads::shared;
use HTTP::Async;
use HTTP::Request;
use HTTP::Response;

my @urls : shared = <DATA>;
my $numchildren = $ARGV[0];
my $numthreads  = $ARGV[1];
print "num children: $numchildren, threads: $numthreads\n";

my $pid = 0;
for (my $i = 0; $i < $numchildren; ++$i) {
    $pid = fork();
    if ($pid == 0) {
        last;
    }
    else {
        print "forked $pid\n";
    }
}
$pid = $$;
print "pid: $pid\n";

sub fetch {
    my $tid = shift;
    my $async = HTTP::Async->new;
    $async->slots(int(@urls));
    my %idurl = ();
    while (1) {
        if ($async->total_count < $async->slots) {
            my $url = $urls[int(rand(int(@urls)))];
            chomp $url;
            print "start $pid/$tid: $url\n";
            my $id = $async->add(HTTP::Request->new(GET => $url));
            $idurl{$id} = $url;
        }
        if ($async->not_empty) {
            my ($resp, $id) = $async->next_response();
            if ($resp) {
                my $size = length $resp->content;
                my $url  = $idurl{$id};
                print "finish $pid/$tid: $url, $size bytes\n";
                delete $idurl{$id};
            }
        }
    }
}

my @threads;
for (my $i = 0; $i < $numthreads; ++$i) {
    my $thread = threads->create(\&fetch, $i);
    push(@threads, $thread);
}
foreach (@threads) {
    $_->join();
}
__DATA__
some urls...

I also tried threads + LWP::UserAgent, LWP::Parallel, and various combinations of those with fork. All the LWP techniques would max at about 1mbps, so they weren't even close to the above two. This post is long already, but let me know if you want those snippets.
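
A minimal sketch of that threads + LWP::UserAgent variant (placeholder thread count and URLs, not the exact snippet I ran):

use strict;
use warnings;
use threads;
use threads::shared;
use LWP::UserAgent;

my @urls : shared = <DATA>;
my $numthreads = $ARGV[0] // 10;   # placeholder default thread count

sub fetch {
    my $tid = shift;
    my $ua  = LWP::UserAgent->new(timeout => 10);
    while (1) {
        my $url = $urls[int(rand(int(@urls)))];
        chomp $url;
        my $resp = $ua->get($url);
        my $size = length $resp->content;
        print "finish $tid: $url, $size bytes\n";
    }
}

my @threads = map { threads->create(\&fetch, $_) } 0 .. $numthreads - 1;
$_->join() for @threads;
__DATA__
some urls...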

I'm surprised: I figured it would be trivial to saturate 100mbps in perl, since I/O is so much slower than CPU these days that the difference between C and perl shouldn't matter, but I'm definitely struggling to get there. Does anybody have any advice, or see any obvious problems with the above programs?

Thanks,
Chris

Okay, based on the comments, we have a winner. I did two more tests, one of AnyEvent::Curl::Multi, which got to 33mbps, and one of threaded/forked Furl::HTTP (the low level interface), which finally got to 95mbps. Here's the code for reference:

# AnyEvent::Curl::Multi, did about 33mbps
use strict;
use warnings;
use EV;
use AnyEvent;
use AnyEvent::Curl::Multi;
use HTTP::Request;

my $quit = 0;
sub ctrlc { $quit = 1; }
$SIG{INT} = \&ctrlc;

my @urls = <DATA>;

my $client = AnyEvent::Curl::Multi->new;
$client->max_concurrency(100);
$client->reg_cb(response => sub {
    my ($client, $request, $response, $stats) = @_;
    my $url  = $request->uri;
    my $size = length $response->content;
    print "finish: $url, $size bytes\n";
});
$client->reg_cb(error => sub {
    my ($client, $request, $errmsg, $stats) = @_;
    # ...
});

while (!$quit) {
    my $url = $urls[int(rand(int(@urls)))];
    chomp $url;
    print "start: $url\n";
    my $request = HTTP::Request->new('GET', $url);
    $client->request($request);
    EV::loop EV::LOOP_NONBLOCK;
}
__DATA__
urls...

And, the winner:

# winner!
# threads + Furl::HTTP with 10 threads -> 95mbps at 55% cpu
# I also tested fork, which was the same,
# but threads seemed cleaner
use strict;
use warnings;
use threads;
use threads::shared;
use Furl;
use Perl::Unsafe::Signals;

my $quit : shared = 0;
sub ctrlc { $quit = 1; }
$SIG{INT} = \&ctrlc;

my @urls : shared = <DATA>;
my $numthreads = $ARGV[0];
print "num threads: $numthreads, pid: $$\n";

sub fetch {
    my $tid = shift;
    my $furl = Furl::HTTP->new(agent => 'Furl/0.31', timeout => 10);
    while (!$quit) {
        my $url = $urls[int(rand(int(@urls)))];
        chomp $url;
        #print "start $tid: $url\n";
        my ($ver, $code, $msg, $headers, $body) = $furl->get($url);
        my $size = length $body;
        #print "finish $tid: $url, $size bytes\n";
    }
}

my @threads;
for (my $i = 0; $i < $numthreads; ++$i) {
    my $thread = threads->create(\&fetch, $i);
    push(@threads, $thread);
}
UNSAFE_SIGNALS {
    foreach (@threads) {
        $_->join();
    }
}
__DATA__
urls...

Thanks to the Anonymous Monk below for the suggestions!

Replies are listed 'Best First'.
Re: highest performance http load test architecture in perl?
by BrowserUk (Patriarch) on Apr 27, 2011 at 22:03 UTC

    What performance increase do you get if you

    1. change this line in your second program:
      my @urls : shared = <DATA>;

      to

      my @urls = <DATA>;
    2. Restrict the number of concurrent threads to the same number of cores you have?
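
      For point 2, one cheap way to pick that number on Linux (a sketch; it assumes the nproc utility is available, otherwise just hard-code your core count):

      # cap the worker-thread count at the core count (Linux-only sketch)
      chomp( my $numcores = `nproc 2>/dev/null` || 4 );   # fall back to 4 if nproc is missing
      my $numthreads = $numcores;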

    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
Re: highest performance http load test architecture in perl?
by Anonymous Monk on Apr 27, 2011 at 21:39 UTC
    You might also want to try AnyEvent::Curl::Multi and fork+Furl

      We wanted to do something similar (but different). We wanted to stress test a dynamic site and see how many transactions per second per core we could achieve. These tests were for Linux/Unix only, so I'm not sure how to do this with Windows.

      We were able to saturate a 100Mb ethernet interface with "use HTTP::Lite;", but this didn't stress our target system. We then used a perl loop to call 'system "wget . . . /&";' with the urls, and we were able to saturate a 1,000Mb ethernet interface and also stress test the application. All this does is allow each "wget" to run in its own address space and be dispatched by the operating system independently of perl (fork may solve your requirements also). We used 16 urls and called them 1,000 times from the perl script. In our case the url was the same but the parameters sent to the application were different, and the results allowed us to stress test the application.
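
      Roughly, the loop was of this shape (placeholder URLs and wget flags, not our exact command):

      use strict;
      use warnings;

      my @urls = <DATA>;    # e.g. 16 urls, same path but different parameters
      chomp @urls;

      for my $pass (1 .. 1000) {
          for my $url (@urls) {
              # the trailing '&' backgrounds each wget, so the OS schedules it
              # independently of perl; no throttling is done here
              system qq{wget -q -O /dev/null '$url' &};
          }
      }
      __DATA__
      some urls...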

      I don't think your problem is perl, but more that "wget" or "Siege" was optimized to get the url data using all the "bells and whistles" of the operating system.

      Good Luck.

      "Well done is better than well said." - Benjamin Franklin

      Furl::HTTP (the low-level interface to Furl that doesn't create any of the HTTP::* objects) wins with fork/threads. Thanks for the suggestions, and I updated the main post with the code!
Re: highest performance http load test architecture in perl?
by Anonymous Monk on Aug 28, 2015 at 23:16 UTC
    This:
    my @threads;
    for (my $i = 0; $i < $numthreads; ++$i) {
        my $thread = threads->create(\&fetch, $i);
        push(@threads, $thread);
    }
    ...can easily be shortened to more idiomatic Perl:
    my @threads = map { threads->create \&fetch, $_ } 0 .. $numthreads-1;