in reply to Re: Threads and HTTPS Segfault
in thread Threads and HTTPS Segfault

Coro doesn't scale across multiple processors, which would hurt performance. I suppose I could move the HTTPS requests from the child threads into a Coro thread in the parent, but I'd need some way to send data from the children to that Coro thread.
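
Something like this is what I have in mind for the handoff, just as a sketch (the Coro side is omitted, and the URLs and thread count are made up):

#! perl -slw
use strict;
use threads;
use Thread::Queue;

# Children push work onto a shared queue instead of making the
# HTTPS requests themselves.
my $results = Thread::Queue->new;

my @kids = map {
    threads->create( sub {
        my $id = shift;
        # ... produce a URL (or request data) somehow ...
        $results->enqueue( "https://example.com/item/$id" );
    }, $_ );
} 1 .. 4;

$_->join for @kids;
$results->enqueue( undef );    # end-of-work marker

# The parent (where the Coro thread would live) drains the queue.
while ( defined( my $url = $results->dequeue ) ) {
    print "parent got: $url";    # the Coro thread would fetch this
}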

Re^3: Threads and HTTPS Segfault
by BrowserUk (Patriarch) on Aug 21, 2011 at 20:05 UTC

    If you cannot fix the problem and need bidirectional communication with the kids, then you could do worse than to use a queue talking to threads that run lwp-request via piped opens. The HTTPS sessions run in separate processes, but you fetch the data back into the parent process, where you can coordinate between them.
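
    For example, each thread might do something like this (a sketch only, error handling kept minimal; $url would come off the queue):

    # Run lwp-request (from libwww-perl) in a child process and
    # read the fetched body back through the pipe.
    open my $pipe, '-|', 'lwp-request', '-m', 'GET', $url
        or die "piped open failed: $!";
    my $content = do { local $/; <$pipe> };    # slurp the response body
    close $pipe;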


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.

      I tried this (using open and curl) and performance took too big a hit. The CPU was at about 15% user and 85% system, and I could only get about 40 req/s; I need to be around 70 to 100, which I can achieve when running many disconnected non-threaded scripts.

      Starting up a process for each request is unfortunately too expensive. Thank you for the suggestion, though.

        Starting up a process for each request is unfortunately too expensive

        Then I would write one script that takes URLs from stdin and does the fetches.

        #! perl -slw
        use strict;
        use LWP::Simple;

        while( <> ) {
            chomp;
            my $content = get $_;
            ### do something with it.
        }

        And then drive multiple copies of that script from a threaded script:

        #! perl -slw
        use strict;
        use threads;
        use Thread::Queue;

        sub worker {
            my $Q = shift;
            # '|-' (not '-|') so we write to the child script's stdin
            open my $pipe, '|-', q[perl theOtherScript.pl]
                or die $!;
            while( my $url = $Q->dequeue() ) {
                print $pipe $url;    # -l on the shebang supplies the newline
            }
            close $pipe;
        }

        our $THREADS //= 4;

        my $Q = Thread::Queue->new;
        my @workers = map threads->create( \&worker, $Q ), 1 .. $THREADS;

        $Q->enqueue( $url ) while ...;    ## fetch urls from somewhere and Q them
        $Q->enqueue( (undef) x $THREADS );

        $_->join for @workers;
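
        You would run the driver as, say, perl driver.pl -THREADS=8 to change the worker count (driver.pl being whatever you call this second script); the -s on the shebang line is what turns -THREADS into the $THREADS variable.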

Re^3: Threads and HTTPS Segfault
by juster (Friar) on Aug 21, 2011 at 20:41 UTC

    You say in your original post that...

    Most of the time in each thread is spent making the HTTPS request ...

    If this is true, you shouldn't lose much performance by using an event framework and/or Coro... depending on what you mean by "most", I guess...

      Is that true for HTTPS connections?

      I thought SSL connections imposed a fairly high CPU load because of the decryption requirements. Even on a pretty low-bandwidth connection, CPU usage can quickly become the limiting factor for throughput. Start trying to decode multiple concurrent streams on the same processor (as with Coro, POE and other event-driven architectures), and CPU will definitely become the limiting factor for throughput.



        I will have to respectfully disagree with you, BrowserUk. Delays from SSL/TLS on modern hardware are hardly noticeable nowadays. Your network latency would have to be incredibly low for decryption to take longer than fetching the data. My idea is that a single processor could decrypt data for one connection while waiting on the network for another, by using an event framework. As long as network latency is greater, hopefully much greater, than the time decryption takes, performance should not suffer.

        From my own experimenting, the prohibitive delay with HTTPS comes from the handshake that begins the encrypted connection: the client and server exchange messages back and forth, each subject to network latency! Taking advantage of persistent HTTP/1.1 connections is practically a necessity.
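
        With LWP, for instance, that is just a matter of turning on the connection cache. A minimal sketch (the URLs are made up):

        use strict;
        use warnings;
        use LWP::UserAgent;

        # keep_alive gives the agent a connection cache, so repeated
        # requests to the same host reuse one TCP/TLS connection instead
        # of paying the handshake cost every time.
        my $ua = LWP::UserAgent->new( keep_alive => 1 );

        for my $i ( 1 .. 10 ) {
            my $res = $ua->get( "https://example.com/item/$i" );
            print $res->code, "\n";
        }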

        I made a benchmark to check my theory. The script is admittedly funky and limited. The experiment uses EV, AnyEvent, and AnyEvent::HTTP to see whether CPU would be a limiting factor, based on the idea that switching between http and https would show noticeable differences. Relearning AnyEvent took me a while and I wasted a lot of time on this today, but maybe someone will find it useful for stress-testing or experimenting.
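
        The core of it is no more than this (a stripped-down sketch, not the actual script; the URL list is made up):

        use strict;
        use warnings;
        use EV;                 # event loop backend
        use AnyEvent;
        use AnyEvent::HTTP;

        my @urls = map "https://example.com/item/$_", 1 .. 100;

        my $cv = AE::cv;
        for my $url ( @urls ) {
            $cv->begin;
            # All requests are in flight at once on a single core; while one
            # connection waits on the network, another can be decrypting.
            http_get $url, sub {
                my ( $body, $hdr ) = @_;
                printf "%s => %s\n", $url, $hdr->{Status};
                $cv->end;
            };
        }
        $cv->recv;    # block until every request has finished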