Well that's pretty bizarre! So for some reason, attempting to read an incomplete SSL packet takes 3x as long as reading and decoding a packet? Your results point heavily toward a bad design inside libssl itself.

Silly me, sorry I guess I was really tired yesterday... I made a stupid copy-paste mistake so that the "Avg productive sysread time" and "Avg unproductive sysread time" values were wrong for the non-blocking SSL samples. I fixed them... Though it doesn't change the fact that the unproductive sysreads take more time than they should and are responsible for the high CPU usage in my opinion.

The first SSL sample of the three I posted is interesting because there is about the same number of sysread calls by second as the non-SSL sample. It means that during this sample there was close to one (unproductive) sysread for each TCP packet received, as in the non-SSL case. This allows us to compare the time of the unproductive SSL sysread calls to that of the non-SSL sysread calls: 6.33 us versus 2.73 us, more than twice as long for a one-TCP-packet read.


Just to be sure though, what if you make a quick change to your SSLeay.xs to verify that the problem is in SSL_read...

I added your debug traces and I did let run my script for about one minute and a half, redirecting stderr to a file.

First, of course it impacted a bit the metrics produced by my script. For example I no longer have samples with very high sysread call rate and productive sysread ratio < 10%. Here is an example of sample (the most common one) produced by my script when using your patched SSLeay binary:

Interval: 1.00s Avg transfer speed: 117.55 MB/s Nb sysread calls: 52362 Productive sysread size: 16384 B Productive sysread ratio: 13.70% Avg productive sysread time: 21.68 us Avg unproductive sysread time: 13.37 us
(it looks like the second sample I posted previously, with similar sysread call rate and productive sysread ratio, except the timings are a bit higher, which makes sense I guess)

And here is a sample of the output produced by your debug traces:

SSL_read(s, buf, 16384) => 16384 (9 us) SSL_read(s, buf, 16384) => -1 (7 us) SSL_read(s, buf, 16384) => -1 (1 us) SSL_read(s, buf, 16384) => -1 (1 us) SSL_read(s, buf, 16384) => -1 (2 us) SSL_read(s, buf, 16384) => -1 (1 us) SSL_read(s, buf, 16384) => -1 (1 us) SSL_read(s, buf, 16384) => 16384 (10 us) SSL_read(s, buf, 16384) => -1 (6 us) SSL_read(s, buf, 16384) => -1 (2 us) SSL_read(s, buf, 16384) => -1 (1 us) SSL_read(s, buf, 16384) => -1 (1 us) SSL_read(s, buf, 16384) => -1 (5 us) SSL_read(s, buf, 16384) => -1 (6 us) SSL_read(s, buf, 16384) => -1 (1 us) SSL_read(s, buf, 16384) => 16384 (12 us) SSL_read(s, buf, 16384) => -1 (5 us) SSL_read(s, buf, 16384) => -1 (6 us) SSL_read(s, buf, 16384) => -1 (2 us) SSL_read(s, buf, 16384) => -1 (2 us) SSL_read(s, buf, 16384) => -1 (5 us) SSL_read(s, buf, 16384) => 16384 (12 us) SSL_read(s, buf, 16384) => -1 (6 us) SSL_read(s, buf, 16384) => -1 (3 us) SSL_read(s, buf, 16384) => -1 (1 us) SSL_read(s, buf, 16384) => -1 (1 us) SSL_read(s, buf, 16384) => -1 (5 us) SSL_read(s, buf, 16384) => -1 (1 us) SSL_read(s, buf, 16384) => 16384 (12 us) SSL_read(s, buf, 16384) => -1 (6 us) SSL_read(s, buf, 16384) => -1 (2 us) SSL_read(s, buf, 16384) => -1 (2 us) SSL_read(s, buf, 16384) => -1 (2 us) SSL_read(s, buf, 16384) => -1 (1 us) SSL_read(s, buf, 16384) => -1 (1 us)

We can see there is about 6 unproductive reads for 1 productive read at the time the sample was taken. This is the most common pattern from my observation. I found some extreme cases like these ones, but they are very rare:

SSL_read(s, buf, 16384) => 16384 (12 us) SSL_read(s, buf, 16384) => -1 (5 us) SSL_read(s, buf, 16384) => -1 (92 us) SSL_read(s, buf, 16384) => 16384 (14 us) ... SSL_read(s, buf, 16384) => 16384 (9 us) SSL_read(s, buf, 16384) => -1 (5 us) SSL_read(s, buf, 16384) => -1 (85 us) SSL_read(s, buf, 16384) => 16384 (13 us) ... SSL_read(s, buf, 16384) => 16384 (18 us) SSL_read(s, buf, 16384) => -1 (75 us) SSL_read(s, buf, 16384) => 16384 (24 us)

Some poor man's metrics regarding the recorded data:

$ grep -c "=> 16384" debug_ssl2.txt 613635 $ grep -c "=> \-1" debug_ssl2.txt 3779378
61365 productive reads
3779378 unproductive reads
=> ~13.97% productive sysread ratio

That is consistent with what my script is reporting (phew!)

What is NOT consistent though, is the timing for the sysread, which would confirm the problem doesn't come from libssl itself...


In reply to Re^6: IO::Socket::SSL / Net::SSLeay inefficient in non-blocking mode ? by Yaribz
in thread IO::Socket::SSL / Net::SSLeay inefficient in non-blocking mode ? by Yaribz

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post, it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.