Folks,

I'm looking for suggestions on how to improve the efficiency of a program I use that does non-blocking HTTP I/O, often with 1000+ open sockets. The central action of the program is characterized by the simplified code snippet that follows later. My thanks to liverpole for reminding me of the Perl module IO::Select; I had used it before, but it was not used in this code, which I inherited from the original author.

I suspect that much of the time is spent checking socket availability too often. I was hoping for a way to limit my calls to IO::Select::can_read so that they are made only when there is pending I/O, but I have been unsuccessful in finding such a mechanism. Is there any mechanism to implement the following pseudocode more efficiently than just calling IO::Select's can_read() every time one needs to check whether any socket I/O is pending?
$SIG{INTERUPT_ON_PENDING_SOCKET_IO} = \&ckSockets;
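Short of a real signal, the closest thing I can sketch with IO::Select alone is to give can_read() a non-zero timeout, so the process sleeps inside select() until some handle is readable instead of polling. This is only a minimal sketch, assuming the main loop can afford to wait up to the timeout; the helper name wait_and_read, the callback, and the pipes standing in for HTTP sockets are all made up for illustration and are not from my real code.

```perl
use strict;
use warnings;
use IO::Select;

# Hypothetical helper: wait up to $timeout seconds for readable handles,
# read once from each, and pass the data to $on_data.
# Returns the number of handles that were ready.
sub wait_and_read {
    my ($sel, $timeout, $on_data) = @_;
    # With a non-zero timeout, can_read() blocks until I/O arrives or time runs out.
    my @ready = $sel->can_read($timeout);
    for my $fh (@ready) {
        my $n = sysread($fh, my $buf, 4096);
        next unless defined $n;          # read error: not handled in this sketch
        if ($n == 0) {                   # EOF: stop watching this handle
            $sel->remove($fh);
            close $fh;
            next;
        }
        $on_data->($fh, $buf);
    }
    return scalar @ready;
}

# Illustration only: two pipes stand in for the sockets.
pipe(my $r1, my $w1) or die "pipe: $!";
pipe(my $r2, my $w2) or die "pipe: $!";
my $sel = IO::Select->new($r1, $r2);

syswrite($w2, "hello");
my %got;
my $count = wait_and_read($sel, 0.5, sub {
    my ($fh, $buf) = @_;
    $got{ fileno $fh } = $buf;
});
print "ready: $count\n";
```

Whether this helps depends on whether the rest of the program has other work to do between checks; if it does, the timeout has to be kept short enough not to starve that work.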
Another optimization possibility

Even though we may have 1000+ open sockets, the activity at any one time is sparse. I've been speculating about going back to the bit-vector version of select and scanning the 1000+ bit vector 32 bits at a time. I'm not optimistic about this approach. For all I know, the implementer of IO::Select may already do this.

Highly simplified version of my current code

    use IO::Select;
    ...
    # Check for and process any pending socket input.
    # Avoid stepping on toes by keeping a running list of ready
    # sockets and processing it until empty.
    sub ckSockets {
        # Returns: number of ready sockets, so we can tell activity
        ...
        my @breadys = $io_select_obj->can_read(0);
        foreach my $fd_key (@breadys) {
            ...read and process data from this socket...
        }
    }
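For comparison, here is a rough sketch of the bit-vector idea mentioned above: call the four-argument select directly and skip over 32-bit words of the result that are entirely zero. The helper names build_mask and ready_fds are made up for this example, and the pipes are only stand-ins for sockets. I have not measured whether this is any faster than IO::Select.

```perl
use strict;
use warnings;

# Hypothetical helper: build a select() read mask from descriptor numbers.
sub build_mask {
    my @fds  = @_;
    my $mask = '';
    vec($mask, $_, 1) = 1 for @fds;
    return $mask;
}

# Hypothetical helper: list the descriptors set in $mask,
# testing 32 bits at a time and skipping all-zero words.
sub ready_fds {
    my ($mask) = @_;
    my $pad = (4 - length($mask) % 4) % 4;
    $mask .= "\0" x $pad;                 # pad our copy to whole 32-bit words
    my @ready;
    for my $w (0 .. length($mask) / 4 - 1) {
        next unless vec($mask, $w, 32);   # nothing ready among these 32 fds
        for my $fd ($w * 32 .. $w * 32 + 31) {
            push @ready, $fd if vec($mask, $fd, 1);
        }
    }
    return @ready;
}

# Illustration only: 40 pipes, two of which have data waiting.
my (@readers, @writers);
for (1 .. 40) {
    pipe(my $r, my $w) or die "pipe: $!";
    push @readers, $r;
    push @writers, $w;
}
syswrite($writers[2],  "x");
syswrite($writers[35], "y");

my $rout   = build_mask(map { fileno $_ } @readers);
my $nfound = select($rout, undef, undef, 0.5);   # $rout now holds only the ready bits
my @fds    = ready_fds($rout);
print "ready fds: @fds\n";
```

The caller still needs a map from descriptor number back to the socket object (for example a hash keyed on fileno), which is bookkeeping IO::Select otherwise does for you.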
Looking at the top of the profiled run below, we see that the time appears to be dominated by the calls that test the sockets:
    [root@ibm-blade-blade0 testbuddy]# time dprofpp
    Total Elapsed Time = 1790.060 Seconds
      User+System Time = 1315.990 Seconds
    Exclusive Times
    %Time ExclSec CumulS #Calls sec/call Csec/c  Name
     55.9   736.8 753.00 317833   0.0002 0.0002  IO::Select::can_read
     9.05   119.1 256.27 363021   0.0000 0.0001  BuddyUsers::log
     5.25   69.14 137.12 350999   0.0000 0.0000  tsprint::ts
     5.17   67.97 67.979 350999   0.0000 0.0000  POSIX::strftime
Update:  Updated to correct spelling on IO::Select, add '0' to can_read to reflect actual code.

In reply to Socket IO with large (>1000) numbers of open sockets by Ray Smith
