in reply to Re^2: Allowing user to abort waitpid
in thread Allowing user to abort waitpid

*nixers are a funny bunch.

Ask how to multiplex a few hundred TCP clients transferring gobs of data, and they'll almost universally suggest a select loop or some other polled event mechanism; which, in modern high-speed comms environments, requires polling at millisecond or finer resolution to stay responsive to even tens of clients. And that can consume 60% to 70% of a CPU just polling.

But ask about getting conditional input from the guy sitting at the keyboard, which requires polling no more than once every 1/10th of a second and consumes so little CPU that it won't even show, and they call it hackish.
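
Something like this minimal sketch, for instance (the 30-second child and the abort-on-Enter behaviour are illustrative only):

    use strict;
    use warnings;
    use IO::Select;
    use POSIX qw( WNOHANG );

    my $pid = fork() // die "fork: $!";
    if( $pid == 0 ) {                           ## child: stands in for the real job
        exec 'sleep', '30';
        die "exec: $!";
    }

    my $stdin = IO::Select->new( \*STDIN );
    while( waitpid( $pid, WNOHANG ) == 0 ) {    ## child still running?
        ## Sleep for at most 0.1s waiting for keyboard input; a
        ## line-buffered terminal delivers it when Enter is pressed.
        if( $stdin->can_read( 0.1 ) ) {
            kill 'TERM', $pid;                  ## user input: abort the child
            waitpid( $pid, 0 );
            last;
        }
    }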


With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority". I knew I was on the right track :)
In the absence of evidence, opinion is indistinguishable from prejudice.

Re^4: Allowing user to abort waitpid (select)
by tye (Sage) on Mar 07, 2016 at 22:12 UTC

    select only involves polling when done incorrectly.

    And that can consume 60% to 70% of a CPU just polling.

    Yeah, when I use select, it doesn't burn CPU when waiting.
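
    For instance, a minimal sketch: with no timeout argument, IO::Select's can_read blocks in the kernel until a handle is ready, using no CPU at all while it waits (the port number is arbitrary):

        use strict;
        use warnings;
        use IO::Socket::INET;
        use IO::Select;

        my $listener = IO::Socket::INET->new( Listen => 5, LocalPort => 12345 )
            or die "listen: $!";

        my $sel = IO::Select->new( $listener );
        while( my @ready = $sel->can_read ) {   ## no timeout: block, don't poll
            for my $fh ( @ready ) {
                ## accept/read here; select returns only when work exists
            }
        }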

    - tye        

      Well, select (and poll) do have inherent limitations. On each call to select, the kernel must check every specified descriptor to see whether it is ready, so the cost scales with the number of descriptors. And in a busy application, select can be called very often. I think it's fair to call that 'polling'.

      With epoll and other similar mechanisms, it's only necessary to register all 'interesting' descriptors once, and then, when IO happens on some descriptor, the kernel checks if the application is interested in it. Therefore, performance is determined by the number of IO events.

      If we have a ton of descriptors, but, at any given time, IO actually happens only on some small percent of them, epoll will vastly outperform select, because, yes, it does less 'polling'.
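
      To illustrate the register-once pattern in Perl, a minimal sketch assuming the CPAN IO::Epoll module (a thin wrapper over the Linux calls; epoll isn't in core Perl):

          use strict;
          use warnings;
          use IO::Socket::INET;
          use IO::Epoll;    ## CPAN wrapper over epoll_create/ctl/wait

          my $listener = IO::Socket::INET->new( Listen => 128, LocalPort => 12345 )
              or die "listen: $!";

          my $epfd = epoll_create( 1024 );

          ## Register the descriptor ONCE...
          epoll_ctl( $epfd, EPOLL_CTL_ADD, fileno( $listener ), EPOLLIN ) >= 0
              or die "epoll_ctl: $!";

          ## ...then each wait costs O(ready events), not O(all descriptors).
          while( 1 ) {
              my $events = epoll_wait( $epfd, 64, -1 )    ## -1: block until IO
                  or die "epoll_wait: $!";
              for my $ev ( @$events ) {
                  my( $fd, $mask ) = @$ev;
                  ## dispatch on $fd here
              }
          }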
      Yeah, when I use select, it doesn't burn CPU when waiting.
      But do you have 10k concurrent connections? :)
      select only involves polling when done incorrectly.

      Okay. So the provision of the timeout parameter is just an otherwise redundant trap for the unwary.


      With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      "Science is about questioning the status quo. Questioning authority". I knew I was on the right track :)
      In the absence of evidence, opinion is indistinguishable from prejudice.
Re^4: Allowing user to abort waitpid
by Anonymous Monk on Mar 07, 2016 at 20:04 UTC
    Not a good example; select and poll are indeed slow and not recommended for "a few hundred tcp clients". Use epoll/kqueue/signal-driven IO (well, maybe not signal-driven IO).
      use epoll/kqueue/signal-driven IO

      There are several problems with that advice:

      • They're not available from Perl.
      • They're not POSIX.

        I agree, POSIX is increasingly irrelevant as it isn't a useful representation of *nix any more. (Or for the last two decades for that matter.)

      • epoll doesn't work with timers, signals, semaphores, processes, network devices. Not even disk files.

        Yes, there are the calls signalfd(), eventfd(), and timerfd_create(); but they don't help much.

      • epoll may be O(1) for event notifications, but it remains O(n) for modifications to the interest set.

        Fast-changing, dynamic descriptor sets require many and frequent calls into the kernel for epoll_ctl(), which gets expensive (see the sketch after this list).

      • kqueue addresses most of that, but retains one major flaw in modern systems: Per process interest sets.

        Basically, it hasn't caught up with the advent of the multi-core architectures that are now ubiquitous; which means you can't easily split your loads across those multiple cores by using multiple threads running separate event loops.

        Yes, you can use multiple processes; but that creates the problem of coordination and data sharing between those processes.
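
      To make the epoll_ctl() point concrete, a sketch of the per-connection churn (again assuming the CPAN IO::Epoll wrapper; the echo behaviour and port number are illustrative):

          use strict;
          use warnings;
          use IO::Socket::INET;
          use IO::Epoll;    ## CPAN wrapper; not core Perl

          my $listener = IO::Socket::INET->new( Listen => 128, LocalPort => 12346 )
              or die "listen: $!";
          my $epfd = epoll_create( 1024 );
          epoll_ctl( $epfd, EPOLL_CTL_ADD, fileno( $listener ), EPOLLIN );

          my %conns;
          while( 1 ) {
              my $events = epoll_wait( $epfd, 64, -1 ) or die "epoll_wait: $!";
              for my $ev ( @$events ) {
                  my( $fd, $mask ) = @$ev;
                  if( $fd == fileno( $listener ) ) {
                      my $c = $listener->accept or next;
                      $conns{ fileno( $c ) } = $c;
                      ## one extra syscall per accept, purely for bookkeeping...
                      epoll_ctl( $epfd, EPOLL_CTL_ADD, fileno( $c ), EPOLLIN );
                  }
                  elsif( sysread( $conns{ $fd }, my $buf, 4096 ) ) {
                      syswrite( $conns{ $fd }, $buf );    ## trivial echo
                  }
                  else {
                      ## ...and another per close: with short-lived connections
                      ## this churn, not the IO, dominates.
                      epoll_ctl( $epfd, EPOLL_CTL_DEL, $fd, 0 );
                      close delete $conns{ $fd };
                  }
              }
          }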


      With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      "Science is about questioning the status quo. Questioning authority". I knew I was on the right track :)
      In the absence of evidence, opinion is indistinguishable from prejudice.
        Hey, BrowserUk, some of your objections are just bizarre.
        They're not available from Perl.
        What?
        "epoll doesn't work with timers, signals... yes, there are the calls signalfd(), and timerfd_create()... they don't help much"
        ??? BTW, none of the IO multiplexers (to my knowledge) do anything useful with hard disk files.
        "epoll maybe O(1) for event notifications, but it remains O(n) for modifications to the interest set"
        you found that a problem in practice? Tell me more...
        fast-changing, dynamic descriptor sets require many and frequent calls into the kernel for epoll_ctl(), which gets expensive.
        "fast" compared to what? compared to the speed of CPUs, or compared to the speed of IO that you're presumably doing on these descriptors in between opening and closing them?
        kqueue addresses most of that, but retains one major flaw in modern systems: Per process interest sets.
        Now, I haven't actually used kqueue (since that's a BSD thing). But you said "retains", and that's wrong: you can create as many epoll instances (sets) as you like, and monitor them separately (using threads, if you like). Actually, the kqueue man page says:
        The kqueue() system call fails if:

            [ENOMEM]  The kernel failed to allocate enough memory for the kernel queue.
            [EMFILE]  The per-process descriptor table is full.
            [ENFILE]  The system file table is full.
        Where does it say "you already have one queue in this process and can't get more"?
        Basically, it hasn't caught up with the advent of the multi-core architectures that are now ubiquitous; which means you can't easily split your loads across those multiple cores by using multiple threads running separate event loops.
        Sure you can, if that's what you want to do, for some reason...

        Really, your points look pretty strange to me, except maybe for adding more than one fd at a time to an epoll instance. Well, that looks like it wouldn't be difficult to implement, but it hasn't been, so I guess other people didn't feel the need for it either.