Re^5: Help designing a threaded service

by BrowserUk (Patriarch)
on Jan 25, 2014 at 05:32 UTC


in reply to Re^4: Help designing a threaded service
in thread Help designing a threaded service

I sheepishly take it from your reply that I can't have a multi-threaded listener setup the way I see multiple forks of a fork-based network server taking connections on one port.

No. It is entirely feasible to do. It's just a completely ridiculous way to design a server.

With forks (*nix), when you have multiple processes all waiting to accept on a shared socket and a client connects, *every* listening process receives the connect.

It is obviously nonsense to have multiple server processes attempting to conduct concurrent communications with a single client, so now those multiple server processes need to arbitrate between themselves in order to decide which of them will pick up the phone.

Envisage an office with a shared extension and a dozen bored people all shouting "I'll get it!" at the top of their voices ... or all of them pretending not to hear it, hoping someone else will be bugged enough by the constant ringing to answer it before they themselves weaken and do so.

In *nix server land the solution is for all of the listening processes to fight over acquiring a global mutex. One wins; the others go back to whatever they were doing before they were so rudely interrupted.
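
To make that concrete, here is a minimal sketch of the pattern (my illustration, nothing anyone posted in this thread): workers serialize accept() behind an flock-based mutex, so only the lock holder picks up the phone. The port, worker count, and lock-file path are arbitrary:

    use strict;
    use warnings;
    use Fcntl qw( :flock );
    use IO::Socket::INET;

    # one listening socket, created before the fork and inherited by all
    my $listener = IO::Socket::INET->new(
        LocalPort => 7777,
        Listen    => 10,
        ReuseAddr => 1,
    ) or die "listen: $!";

    for ( 1 .. 5 ) {
        defined( my $pid = fork ) or die "fork: $!";
        next if $pid;    # parent keeps forking

        # each worker opens its own handle, so the locks are per-process
        open my $mutex, '>', '/tmp/accept.lock' or die "lock: $!";
        while (1) {
            flock $mutex, LOCK_EX;     # fight over the global mutex
            my $client = $listener->accept;
            flock $mutex, LOCK_UN;     # winner releases; a loser acquires next
            next unless $client;
            print {$client} "picked up by worker $$\n";
            close $client;
        }
    }
    sleep;    # parent idles while the workers serve

Without the flock pair, how many of the five workers wake per connection is down to the OS; the arbitration is the whole point.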

Of course, in your scenario, if the client calling back to collect his output happens to randomly connect to a different process than the one he connected to when he made his hostname/command request -- an odds-on favorite scenario -- then he's sh*t outta luck; cos that process has no way of knowing that one of the other processes is gathering output for this client.

You wanna do things the stupid hard way; have fun...


With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

Replies are listed 'Best First'.
Re^6: Help designing a threaded service
by zwon (Abbot) on Jan 26, 2014 at 15:42 UTC
    With forks (*nix), when you have multiple processes all waiting to accept on a shared socket and a client connects, *every* listening process receives the connect.
    That would be horrible, but fortunately it's not true. If multiple processes are waiting for a connection on the same socket, when a client connects, *only one* listening process accepts the connection. Here's a simple example that demonstrates it:
    use 5.010;
    use strict;
    use warnings;
    use IO::Socket::INET;

    # one listening socket, inherited by every child
    my $sock = IO::Socket::INET->new( LocalPort => 7777, Listen => 10 );

    for ( 1 .. 3 ) {
        my $pid = fork;
        unless ($pid) {                 # child: wait for a connection
            my $cli = $sock->accept;
            say "Process $$ accepted connection from " . $cli->peerport;
            print while <$cli>;         # echo whatever the client sends
            exit 0;
        }
    }
    Try connecting to 7777 and you will see that only one process accepts the connection. Hence there's no need for any global mutexes.
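    To see it, fire a few connections at the demo from another shell; each one is announced by exactly one child (port and payload are just for the demo):

        perl -MIO::Socket::INET -e 'my $c = IO::Socket::INET->new("localhost:7777") or die $!; print $c "hello\n"'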

      Hm. The description was based upon the implementation of the nginx server.

      Which states that:

      After the main NGINX process reads the configuration file and forks into the configured number of worker processes, each worker process enters into a loop where it waits for any events on its respective set of sockets.

      Each worker process starts off with just the listening sockets, since there are no connections available yet. Therefore, the event descriptor set for each worker process starts off with just the listening sockets.

      When a connection arrives on any of the listening sockets (POP3/IMAP/SMTP), each worker process emerges from its event poll, since each NGINX worker process inherits the listening socket. Then, each NGINX worker process will attempt to acquire a global mutex. One of the worker processes will acquire the lock, whereas the others will go back to their respective event polling loops.

      Meanwhile, the worker process that acquired the global mutex will examine the triggered events, and will create necessary work queue requests for each event that was triggered. An event corresponds to a single socket descriptor from the set of descriptors that the worker was watching for events from.

      *nix isn't my world, so I'll leave it to you and others to decide whether your observations or the implementation of a widely used and well-tested server is correct here.



        On at least some versions of some Unix systems, multiple processes waiting on the same socket will cause all of them to be awoken, but only the first one to ask will get the connection or data that triggered them to be awoken. Since nginx is setting up "necessary work queue requests" in order to handle the connection coming in, it is useful for only one process to do that. Though I'm not completely convinced that the nginx authors didn't implement this protection out of misunderstanding rather than real need.

        I believe that it is the case that you don't need to worry about this implementation detail at least in most cases.

        My vague memory of one report of this "every process wakes up" "problem" was just noting the wasted resources and that only one of the waiting processes would return from select(2) (or equivalent). I certainly don't expect more than one process to actually return from accept() when many of them are blocked inside an accept() call.
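
        A small experiment (my sketch, nothing tye posted) for anyone who wants to observe this on their own system: four children select() on one inherited listening socket, with the socket non-blocking so a child that loses the accept() race just goes back to waiting. How many "woke up" lines appear per connection is up to the OS:

            use strict;
            use warnings;
            use IO::Select;
            use IO::Socket::INET;

            my $listener = IO::Socket::INET->new(
                LocalPort => 7778,      # arbitrary demo port
                Listen    => 10,
                ReuseAddr => 1,
            ) or die "listen: $!";

            for ( 1 .. 4 ) {
                defined( my $pid = fork ) or die "fork: $!";
                next if $pid;                        # parent keeps forking
                $listener->blocking(0);              # losers of the race mustn't block
                my $sel = IO::Select->new($listener);
                while (1) {
                    next unless $sel->can_read;      # sleep until "readable"
                    print "worker $$ woke up\n";
                    my $client = $listener->accept or next;   # only one wins
                    print "worker $$ got the connection\n";
                    close $client;
                }
            }
            sleep;                                   # parent idles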

        - tye        

        Hm. The description was based upon the implementation of the nginx server.
        The article you linked is factually incorrect (it looks to me like the work of some intern from Zimbra). Nginx workers don't fight for the accept_mutex after they get events from the listening socket; they lock this mutex before they subscribe to events from it (see implementation). The reason is to avoid wasting CPU, not to avoid accepting the same connection in different workers, which won't happen even if you disable this option. Anyway, nginx runs an event loop, and the OP seems more interested in traditional prefork options (otherwise he should look at AnyEvent or Mojolicious instead of Net::Server), which simply block in accept, so it is hardly relevant.

        PS: is it this guy? That's funny.

Re^6: Help designing a threaded service
by Tommy (Chaplain) on Jan 26, 2014 at 20:08 UTC
    You wanna do things the stupid hard way; have fun...

    It's actually out of a desire to avoid doing things the stupid way that I posted my question to the Monastery in the first place. An abundance of problems have been pointed out by several people -- problems that are already solved by existing "wheels" on CPAN that I'd almost certainly be better off not reinventing.

    So my "sane" options are singular: to extend a given stable wheel (Net::Server?) via some sort of data/state sharing mechanism so that when any given listener is presented with a task ID, that it is able to retrieve the output of that task beginning from the last time it was polled. Designing and implementing that will be my biggest challenge. I still want to avoid using a database for this, but may have to fall back on that option.

    As for the stupidity, the overhead of constantly polling the service every N seconds is a necessary evil that I see no way to avoid. The aces up my sleeve are that it's not going to have to expand beyond ~20 concurrent users for the foreseeable future, and it's on a very, very fast LAN. I've already personally seen Net::Server scale well past that kind of load on lesser networks.

    I appreciate your insight BrowserUk, and that of all others who have joined in on the conversation.

     

    Tommy
    A mistake can be valuable or costly, depending on how faithfully you pursue correction
      So my "sane" options are singular: to extend a given stable wheel (Net::Server?) via some sort of data/state sharing mechanism so that when any given listener is presented with a task ID, that it is able to retrieve the output of that task beginning from the last time it was polled.

      Problem is, you've bought into the fallacy that the behemoth that is Net::Server is going to solve some high proportion of your problem. It won't.

      The very essence of your project is the problem of routing multiple discrete client connects back to the sources of their data. Handling the connects (the only bit that Net::Server will do for you) is the easy part. Connecting the appropriate clients to their appropriate data sources is a plumbing problem that Net::Server has no solution to. And using a forking solution for that means you are going to be forced into a multiplexing nightmare.

      Which is just silly when -- by your own statement -- you only need ~20 clients, which takes just 36MB for 21 threads:

      perl -Mthreads=stack_size,4096 -MIO::Socket -E"async{ IO::Socket::INET->new( 'localhost:' . 12300 + threads->tid ); sleep }->detach for 1 .. 20; sleep"

      A bit more for some shared memory and all your routing problems become trivial.
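
      By way of illustration only -- a minimal sketch of that shared-memory idea, not a design anyone posted: with ithreads, one shared hash keyed by task ID is visible from every listener thread, so whichever thread fields a poll can read the output directly (the task name and timings are made up):

          use strict;
          use warnings;
          use threads;
          use threads::shared;

          my %output :shared;    # task ID => accumulated output

          # hypothetical task runner: appends output as it arrives
          threads->create( sub {
              for my $n ( 1 .. 3 ) {
                  { lock %output; $output{'task-42'} = ( $output{'task-42'} // '' ) . "line $n\n"; }
                  sleep 1;
              }
          } )->detach;

          sleep 4;    # give the runner time; then, from any listener thread:
          { lock %output; print $output{'task-42'} // ''; }

      Any listener thread can do that last read; no routing back to a particular process is required.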

      C'est la vie :)



        Your response is exactly why I want to go with threading: because I can solve all my hard problems via shared memory vars, easily reconnecting a request to its output stream. But I have to face some unhappy facts: I don't know enough about threading to know what's necessary to keep the thread memory consumption from ballooning out of control. I also don't know how to handle corner cases that I've yet to identify with the creation of a highly-available multi-threaded network service. And finally, I don't yet know how to gracefully kill off threads that get "stuck" without resorting to a SIGKILL (which isn't exactly a showstopper, but the other issues are).

        The multi-threaded IRC chat bot code that was shared earlier in this discussion is too bare-bones to inspire confidence that extending it could handle the problems I've outlined above. If I venture down this path, I'd need a map and a guide. And frankly, the latter and more essential of the two is hard to come by in *nix land, where forking is king and threading has a bad reputation for, from what I can tell, all stupid reasons.

        Tommy
        A mistake can be valuable or costly, depending on how faithfully you pursue correction
