in reply to Sockets: TMTOWTDI (BWWIB)?

The question 'which server model should I use?' is a complicated one, and I doubt you will get a consistent answer from different people about which way is best.

For the most basic overview:

  1. One process per connection: Medium capacity. Simple to implement and maintain. On UNIX platforms, data can be shared between clients in a restricted manner using shared memory segments. Another alternative would be a database backend. The active connection limit is restricted by the maximum number of processes that can run on the system at one time. If one process dies due to an implementation error, the other processes remain active and unaffected.
  2. One thread per connection: Medium->Large capacity. Data is easily shared between clients. The active connection limit is restricted by the maximum number of threads that can run on the system at one time. If Perl ITHREADS are used, special consideration should be taken to understand the cost of creating a thread (Perl data structure copying, etc.). The cost can be minimized by using a thread pool instead of creating a new thread for each connection.
  3. One process, switched connections: Medium->Large capacity if implemented correctly. Correct implementation is difficult, as each connection must be managed by a state machine-like device, event handlers need to be broken into small pieces, and some events need to be prioritized. Data is easily shared between clients. Compared to the other models, implementation errors have the greatest chance of affecting all active connections.
  4. Multiple processes, connection loop: This is the model used by the Apache web server. The principle is that multiple processes wait on accept(), and the first one to succeed gets the connection and handles it to completion. An outer process ensures that sufficient processes are waiting on the listening socket. This model is very resistant to most implementation errors, and is very efficient. Where other models require multiple system calls to take on a connection (accept()/fork(), accept()/pthread_create(), select()/accept()), this model requires only one system call: accept(). Client data must be shared as per the 'One process per connection' model.
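
To make model 4 concrete, here is a minimal sketch of a pre-forking echo server. It is an illustration under assumptions, not how Apache does it: the function names (make_listener, worker_loop, spawn_worker, run_server), the listen backlog of 128, the default of 4 workers, and the line-echo "protocol" are all made up for the example. Each worker blocks in accept() on the shared listening socket, and the outer process only replaces workers that exit.

```perl
#!/usr/bin/perl
# Pre-forking echo server: a sketch of model 4, not production code.
use strict;
use warnings;
use IO::Socket::INET;

# Open a TCP listening socket on the loopback interface.
sub make_listener {
    my ($port) = @_;
    my $listener = IO::Socket::INET->new(
        LocalAddr => '127.0.0.1',
        LocalPort => $port,
        Proto     => 'tcp',
        Listen    => 128,      # arbitrary backlog chosen for the example
        ReuseAddr => 1,
    ) or die "cannot listen on port $port: $@\n";
    return $listener;
}

# Worker body: block in accept() and handle each connection to completion.
sub worker_loop {
    my ($listener) = @_;
    while (my $client = $listener->accept) {
        while (defined(my $line = <$client>)) {
            print {$client} "[$$] $line";   # echo, tagged with the worker pid
        }
        close $client;
    }
    exit 0;
}

# Fork one worker; returns the child's pid in the parent.
sub spawn_worker {
    my ($listener) = @_;
    my $pid = fork();
    die "fork failed: $!\n" unless defined $pid;
    worker_loop($listener) if $pid == 0;    # never returns in the child
    return $pid;
}

# Outer process: start the workers, then replace any worker that exits.
sub run_server {
    my ($port, $workers) = @_;
    my $listener = make_listener($port);
    spawn_worker($listener) for 1 .. $workers;
    while ((my $pid = wait()) > 0) {
        warn "worker $pid exited, respawning\n";
        spawn_worker($listener);
    }
}

# Usage: perl prefork.pl PORT [WORKERS]
run_server($ARGV[0], $ARGV[1] || 4) if @ARGV;
```

A bug that kills one worker costs only the connection that worker was handling, which is the resistance to implementation errors described in item 4. A real server would also want signal handling, a cap on respawn rate, and a limit on requests per worker.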

In terms of portability, all of these models should work, to differing degrees. My personal preference is the 'One process, switched connections' model; however, due to its complexity, I would not recommend it to anyone unfamiliar with the issues involved. If you have never used non-blocking I/O, I recommend that you not consider this model.
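
For a feel of what 'One process, switched connections' involves, here is a minimal sketch built on IO::Select. The names service_once and run_server are hypothetical, and the echo behaviour is chosen only because it needs no per-connection state. A real server would keep a state machine and an output queue per connection, which is where most of the difficulty lies.

```perl
#!/usr/bin/perl
# Single-process select() loop: a sketch of model 3, not production code.
use strict;
use warnings;
use IO::Socket::INET;
use IO::Select;

# One pass of the event loop: wait up to $timeout seconds (undef = forever),
# then service every readable handle. Returns the number of handles serviced.
sub service_once {
    my ($listener, $select, $timeout) = @_;
    my @ready = $select->can_read($timeout);
    for my $fh (@ready) {
        if ($fh == $listener) {
            # New connection: accept it and watch it for input.
            my $client = $listener->accept or next;
            $client->blocking(0);
            $select->add($client);
            next;
        }
        my $n = sysread($fh, my $buf, 4096);
        next if !defined $n && $!{EAGAIN};   # spurious wakeup, try again later
        if (!$n) {                           # EOF (0) or a real error (undef)
            $select->remove($fh);
            close $fh;
            next;
        }
        # Simplification: a full server must queue whatever syswrite()
        # could not send and wait for the socket to become writable.
        syswrite($fh, $buf);
    }
    return scalar @ready;
}

sub run_server {
    my ($port) = @_;
    my $listener = IO::Socket::INET->new(
        LocalAddr => '127.0.0.1',
        LocalPort => $port,
        Proto     => 'tcp',
        Listen    => 128,      # arbitrary backlog chosen for the example
        ReuseAddr => 1,
    ) or die "cannot listen on port $port: $@\n";
    my $select = IO::Select->new($listener);
    service_once($listener, $select, undef) while 1;
}

# Usage: perl select_loop.pl PORT
run_server($ARGV[0]) if @ARGV;
```

Note that every handler runs inside the one loop, so a handler that blocks or dies stalls or kills every connection at once. That is the trade-off against the process-based models.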

This description is far from complete, but it should provide you with rough expectations. Notice that I did not label any of these solutions 'Large capacity'. Any implementation that truly needs 'Large capacity' should not be written in Perl. By 'Large capacity', I mean 10000+ connections per minute sustained.

WIN32 NOTE: Take into account that fork() is implemented using Perl ITHREADS under WIN32. Therefore, although all multiple process models should function under WIN32, it is probably better to consider implementing the solutions using Perl threads to avoid the 'emulation' layer.

Re^2: Sockets: TMTOWTDI (BWWIB)?
by Ionizor (Pilgrim) on Dec 17, 2002 at 05:26 UTC

    Regarding familiarity with the issues involved in the switched connections model, do you think the books that were recommended would be a sufficient teacher? At this point, the switched connections model is the most appealing to me.

    The server won't be handling 10000+ connections. There's a way to scare me to wakefulness. Heh.

      I am positive that the books mentioned will be useful. Many people have recommended them to me over the years.

      For myself, I found reading books or articles that describe the concepts to be useful, and a good head start, but I found the actual experience of determining the problems in my own implementations to be more valuable in the long run. If you have the time, try implementing both, and then optimizing each to the best of your ability. Subject your code to a peer review either here, or within another Perl community, or people who work with you. You won't lose from the experience.

      If you don't have the time, I would still recommend either one of the process models, or the thread models, over switched connections. Since you are going to be using WIN32, the thread model is probably best.

      It took three generations to get my current pure-Perl event loop and socket management code to the level it is at now. With standard Intel hardware of yesteryear (400-800 MHz, single CPU), it is able to handle 1000+ active connections in 2 seconds. These three generations represent several months of work (at least a few weeks of solid work mixed with odd complaints regarding production environment behaviour or misbehaviour). This is why I recommend against it. If you still want to pursue this path, you may cut some corners by using an existing event loop such as the one used by the Tk module.

      I apologize, but I am not able to release the code I speak of at the current point in time. It is owned by my employer, and all that... I am more than happy to comment on code that you submit, though.

        I figured the answer was something along those lines - the books will help but they can't save me from problems in my implementation.

        I have plenty of time for this so I'm going to pursue the select() method for now. This is my own personal pet project so there are no deadlines and no production environment concerns. At the very least it will crash less than the MUD codebase I've tinkered with in the past - it once segfaulted three times in a row before starting up and I hadn't even changed anything yet. Corner cutting goes against the grain for me so I think I'll just write it from scratch. Perfectionism strikes again.

        I figured the best way to learn was to hold my breath and jump in with both feet. This approach has worked for me so far, so why not now? :)

        No need to apologise for not releasing code - I certainly understand and the experience will be good for me anyway. Actually, even if you could release the code, I'm not sure I would look at it - this program is as much about learning the ins and outs of network programming as it is about creating an interesting / useful project.