
Re^3: How to maintain a persistent connection over sockets?

by BrowserUk (Patriarch)
on May 02, 2012 at 18:15 UTC ( #968501 )

in reply to Re^2: How to maintain a persistent connection over sockets?
in thread How to maintain a persistent connection over sockets?

I use IO::Socket::UNIX for both the client and server.

That means I am probably the wrong person to be advising you. I don't use Unix domain sockets -- my platform doesn't have them -- and I know very little about them.

One thought that crosses my mind, which I'll say out loud without doing any further research: as far as I'm aware, Unix domain sockets only work within the local box; as such they are not so different from named pipes (FIFOs).

On Windows, named pipes also work across the local domain, and are substantially faster than (INET) sockets.
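For what it's worth, on platforms that support AF_UNIX, a local-only stream pair can even be had without touching the filesystem, via socketpair(). A hypothetical sketch (names illustrative):

```perl
# Hypothetical sketch: a connected pair of Unix-domain stream sockets
# via socketpair() -- no filesystem path, no second process needed.
use strict;
use warnings;
use Socket;
use IO::Handle;

socketpair( my $left, my $right, AF_UNIX, SOCK_STREAM, PF_UNSPEC )
    or die "socketpair: $!";
$_->autoflush(1) for $left, $right;

print $left "ping\n";          # write on one end ...
my $reply = <$right>;          # ... and read it back on the other
print "got: $reply";           # got: ping
```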

I typed this code in, so I doubt it will compile ...

Unfortunately, that means it doesn't tell us anything useful.

If you posted snippets of the real code, we might be able to spot implementation errors, even if they were not complete and runnable; but uncompilable random snippets tell us nothing.

Here are some results: ...

I assume, though you haven't stated it, that your concern is the lower throughput rate for inter-process comms versus intra-process comms.

I'm afraid that is to be expected.

The latter is probably implemented as a simple memory buffer within the single process's address space. As such, the APIs have no need to perform ring-3/ring-0/ring-3 (user space to kernel space) transitions.

The former will be built around a process-shared memory buffer and will involve those user/kernel space transitions for every API call. In addition, they will also require kernel-level locks; and may well involve waiting through multiple task-switches until the appropriate other process -- the client or server process that is the communications partner -- gets a kernel timeslice.

In short: inter-process comms will always carry substantially greater overheads than intra-process (inter-thread) comms. That is inevitable.

So, if I could keep a persistent connection between the server and the client then performance may be better.

There is nothing in the text of your description, nor in the (possibly unrepresentative) code snippets you've posted, that suggests you aren't already using "persistent connections". How/why do you reach the conclusion that you are using transient connections?

In essence, what I'm suggesting here is that you are wrongly attributing "slow comms" to "a lack of persistent connections", without providing any evidence for the latter.
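To illustrate the distinction: a persistent connection means one socket carrying many request/response round trips, with messages delimited in-band (here by newlines) rather than by closing and re-opening the connection. A self-contained sketch, with both ends simulated in one process via socketpair():

```perl
# Sketch: many requests over one persistent, newline-delimited connection.
use strict;
use warnings;
use Socket;
use IO::Handle;

socketpair( my $client, my $server, AF_UNIX, SOCK_STREAM, PF_UNSPEC )
    or die "socketpair: $!";
$_->autoflush(1) for $client, $server;

my @replies;
for my $n ( 1 .. 3 ) {
    print $client "request $n\n";     # same socket, every time
    chomp( my $req = <$server> );     # 'server' reads one request line ...
    print $server "done: $req\n";     # ... and answers with one reply line
    push @replies, scalar <$client>;
}
print @replies;
```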

I think this is one of those (relatively rare) occasions where the X/Y problem raises its head. That is, you are describing what you've wrongly concluded is the source of your problem and are asking how to fix that; rather than describing the actual symptoms that are telling you that you have a problem, and allowing us to assess the possible causes.

From your previous posts and questions I have an idea of the project you are working on; and I know you are breaking new ground in many areas concurrently. Perhaps you need to step back a little, describe the actual requirements of this part of your project, and ask a more general question about the best approach to satisfying them?

With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

The start of some sanity?


Re^4: How to maintain a persistent connection over sockets?
by flexvault (Monsignor) on May 04, 2012 at 17:31 UTC


I took a while to respond. Here is working code showing what I'm trying to improve. ( I looked at named pipes, but those seemed to imply a parent/child situation. ) I used the plain socket code shown below to describe the situation. On the system I'm testing, the cost of building the socket each time more than doubles the total time of the transaction.

...actual requirements of this part of your project and ask a more general question of the best approach to satisfying them?

As usual you're right about that! My current project is to build a pure-Perl NoSQL key/value database to replace Oracle's BerkeleyDB. Caching is critical to database performance, but I haven't found a 'solid' way to share the cache between processes. Those were the performance numbers I showed previously ( single user vs multi-user ). Write performance is acceptable, but the read performance is not. I tried using 'send/recv', 'print/read', etc. without success. The client and server talk once and then hang!

      Any suggestions welcome!

How to use the client/server code: start the server first, then one or more clients. Each client shows the time to open the connection, the total time of the transaction, and the percentage of overhead.


```perl
use strict;
use warnings;
use Time::HiRes qw( gettimeofday usleep );
use Socket;
use Sys::Syslog qw( syslog );   # needed for the syslog() call below

my $server_port = 3000;

# make the socket
socket(Server, PF_INET, SOCK_STREAM, getprotobyname('tcp'));
# so we can restart our server quickly
setsockopt(Server, SOL_SOCKET, SO_REUSEADDR, 1);
# build up my socket address
my $my_addr = sockaddr_in($server_port, INADDR_ANY);
bind(Server, $my_addr)
    or die "Couldn't bind to port $server_port: $!\n";
# establish a queue for incoming connections
# print SOMAXCONN, "\n";
listen(Server, SOMAXCONN)
    or die "Couldn't listen on port $server_port: $!\n";

while ( 1 ) {
    my $WorkName = "FBServer";
    my $cnt = 0;
    my $result = "";
    while ( 1 ) {
        $cnt++;
        my $client;
        $0 = "$WorkName: Waiting for work!";
        accept( $client, Server );
        if ( defined $client ) {
            my $tm = substr(scalar localtime(),4,15);
            $0 = "$WorkName: Working ( $cnt ) $tm";
            my $stime = gettimeofday;
            my $html = "";
            my $Todo = "";
            my $size = 0;
            $0 = "$WorkName: Read Socket ( $cnt ) $tm";
            eval {
                while (<$client>) {
                    $Todo .= "$_";
                    chomp($Todo);
                }
            };
            if ( $@ ) {
                my $err = $@;
                print $client "$err\n";
                syslog('alert', "2 ERROR: $err");
                last;
            }
            if ( $Todo ) {
                chomp($Todo);   ## Debian patch
                print "$Todo ", sprintf("%0.8f", gettimeofday - $stime), "\n";
                print $client "Client $Todo: Thank you!\n";
                close ( $client );
            }
        }
    }
}
exit;
```
```perl
use strict;
use warnings;
use Socket;
use Time::HiRes qw( gettimeofday usleep );

our @Total = ();
our $Totcnt = 500;
our $remote_host = "";
our $remote_port = 3000;

for ( 0..$Totcnt ) {
    my $parms = $$;
    my $result = Call_Multi_User( $parms );
    if ( ! $result ) { print "\t****Nothing Back\n"; }
    print "$result\n";
    usleep 20000;
}

my $open  = $Total[0];
my $total = $Total[1];
my $diff  = ( $total / $open ) - 1;
$open  = $Total[0] / $Totcnt;
$total = $Total[1] / $Totcnt;
print "\n\tTotals:\t\t\t" . sprintf("%0.8f", $open)
    . "\t" . sprintf("%0.8f", $total)
    . "\t+ " . sprintf("%0.3f", $diff) . " %\n\n";
exit;

sub Call_Multi_User {
    my $Todo = shift;
    if ( ! defined $Todo ) { return( "" ); }
    our $server;
    my $stime  = gettimeofday;
    my $answer = "";
#   if ( ! $server )   ## if this worked then we wouldn't have the establish overhead!
    {
        # create a socket
        socket( $server, PF_INET, SOCK_STREAM, getprotobyname('tcp') );
        # build the address of the remote machine
        my $internet_addr = inet_aton($remote_host)
            or die "Couldn't convert $remote_host into an Internet address: $!\n";
        my $paddr = sockaddr_in($remote_port, $internet_addr);
        # connect
        connect($server, $paddr)
            or die "Couldn't connect to $remote_host:$remote_port: $!\n";
        select((select($server), $| = 1)[0]);   # enable command buffering
    }
    my $open = gettimeofday - $stime;
    print $server "$Todo\n";
    shutdown($server, 1);
#   my $no = 0;
    while (<$server>) { $answer .= $_; }
    close $server;
    chomp($answer);
    my $total = gettimeofday - $stime;
    my $diff  = ( $total / $open ) - 1;
    $Total[0] += $open;
    $Total[1] += $total;
    $answer .= "\t" . sprintf("%0.8f", $open)
        . "\t" . sprintf("%0.8f", $total)
        . "\t+ " . sprintf("%0.3f", $diff) . " %";
    return ( $answer );
}
```
Client output:

```
Client 24006: Thank you!    0.00029397  0.00068903  + 1.344 %
Client 24006: Thank you!    0.00030303  0.00076103  + 1.511 %
Client 24006: Thank you!    0.00030303  0.00061202  + 1.020 %
Client 24006: Thank you!    0.00033212  0.00082612  + 1.487 %
Client 24006: Thank you!    0.00037408  0.00084996  + 1.272 %
Client 24006: Thank you!    0.00030398  0.00110912  + 2.649 %

Totals (Average):           0.00021664  0.00051245  + 1.365 %
```

Server output: ( 2 clients running )

```
24006 0.00013709
24007 0.00002718
24007 0.00003099
24006 0.00002503
24007 0.00013399
24006 0.00002623
24006 0.00003099
24007 0.00002503
24006 0.00013494
24007 0.00002503
24007 0.00003099
24006 0.00002503
24007 0.00013399
24006 0.00002599
```

    Thank you

    "Well done is better than well said." - Benjamin Franklin

      That code is remarkable! Also, unfortunately badly wrong, but I'll get back to that.

      You have succeeded in writing a multi-tasking server without using any form of multi-tasking. Neither threading, nor forks, nor polling; nor an event loop!

      It is utterly, utterly amazing. It took me quite a while to understand just how it achieved that. I now understand your thread title!

      It is even more amazing that it achieves the throughput that it does; but is unsurprising that it is not meeting with your expectations.

      Your server is resolutely single tasking. And it is also quite difficult to explain how it manages to give the appearance (and indeed, the actions) of being multi-tasking, in terms of the code alone, so I'm going to resort to an analogy.

      How can you conduct two (or more) 'concurrent' conversations using one phone that has neither call-waiting; nor conferencing facilities?

      The solution is to ask the other parties to disconnect and redial after each snippet of conversation. One person rings, you say "Hello"; they hang up and redial; when you pick up they reply; then hang up and re-dial; this time when you pick up, you reply; and they hang-up and redial; and so on until the conversation is complete.

      And if two or more people follow this procedure, then you will be able to hold 'simultaneous' conversations with all of them. They'll just be very slow and disjointed conversations.

      That is exactly analogous to how your server is "working".

      I am truly surprised at how you arrived at this solution; and totally amazed at how efficiently it actually works. I guess it personifies the old adage about computers being able to do everything very, very quickly; including the wrong thing :)

      Of course, it is unsustainable for an application such as yours. You will need to use some form of multi-tasking.

      This comes in (essentially) 4 forms with the following traits:

      1. Event (select) loop.

        Ostensibly simple; lightweight, and efficient.

        The downside is that all state is global and all communications from all clients go through a single point.

My analogy is having a single telephone and a receptionist who has to respond to every call, perform all the work to satisfy all inbound queries, and relay all outbound information.

        Works well if the inbound queries can be answered immediately with little effort, but falls down when answering a query requires more effort.

        Either every other caller has to wait while the receptionist resolves each query, no matter how long it takes; or the receptionist has to keep interrupting her efforts to resolve the query in order to service other callers.

        The first approach means that many clients will wait a long time, even if their queries are fast, whenever a hard to resolve query gets in before them.

        The second approach means that long queries take even longer, because the work effort to resolve it keeps getting interrupted by new callers.
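A skeletal sketch of that receptionist using IO::Select; the 'callers' are simulated with socketpairs so the example is self-contained (all names are illustrative):

```perl
# Sketch: one select loop multiplexing several client handles.
use strict;
use warnings;
use Socket;
use IO::Handle;
use IO::Select;

my ( @callers, @lines_in );
for ( 1 .. 2 ) {
    socketpair( my $caller, my $line, AF_UNIX, SOCK_STREAM, PF_UNSPEC )
        or die "socketpair: $!";
    $_->autoflush(1) for $caller, $line;
    push @callers,  $caller;
    push @lines_in, $line;
}

my $sel = IO::Select->new( @lines_in );

print { $callers[0] } "alpha\n";        # two 'clients' call in
print { $callers[1] } "beta\n";

my %seen;
while ( scalar( keys %seen ) < 2 ) {
    for my $fh ( $sel->can_read( 1 ) ) { # which handles have data waiting?
        chomp( my $line = <$fh> );
        $seen{ $line } = 1;              # receptionist services each in turn
    }
}
print join( ",", sort keys %seen ), "\n";   # alpha,beta
```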

2. Coroutines.

        I won't discuss this much as I consider it a retrograde step. Like going back in time to Windows 3.0. Only works if everyone cooperates; and they usually do not.

      3. Multi-processing (forking).

        Can be relatively efficient, even for long queries, because each caller gets their own process to respond to them. The downside is, that responder cannot easily communicate with the receptionist, or other responders.

Falls down completely for writes to the shared data, because the child process cannot modify the parent's copy.

Like having a modern automated switchboard where each new caller is routed directly to the next available agent. The trouble is, each agent sits isolated in their own room with only a copy of the data for reference. They can answer read-only queries, but cannot modify the data that the other agents see. And any modifications they do make cannot be seen by the other agents.
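That isolation is easy to demonstrate (a hypothetical sketch; requires a platform with fork()):

```perl
# Sketch: after fork(), each process has its own copy of the data,
# so a child's writes never reach the parent.
use strict;
use warnings;

my %cache = ( answer => 42 );

my $pid = fork() // die "fork: $!";
if ( $pid == 0 ) {             # child: the isolated 'agent'
    $cache{answer} = 99;       # modifies only the child's private copy
    exit 0;
}
waitpid $pid, 0;               # parent waits for the child to finish ...
print "parent still sees: $cache{answer}\n";   # parent still sees: 42
```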

      4. Multi-threading.

        Similar to the above, in that each caller gets their own, dedicated agent, but now all the agents are in the same room and can easily communicate between themselves. They can all make modifications to the shared data; and all can be aware of the modifications made by others.

        The downside -- for a pure Perl, iThreaded implementation -- is that the shared data is (effectively) duplicated for each concurrent client. That makes for extra memory use by the shared data and for slow(ish) inter-agent communications.

The upside (of iThreads) is that only the data that needs to be shared is, and locking is simple (if not exactly fast), which makes it far easier to ensure that the agents don't trample on each other's state accidentally, and removes most, if not all, of the potential for the classical threading nasties.

Perl threading is far simpler than traditional, all-state-shared threading. The penalty you pay for that increased simplicity is in memory and performance.
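A minimal sketch of that sharing under iThreads (assuming a threads-enabled perl build); note that only data explicitly marked :shared crosses the thread boundary:

```perl
# Sketch: a :shared hash visible to, and writable by, every thread.
use strict;
use warnings;
use threads;
use threads::shared;

my %cache :shared;
$cache{answer} = 42;

my $agent = threads->create( sub {
    lock %cache;               # coarse, simple whole-hash locking
    $cache{answer} = 99;       # this write is seen by all threads
} );
$agent->join;

print "main thread now sees: $cache{answer}\n";   # main thread now sees: 99
```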

      My preferred solution for your application would be a combination of two of the four. Specifically, I would run a single phone&receptionist (a select loop) within a thread. That select loop would take care of all the communications with the clients, but would hand off queries to a pool of agents (work threads).

That allows the receptionist to respond immediately to new callers and to inbound queries and modification requests from existing clients, whilst the pool of agents (work threads) takes care of doing the actual work. The pool can be tailored (scaled) to fit the available hardware (number of cores; amount of memory) on a case-by-case basis, whilst being able both to reference and to make modifications to the shared data.

      IMO, this will deliver the best combinations of responsiveness and functionality for your scenario.
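A skeletal sketch of the hand-off side of that hybrid, with Thread::Queue carrying work from the receptionist to a pool of agents (the select loop itself is elided; pool size and names are purely illustrative):

```perl
# Sketch: a receptionist thread feeding a pool of agent threads
# through a pair of Thread::Queue objects.
use strict;
use warnings;
use threads;
use Thread::Queue;

my $work    = Thread::Queue->new;
my $results = Thread::Queue->new;

my @pool = map {
    threads->create( sub {
        # each agent loops, pulling jobs until it sees the poison pill
        while ( defined( my $job = $work->dequeue ) ) {
            $results->enqueue( "done: $job" );   # do the 'work'
        }
    } );
} 1 .. 4;                                        # pool sized to the hardware

# the receptionist (in real code, the select loop) just enqueues jobs:
$work->enqueue( "query $_" ) for 1 .. 8;
$work->enqueue( undef ) for @pool;               # one poison pill per agent

my %done;
$done{ $results->dequeue } = 1 for 1 .. 8;       # gather all the replies
$_->join for @pool;
print scalar( keys %done ), " results\n";        # 8 results
```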

      Give me a few days and I'll get back to you with demonstrations of the 3 main candidates plus my preferred hybrid solution.



          Your server is resolutely single tasking. And it is also quite difficult to explain how it manages to give the appearance (and indeed, the actions) of being multi-tasking, in terms of the code alone, ...

And it has to be. It is the cache server and the single point of all I/O for one environment or class of database(s). All independent processes call this for reading and writing to disk. If it's a write, it updates the cache and adds to a queue to update the record on disk (child process). If it's a read, it checks the cache and returns the cached copy if present, or does the I/O to get the record if it exists.

        Background: The Classic Problem!

Lots of locking has to be used to cache a database between processes. This is exactly where Berkeley DB fails without a separate ( user-provided ) locking mechanism. Berkeley DB uses a 4KB root that is shared by all processes, so you have a built-in race condition.

        So by forcing all cache activity into one process, and only that process locks/unlocks the cache and the related databases, you force an orderly environment. The cache server can have children that lock/unlock the tree that they are working on.

Now the cache is actually a HoH. For instance, you could have 30 databases in one environment with 400 users, and each database and each user has a set of hashes. The user hashes are mostly maintained in the client ( calling ) processes. But the database hashes can grow large, as they hold frequency/activity information as well as data buffers. My first implementation used arrays, but Perl's 'exists' and hashes have given the performance I wanted and needed.
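Roughly, then (names invented purely for illustration), the HoH layout and the exists() hit-test look like:

```perl
# Sketch: cache as a hash-of-hashes, keyed by database then record key.
use strict;
use warnings;

my %cache;                      # $cache{$db}{$key} = $record
$cache{users}{1001} = "alice";

sub cache_get {
    my ( $db, $key ) = @_;
    return exists $cache{$db} && exists $cache{$db}{$key}
         ? $cache{$db}{$key}
         : undef;               # miss: caller falls back to disk I/O
}

print cache_get( "users", 1001 ), "\n";   # alice
```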

        When I run benchmarks for the single-user version, the core is maxed out at 100%. When I benchmark the multi-user, the cores run at 8 to 10%, so I'm not utilizing the cache server as well as I could.

        Just to give you the correct picture.

        Thank you

        "Well done is better than well said." - Benjamin Franklin
