Hello thelonious monks!

I have an issue that I'm having a terrible time figuring out. I have a small server app that listens on a socket and feeds a list of server names to a client that connects (this is internal to my network). The application retrieves it's list of servers to serve from another perl script that does some health testing on the allowable machines and returns an ordered list to this server app.

Ok, so it is all working really well, except for one little thing. The program that does the checking causes a little load on the servers that it checks, and this "server server" can get a pretty good load up on it. I decided that it was necessary to make the "server server" cache the content that it is serving for up to 10 seconds, and to refresh it's list if it gets a request where the data is more than 10 seconds stale.

So, this works for some of the listening servers, but not all. The "last_update_time" is stored at a global level for the server (although I've tried many ways of setting it, including passing a reference to it) and there is one or more servers that do not update the time no matter what I've tried. There is extensive logging going on, and it is very difficult to tell which is the parent and which is the child and which is not handling the requests as the pid changes with each request (I don't fully understand the module implementation, but it appears to be very apache-like in nature).

Either way, here is the server code:

#!/usr/local/bin/perl -w package speak_db_names; use Net::Server::PreForkSimple; @ISA = qw{Net::Server::PreForkSimple}; use strict; use Time::Format qw{%time}; use Data::Dumper; use File::Basename; my $basedir = dirname($0); my $refresh_time = 10; my $last_server_update; my $servers = [ ]; $servers = load_data($servers, undef); speak_db_names->run(conf_file => 'speak_db_names.conf'); ## sub process_request {{{ sub process_request { my $self = shift; $servers = load_data($servers, $self); eval { local $SIG{ALRM} = sub { die "Timed Out" }; my $timeout = 2; my $previous_alarm = alarm($timeout); print join( ',', @$servers); alarm($timeout); } } ## }}} ## sub load_data {{{ sub load_data { my $servers = shift; my $self = shift; unless ($last_server_update) { $last_server_update = 0 } if ( (time - $last_server_update) > $refresh_time ) { Net::Server::write_to_log_hook($self, 3, 'DB Cache expired: ' . (time - $last_server_ +update) . ' Seconds Old... Relo +ading'); Net::Server::write_to_log_hook($self, 3, 'Now: ' . time . " Last Upd: $last_ser +ver_update"); my $ret = `$basedir/check_database_health.pl .1 2>>/var/log/ch +eck_db_names.log`; $last_server_update = time; @$servers = split(/\t/, $ret); return $servers; } else { Net::Server::write_to_log_hook($self, 3, 'Reusing DB Cache ' . (time - $last_server_ +update) . ' Seconds old'); return $servers; } } ## }}} 1;
Here is the config file for the Net::Server::PreForkSimple:
min_spare_servers 1 min_servers 2 max_spare_servers 2 max_servers 3 max_requests 1000 check_for_dead 120 log_file /var/log/speak_db_names.log log_level 3 pid_file /tmp/speak_db_names.pid background 1 port 3307 child_communication 1
and here is a snippet of log file showing the issue:
Starting "1" children Starting "1" children 2005/11/28-19:43:35 CONNECT TCP Peer: "172.17.10.57:36685" Local: "172 +.17.10.75:3307" DB Cache expired: 474444 Seconds Old... Reloading Now: 1133225015 Last Upd: 1132750571 2005/11/28-19:43:36 CONNECT TCP Peer: "172.17.10.57:36688" Local: "172 +.17.10.75:3307" DB Cache expired: 474445 Seconds Old... Reloading Now: 1133225016 Last Upd: 1132750571 2005/11/28-19:43:36 CONNECT TCP Peer: "172.17.10.57:36689" Local: "172 +.17.10.75:3307" DB Cache expired: 474445 Seconds Old... Reloading Now: 1133225016 Last Upd: 1132750571 2005/11/28-19:43:36 CONNECT TCP Peer: "172.17.10.57:36690" Local: "172 +.17.10.75:3307" Reusing DB Cache 1 Seconds old 2005/11/28-19:43:36 CONNECT TCP Peer: "172.17.10.57:36691" Local: "172 +.17.10.75:3307" Reusing DB Cache 0 Seconds old 2005/11/28-19:43:37 CONNECT TCP Peer: "172.17.10.57:36692" Local: "172 +.17.10.75:3307" Reusing DB Cache 1 Seconds old 2005/11/28-19:43:37 CONNECT TCP Peer: "172.17.10.57:36693" Local: "172 +.17.10.75:3307" Reusing DB Cache 1 Seconds old 2005/11/28-19:43:38 CONNECT TCP Peer: "172.17.10.57:36694" Local: "172 +.17.10.75:3307" Reusing DB Cache 2 Seconds old Starting "1" children Starting "1" children 2005/11/28-19:43:38 CONNECT TCP Peer: "172.17.10.57:36695" Local: "172 +.17.10.75:3307" Reusing DB Cache 2 Seconds old 2005/11/28-19:43:39 CONNECT TCP Peer: "172.17.10.57:36696" Local: "172 +.17.10.75:3307" Reusing DB Cache 3 Seconds old 2005/11/28-19:43:39 CONNECT TCP Peer: "172.17.10.57:36697" Local: "172 +.17.10.75:3307" Reusing DB Cache 3 Seconds old 2005/11/28-19:43:40 CONNECT TCP Peer: "172.17.10.57:36698" Local: "172 +.17.10.75:3307" Reusing DB Cache 4 Seconds old 2005/11/28-19:43:40 CONNECT TCP Peer: "172.17.10.57:36699" Local: "172 +.17.10.75:3307" Reusing DB Cache 4 Seconds old 2005/11/28-19:43:40 CONNECT TCP Peer: "172.17.10.57:36700" Local: "172 +.17.10.75:3307" DB Cache expired: 474449 Seconds Old... Reloading Now: 1133225020 Last Upd: 1132750571 2005/11/28-19:43:41 CONNECT TCP Peer: "172.17.10.57:36701" Local: "172 +.17.10.75:3307" DB Cache expired: 474450 Seconds Old... Reloading Now: 1133225021 Last Upd: 1132750571

As you can see, the instances that do not update the time have never updated it from the initial load of the daemon. The others update it all the time.

Please help! It seems that more often than not the server that does not update the time gets hit, which causes both a strain on the machines being checked as well as a longer connection time waiting for the check to occur.

Thanks so much for your help!

-Hubb0r


In reply to Net::Server::PreForkSimple and Children by hubb0r

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post, it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.