in reply to SIGHUP with Proc::Daemon

I've tried changing the code inside to something simpler

Have you tried setting just a (global) flag variable in the handler, and then doing the reloading elsewhere in the daemon code?  (presumably, you have some loop where you could check the flag...)

Doing a lot of stuff directly in the signal handler sometimes isn't the best idea.
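A minimal sketch of the flag approach, assuming a daemon with some loop that can poll the flag. The names `$reload_requested` and `reload_config` are made up for illustration, and the `kill` to self only simulates an external `kill -HUP <PID>`:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $reload_requested = 0;   # set by the handler, cleared by the main loop
my $reloads          = 0;   # counts reloads, for illustration only

# Keep the handler tiny: just record that a HUP arrived.
$SIG{HUP} = sub { $reload_requested = 1 };

# Hypothetical placeholder; a real daemon would re-read its config file here.
sub reload_config { $reloads++ }

for my $iteration (1 .. 3) {             # stands in for the daemon's while (1) loop
    kill 'HUP', $$ if $iteration == 2;   # simulate an external "kill -HUP <PID>"

    if ($reload_requested) {
        $reload_requested = 0;
        reload_config();
    }

    # ... normal per-iteration work would go here ...
}

print "reloaded $reloads time(s)\n";
```

The point of the design is that the actual reloading runs in ordinary program flow, not inside the signal handler.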

To find out more about why it dies, you might want to attach strace (or the equivalent tool on your platform) to the daemon process before you send it the signal (strace -p <PID>).

Re^2: SIGHUP with Proc::Daemon
by jaandy (Acolyte) on Mar 09, 2012 at 19:06 UTC
    Unfortunately I don't think that will be possible. I'm using zeromq in my daemon so the main loop looks like:
    while (1) {
        my $msg  = $sock->recv();
        my $data = $msg->data();
        unless ($csv->parse($data)) ... etc ...
    }

    The $sock->recv() blocks until there is a new message.

    Hmm... I think I'll try putting zeromq into my simple test app. Maybe it's not the daemon stuff that's causing my problems; maybe it's the zeromq stuff. Thanks for the idea.

    -Andy
      Woot! It's zeromq stuff. My new test app:

      #!/usr/bin/perl
      use common::sense;
      use ZeroMQ ':all';
      use Data::Dumper;

      my $ctx  = ZeroMQ::Context->new();
      my $sock = $ctx->socket(ZMQ_PULL);
      $sock->setsockopt(ZMQ_HWM, 500);
      $sock->bind('tcp://*:5558');

      $SIG{HUP} = sub { print "got hup\n"; };

      while (1) {
          my $msg = $sock->recv();
          print "got message\n";
          print Dumper(\$msg);
      }

      After I SIGHUP it, I get:

      got hup
      got message
      $VAR1 = \undef;


      In the real code, I call $msg->data() next, and I can see how that would kill the script when $msg is undef.

      It's kinda weird that SIGHUP makes $sock->recv() return, but at least I can code around that.

      thanks again,

      -Andy
        It's kinda weird that SIGHUP makes $sock->recv() return

        Actually, in this particular case, it's a plus, because it gives you the opportunity to do the reloading of the config before restarting the recv.
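A rough sketch of how the loop could take advantage of that. To keep it runnable without ZeroMQ, `fake_recv` is a made-up stand-in for `$sock->recv()`: it returns undef to mimic a recv that was interrupted by SIGHUP, and sends the signal itself to simulate that situation. The counters are for illustration only.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $reload_requested = 0;
$SIG{HUP} = sub { $reload_requested = 1 };   # handler only sets the flag

# Hypothetical stand-in for $sock->recv(); undef mimics an interrupted recv.
my @queue = ('a,b,c', undef, 'd,e,f');
sub fake_recv {
    my $item = shift @queue;
    kill 'HUP', $$ unless defined $item;     # simulate the interrupting signal
    return $item;
}

my ($handled, $reloads) = (0, 0);

while (@queue) {                  # a real daemon would loop with while (1)
    my $msg = fake_recv();

    if ($reload_requested) {      # recv came back early: reload before retrying
        $reload_requested = 0;
        $reloads++;               # a real daemon would re-read its config here
    }

    next unless defined $msg;     # nothing received, so don't call ->data() on it

    $handled++;                   # a real daemon would parse $msg->data() here
}

print "handled $handled message(s), reloaded $reloads time(s)\n";
```

The `next unless defined $msg` guard is what prevents the crash described above, and the flag check just before it is where the config reload fits in.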

      System calls that may take a "long time" (such as accept, recv, etc.) are normally interrupted by a signal. If the interrupted call isn't resumed automatically (on some systems it isn't), that might well be the cause of the problem (e.g. subsequent code failing because it doesn't have the expected data...)

      See also signal(7) (section "Interruption of System Calls")
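To see the interruption itself without ZeroMQ, here is a small sketch using a pipe and alarm. It assumes a Unix-like system where Perl installs its handlers without automatic restart, so a blocked sysread returns undef with $! set to EINTR; on a system that restarts the call, this script would simply hang. The variable names are made up for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

pipe(my $reader, my $writer) or die "pipe failed: $!";

my $alarms = 0;
$SIG{ALRM} = sub { $alarms++ };   # tiny handler, just counts the signal

alarm 1;                          # deliver SIGALRM in about one second

# Nothing is ever written to the pipe, so this blocks until the signal arrives.
my $n = sysread($reader, my $buf, 1024);

# undef plus EINTR means "interrupted by a signal", not a real read error.
my $interrupted = (!defined $n && $!{EINTR}) ? 1 : 0;

print $interrupted
    ? "sysread was interrupted by a signal\n"
    : "sysread returned without EINTR\n";
```

Code that treats every undef return as valid data (or as a fatal error) will misbehave here, which is why checking for the interrupted case before using the result matters.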