in reply to POE program running into sporadic segfault

That looks like curl created its own thread (in a non-threaded perl?) and then sent a signal, and Perl's signal handlers ran in the context of curl's thread instead of Perl's main thread. This is definitely a bug of some sort, because Perl's signal handlers must always run in (one of) Perl's threads.

I don't know what exactly the solution is here, because I don't know the inner workings of libcurl, but I suspect you need to force curl's thread to block all signals.

It's an amazing coincidence you would ask this today, because I *just* fixed the same kind of bug last night in my module IO::SocketAlarm. In that code, I was creating a second thread unknown to perl and sending a signal with the intent that Perl's main thread would catch it. It worked on Linux, but on FreeBSD the thread would catch its own signal (using Perl's signal handlers which must be run in the main thread) and die with a segfault, just like your code.

Assuming you don't want to dig into the XS of the module that gives you libcurl, the next-best way to solve the problem is to block all signals in the main thread prior to starting your curl operations, then unblock them afterward. New threads inherit a copy of the signal mask, so any thread curl creates while the signals are blocked will keep them blocked even after you re-enable them in the main thread. (You do need them enabled in the main thread to catch things like SIGCHLD, which wakes POE up to reap child processes.)
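To illustrate the mask inheritance described above, here is a minimal C sketch of the pattern at the pthread level. The helper names (spawn_with_signals_blocked, report_mask, mask_inheritance_demo) are made up for this example and are not taken from libcurl or any Perl module; from Perl code the corresponding call would presumably be POSIX::sigprocmask around the curl startup.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <signal.h>
#include <stddef.h>

/* Thread body: record whether SIGCHLD is blocked in this thread's own mask. */
static void *report_mask(void *arg)
{
    int *chld_blocked = arg;
    sigset_t cur;

    /* With a NULL new set, pthread_sigmask only reads the current mask. */
    pthread_sigmask(SIG_SETMASK, NULL, &cur);
    *chld_blocked = sigismember(&cur, SIGCHLD);
    return NULL;
}

/* Hypothetical helper: block every signal, create the thread (which
 * inherits the fully blocked mask), then restore the caller's mask.
 * Returns 0 on success or an error number. */
int spawn_with_signals_blocked(pthread_t *tid, void *(*fn)(void *), void *arg)
{
    sigset_t all, saved;
    int err;

    sigfillset(&all);
    err = pthread_sigmask(SIG_SETMASK, &all, &saved);
    if (err != 0)
        return err;

    err = pthread_create(tid, NULL, fn, arg);

    /* Re-enable signals in the calling thread whether or not creation worked. */
    pthread_sigmask(SIG_SETMASK, &saved, NULL);
    return err;
}

/* Returns 1 if the new thread saw SIGCHLD blocked while the caller's
 * own SIGCHLD setting is unchanged afterward, 0 if not, -1 on error. */
int mask_inheritance_demo(void)
{
    pthread_t tid;
    sigset_t before, after;
    int thread_blocked = 0;

    pthread_sigmask(SIG_SETMASK, NULL, &before);
    if (spawn_with_signals_blocked(&tid, report_mask, &thread_blocked) != 0)
        return -1;
    pthread_join(tid, NULL);
    pthread_sigmask(SIG_SETMASK, NULL, &after);

    return thread_blocked == 1 &&
           sigismember(&after, SIGCHLD) == sigismember(&before, SIGCHLD);
}
```

Calling mask_inheritance_demo() from a small main() and checking for 1 is enough to try it. The point is the ordering: the mask is changed before pthread_create, so the new thread never runs with signals unblocked.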

There's a chance I'm wrong here if libcurl expects to be able to receive signals in its thread as part of normal operation. If that's the case, I don't know what the solution is, other than maybe moving the libcurl stuff to a separate process.

Edit: Or, try using a threaded perl. I think a threaded perl would have to account for the signal handler running in an arbitrary thread, so when compiled for threads it probably uses appropriate synchronization to deliver the signal.

Update: Looking at my code made me realize I should take my own advice and change the signal mask before starting the thread, to avoid a tiny race condition in which the thread receives a signal before it calls pthread_sigmask.

Re^2: POE program running into sporadic segfault
by etj (Priest) on Aug 29, 2024 at 17:15 UTC
    To add to these excellent points, and possibly to help repro the crash: the signal being sent to the thread will (I believe) almost certainly be SIGALRM, used to implement a timeout (and it wouldn't be maskable from outside, since it's fundamental to how I assume curl works). Therefore, to repro it, you would need a web service that reliably times out. Test::Mojo is very helpful in creating such things.
      My money is still on SIGCHLD, since OP says this seems related to the act of shelling out to tar. SIGALRM is kind of an old-school Unix design, whereas I expect most modern event-driven libraries to use select() or poll() with the built-in timeout parameter. SIGCHLD is still very actively used to break out of one of those blocking poll() sleeps when it's time to reap a child process.
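For anyone who wants to see the SIGCHLD wake-up in isolation, here is a small C sketch, assuming a POSIX system. The function name poll_interrupted_by_child and the 200 ms / 5 s timings are arbitrary choices for the example; it says nothing about what POE or libcurl actually do internally.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <poll.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Empty handler: its only job is to make SIGCHLD interrupt blocking calls. */
static void on_sigchld(int signo)
{
    (void)signo;
}

/* Returns 1 if poll() was cut short by SIGCHLD (EINTR), 0 if it ran to
 * its full timeout, -1 on any other error. Note that this replaces the
 * process-wide SIGCHLD handler. */
int poll_interrupted_by_child(void)
{
    struct sigaction sa;
    struct timespec delay = { 0, 200000000L };  /* 200 ms */
    pid_t pid;
    int rc, saved_errno;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigchld;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGCHLD, &sa, NULL) != 0)
        return -1;

    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: wait briefly so the parent is already inside poll(). */
        nanosleep(&delay, NULL);
        _exit(0);
    }

    /* Parent: no file descriptors, just a 5 second sleep in poll(). */
    rc = poll(NULL, 0, 5000);
    saved_errno = errno;

    waitpid(pid, NULL, 0);  /* reap the child */

    if (rc == 0)
        return 0;
    return (rc == -1 && saved_errno == EINTR) ? 1 : -1;
}
```

If the thread that receives SIGCHLD is not the one sitting in poll(), the sleep is not interrupted, which is part of why it matters which thread ends up handling the signal.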