PerlMonks  

what causes a segmentation violation

by fhew (Beadle)
on May 03, 2007 at 19:56 UTC

fhew has asked for the wisdom of the Perl Monks concerning the following question:

My latest app is a forking TCP server/client. It runs fine and does its thing. So I put it into a shell loop, and every once in a while I see it die (at or near the end of processing, where it's exiting) and say:

"Segmentation fault"

My guess is it's happening after my app says exit, while Perl is doing its cleanup. But I'd rather never see the error message. ISTR that this isn't the first app of mine that randomly emits this message on shutdown, but this time around it's annoying me. How can I find out where/why it's dying?

B.T.W. perl -v says: "This is perl, v5.8.8 built for i386-linux-thread-multi"

Replies are listed 'Best First'.
Re: what causes a segmentation violation
by BrowserUk (Patriarch) on May 03, 2007 at 22:26 UTC

    Suggestion:

    Try to isolate when it happens, i.e. add a warn statement just before your exit. (Use warn rather than print, or disable buffering on STDOUT.)

    If that is printed before you see the segfault, it is probably happening when perl is trying to free some memory.(*)

    One way to verify that is to use POSIX::_exit() instead of exit. That will bypass perl's cleanup of memory. If that avoids the segfault, don't stop there. Try and isolate the cause further.
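    A minimal sketch of that check, assuming a hypothetical helper name (exit_without_cleanup is not from the original program). It combines the warn from the first step with POSIX::_exit(), which ends the process without running perl's global destruction. Note that END blocks and DESTROY methods are skipped as well, and buffered output is not flushed.

```perl
use strict;
use warnings;
use POSIX ();

# Hypothetical helper: announce shutdown, then leave the process without
# running perl's global destruction (END blocks and DESTROY are skipped too).
sub exit_without_cleanup {
    my ($status) = @_;
    warn "Exiting\n";          # goes to STDERR, which perl does not buffer
    POSIX::_exit($status);     # immediate process exit; nothing after this runs
}
```

    Calling exit_without_cleanup(0) in place of the program's final exit is enough for the experiment. If the segfault disappears, the fault is somewhere in the cleanup phase.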

    One way to do that is to explicitly undef everything that exists in your program at the point where you normally exit. Go through your program and explicitly call undef( $var ) on every object and data structure in your program. If doing so causes the segfault before you see your previously added warn 'Exiting' message, you've probably found the culprit, so it's just a case of isolating which one it is.

    Comment out half of your undefs and try again. If you still get the segfault before the exiting, it's in the other half, so comment out half of the remaining undefs.

    If not, uncomment the first half and comment out the second.

    Continue that until (with luck) you get down to the one undef that causes the segfault. If you are successful in so isolating the cause, try to recreate the problem in a small, standalone program using whatever module or data structure is causing it.
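    A variation on the halving approach, sketched under assumptions: the helper name release_in_order and the labels are hypothetical, and it only handles scalar variables that hold objects or references. It warns before each undef, so the last label printed before a segfault names the suspect.

```perl
use strict;
use warnings;

# Hypothetical helper: release each named variable in turn, warning first,
# so the last label printed before a segfault identifies the suspect.
sub release_in_order {
    my (@named_refs) = @_;          # pairs of (label, reference to the variable)
    my @released;
    while (my ($label, $ref) = splice(@named_refs, 0, 2)) {
        warn "releasing $label\n";
        undef $$ref;                # drops whatever the variable was holding
        push @released, $label;
    }
    return @released;
}
```

    It would be called just before the 'Exiting' warn, for example as release_in_order(listener => \$listener, cache => \$cache), where both variable names are placeholders for whatever the program actually holds.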

    If the above procedure fails to isolate a cause, then the next step would be to cut stuff out of the program bit by bit until the segfault no longer happens. Or, you could look to building a debug version of perl and/or using a debugger to track down the cause, but that requires a whole host of other skills.

    (*) If not, litter the program with warn statements to see where the segfault is occurring. Another method might be to run the program under the auspices of Devel::Trace, in the hope that it might show you where the problem is occurring. Be warned, the program will run very, very slowly.


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
Re: what causes a segmentation violation
by rinceWind (Monsignor) on May 04, 2007 at 10:17 UTC

    What is happening with a segfault is that the CPU is trying to use an invalid address. While this can potentially have many causes, the two most common are:

    Dereferencing an invalid pointer (typically a null pointer). You need to think in C rather than perl. Buggy code that tries to dereference a null pointer results in a segfault and core dump. This doesn't happen with perl references, as they maintain their own integrity. But if you are calling out to C libraries via XS, problems with the XS code or the external C library could give you this behaviour.

    Overwriting memory. Examples are the buffer overflow attacks one reads about in security journals. If something unexpected has been written over data structures, this may result in invalid pointers (see above), or worse, something may have overwritten code, though this may result in a different exception to segfault.

    Coming back to your problem, it will almost certainly be in one of two places: external C code or perl itself. I recommend that you follow the debugging guidelines given by BrowserUk, and try to isolate the line of perl code where the segfault is happening.

    Once you have this information, report a bug to http://rt.cpan.org if it's in a module's XS code, or use perlbug to report the bug if it's in perl itself.

    --
    wetware hacker
    (Qualified NLP Practitioner and Hypnotherapist)

      I suppose I should have been more specific when I asked about 'what causes a segmentation violation', to avoid getting the obvious description of what it is (of which I am sadly, well aware).

      I took the suggestion of looking at what the core dump could provide. A gdb stack trace told me that it was dying during signal handling. (Unfortunately I didn't write down the exact routine name, but...) So without attempting to understand and diagnose Perl itself, I looked at where my code dealt with signals. I narrowed it down to 'death of a child' in my forking TCP server code.

      My server is based on the standard skeleton from 'the cookbook'. What happens in my code is that on occasion, the main listener may choose to shut itself and all its children down (for example, when it catches a CTRL-C). The mainline would go through and kill off all forked children, and then die itself.

      What was happening was (or at least _my impression_ was) that the children would be killed off, but not dead yet. Then the mainline would die, but the perl interpreter would still be around. Then the interpreter for the main-line would receive the 'death of a child' signal for (one or more of) the children, but could no longer handle it, because the mainline was (just about...!) dead. So it would issue a segfault and core dump.

      My solution/workaround under this situation was to signal the children to die, and then have the mainline actually wait around (perhaps forever) for the children to die, and then (and only then) exit/die itself. I.e.

      use POSIX ":sys_wait_h";   # provides WNOHANG

      sub reaper { 1 until (-1 == waitpid(-1, WNOHANG)); }
      $SIG{CHLD} = \&reaper;

      kill 9, $childPid;   # signal the child to die
      reaper();            # IMPORTANT! ...wait for the child to go away
                           # (else we might get perl seg faulting on exit)
      exit;                # and die ourselves

      So in the end, I was seeing this on a number of my apps that have used the same philosophy on shutdown of forking server apps, and the reason it was intermittent failure/warnings was all due to the random timing of the parent/child dying/exiting relationship.

      ...Sometimes the signal catcher would get invoked... but sometimes the interpreter seemed to have been shut down far enough that the catcher was no longer there when the signal arrived, so it would core dump on shutdown.
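      The reaper loop above polls with WNOHANG, so it spins while the children are still alive. A sketch of a blocking variant follows, under stated assumptions: the helper name stop_children is hypothetical, it sends SIGTERM rather than the signal 9 used above, and it sets the CHLD handler back to the default for the duration so the children are reaped here rather than in an asynchronous handler.

```perl
use strict;
use warnings;

# Hypothetical helper: signal each child, then block until each one has
# actually been reaped, so no child is left unreaped when the parent exits.
sub stop_children {
    my (@pids) = @_;
    local $SIG{CHLD} = 'DEFAULT';   # reap here, not in an asynchronous handler
    kill 'TERM', @pids;             # assumption: TERM instead of signal 9
    my %status;
    for my $pid (@pids) {
        my $reaped = waitpid($pid, 0);            # blocks; no busy loop
        $status{$pid} = $? if $reaped == $pid;    # raw wait status
    }
    return \%status;
}
```

      The mainline would call stop_children(@childPids) and then exit, where @childPids stands for however the server tracks its forked children. A child that ignores SIGTERM would make this wait forever, which matches the "perhaps forever" caveat above.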

Re: what causes a segmentation violation
by djp (Hermit) on May 04, 2007 at 07:18 UTC
    Your segmentation violation should produce a core file ('core' in the current working directory of the process). Use 'file core' to find out the name of the program which produced the core dump, probably perl in your case. Then use a debugger (gdb, dbx, etc.), giving that program name and the core file name as arguments. A stack backtrace via 'where' or similar should show you where the segmentation violation occurred. After that you're on your own. You may need to tweak some permissions and/or settings to get SIGSEGV to produce a core file. See 'man core' for further details.
Re: what causes a segmentation violation
by Herkum (Parson) on May 03, 2007 at 21:04 UTC

    A segmentation fault usually points to a problem beneath your Perl code. What specifically it is, I don't know. It could be a problem with the interpreter itself, though that is highly unlikely. It could also be a problem with a library that the code links to, or a problem with the system itself.

    It would be impossible to know the specifics, or even how problematic it is, without more details. It could be a segmentation fault with no consequences or something catastrophic.

Re: what causes a segmentation violation
by xdg (Monsignor) on May 03, 2007 at 22:41 UTC

    I'm not sure how to help you diagnose this, but you might consider whether you really want to be forking on Windows. See perlfork for details -- it's really just threads made to look like forks. You might have more success with threads that look like threads, instead.

    -xdg

    Code written by xdg and posted on PerlMonks is public domain. It is provided as is with no warranties, express or implied, of any kind. Posted code may not have been tested. Use of posted code is at your own risk.

      It's on Linux... not Windows (thank God!)

Node Type: perlquestion [id://613464]
Approved by Old_Gray_Bear
Front-paged by rinceWind