in reply to Re: Tracking down a segfault
in thread Tracking down a segfault

Well, the oddity to me is that we're not passing bad parameters. Or, at least, that's what everything seems to say. The Perl error expressly tells us that we're providing the necessary params correctly, and printing them out immediately before the call shows the same thing.

The error is caused by the fact that DateTime.pm somehow loses the parameters that we passed it. We pass them, and I hacked a print line into DateTime.pm to prove that it was indeed receiving the params correctly in @_. However, DateTime.pm then processes the arguments using Params::Validate, and what Params::Validate returns is empty, with no error!

As for why it's in an eval? I think that's because it's important that the code doesn't die (I didn't actually write it, so I'm guessing here). The library is used in a few places, one of them processing human input, which occasionally contains junk. So I believe that without the eval, junk input can make DateTime die, and that shouldn't cause the whole program to fail.
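
Roughly this pattern, as a sketch using only core Perl. Note that parse_year is a made-up stand-in for the DateTime call (it is not code from our library); it dies on junk input, and the eval keeps that from taking down the caller.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the library call that may die on bad input.
sub parse_year {
    my ($input) = @_;
    die "Invalid year: '$input'\n" unless defined $input && $input =~ /^\d{4}$/;
    return $input + 0;
}

# Wrap the call in eval so junk input yields undef instead of killing the program.
sub safe_parse_year {
    my ($input) = @_;
    my $year = eval { parse_year($input) };
    if ( !defined $year ) {
        warn "Skipping bad input: $@";
        return;
    }
    return $year;
}

for my $input ( '2011', 'crap' ) {
    my $year = safe_parse_year($input);
    print defined $year ? "ok: $year\n" : "rejected: $input\n";
}
```

The eval only protects against die; it does nothing for a segfault, which is why the crash still gets through.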

We suspect the time zone code because of the recent problem with the Olson DB, which might have been causing this, but it... doesn't seem like the problem. At least not offhand. We re-processed the Olson data a few times (using different versions), but nothing seems to solve the error.

Our guess for the time being is that the error is further upstream somewhere, and perhaps something in the stack is being screwed up, and it's only getting to the breaking point when it gets down to the TimeZone level.

The whole environment is ... interesting. As noted, it's a very old Perl version (our requests to have it upgraded have been denied, and some libraries we maintain ourselves rather than the sysadmins). But I'm still at a loss, since, well, you're not supposed to get actual segfaults!

DaveE

Re^3: Tracking down a segfault
by Anonymous Monk on Oct 26, 2011 at 03:19 UTC

    It might be something trivial, but I noticed one thing in what you've said:

    I added in debug (that still contained the failure) to print out @_, %p, and $NewValidate both before and after the call to Params::Validate. Result?

    The output you posted doesn't show anything for $NewValidate?

    You may have just skipped posting it for the sake of brevity, but I did notice the omission of a dump for $NewValidate.

    What versions of Params::Validate and DateTime:: are you using?
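
    If it helps, here's a quick way to collect them. This is only a sketch: module_version is a helper name I made up, and the module list is my guess at what matters here.

    ```perl
    use strict;
    use warnings;

    # Return the installed version of a module, or undef if it cannot be
    # loaded (or declares no $VERSION).
    sub module_version {
        my ($module) = @_;
        ( my $file = "$module.pm" ) =~ s{::}{/}g;
        return unless eval { require $file; 1 };
        return $module->VERSION;
    }

    for my $module (qw(Params::Validate DateTime DateTime::TimeZone)) {
        my $version = module_version($module);
        printf "%-20s %s\n", $module,
            defined $version ? $version : 'not found (or no $VERSION)';
    }
    print "perl $]\n";
    ```

    Run it with the same perl binary and the same @INC as the failing program, otherwise you may be looking at a different set of modules.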

    http://search.cpan.org/dist/Params-Validate/
    From the newest v1.00 docs/source of Params::Validate::validate(), the first argument is @_ and the second is a HASH or HASHREF. The oldest perl installation I have over here is v5.10.1 along with Params::Validate v0.91; it's as close as I can get to what you're using over there. At least with the oldest stuff I have here, when the second argument to validate() is an empty HASHREF (validation spec), I get a fatal error (carp).

    #!/usr/bin/perl -w
    use strict;
    use diagnostics;
    use warnings FATAL => 'all';
    use Data::Dumper;
    $Data::Dumper::Useperl = 1;
    $Data::Dumper::Indent = 1;
    $Data::Dumper::Sortkeys = 1;
    $Data::Dumper::Useqq = 1;
    $Data::Dumper::Deparse = 1;
    use Params::Validate qw(:all);

    sub foo {
        validate(
            @_, {
                'bar' => 1, # mandatory
                'baz' => 0, # optional
            }
        );
        print "Hello Nurse!\n";
    }
    foo('bar' => "arg1");

    sub qux {
        validate( @_, {} );
        print "Empty Hash\n";
    }
    qux('test');

    Output:
    $ perl crap2.pl
    Hello Nurse!
    Uncaught exception from user code:
        Odd number of parameters in call to main::qux when named parameters were expected
     at crap2.pl line 25
        main::qux('test') called at crap2.pl line 31
     at crap2.pl line 25
        main::qux('test') called at crap2.pl line 31

    I haven't actually used Params::Validate (or its sister Config::Validate) recently, but I (quickly) read all of the code for the newest version two days ago. Purely for fun, I've been looking into validation in a general sense, so they're on my "must know" list. I haven't read DateTime in a *very* long time, but I'll look at it for ya.

    As for the use of eval to prevent termination or warnings, one really *must* have a Darn Good Reason (DGR) to use it, and the smart thing to do is to use a local() $SIG{__DIE__} and/or $SIG{__WARN__} to handle the expected issues. Since you've posted the previous failure (e.g. error on 2010, not the 2011 segfault), it seems you're doing some kind of signal trapping but where and how is a mystery. You can augment your eval with local signal traps to give you more info:

    ...
    eval {
        local $SIG{__WARN__} = sub {
            print STDERR "HOOKED __WARN__\n";
            print STDERR Carp::longmess();
            return();
        };
        local $SIG{__DIE__} = sub {
            print STDERR "HOOKED __DIE__\n";
            print STDERR Carp::longmess();
            exit(1); # or in your case return()
        };
        # Examples - message not printed due to hooks.
        # CORE::warn("Warn Message:\n", @_, "\n");
        # CORE::die("Died Message:\n", @_, "\n");
        $DT = DateTime->new( year => $year, time_zone => $tz );
    };
    return if $@;
    ...
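
    If you'd rather try the hook idea in isolation first, here's a self-contained sketch using only core modules. Both risky_call and run_with_trace are names I invented for illustration; risky_call just stands in for the DateTime->new() call. The handler returns instead of exiting, so the eval still sets $@ as usual.

    ```perl
    use strict;
    use warnings;
    use Carp ();

    # Hypothetical stand-in for DateTime->new(); it always dies here.
    sub risky_call { die "simulated failure\n" }

    # Run a code ref under eval, recording a stack trace if it dies.
    sub run_with_trace {
        my ($code) = @_;
        my $trace;
        my $result = eval {
            # The handler only records the trace; die then proceeds normally.
            local $SIG{__DIE__} = sub { $trace = Carp::longmess( $_[0] ) };
            $code->();
        };
        return ( $result, $@, $trace );
    }

    my ( $result, $error, $trace ) = run_with_trace( \&risky_call );
    print STDERR "error: $error" if $error;
    print STDERR "trace: $trace" if defined $trace;
    ```

    Like the eval itself, this only sees Perl-level die; a real segfault kills the process before any of these hooks run.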

    It seems my guess was correct about DateTime::TimeZone using the local system, particularly the local timezone database. I suspect this is the root cause of the segfault.

    http://search.cpan.org/~drolsky/DateTime-0.70/lib/DateTime.pm#Time_Zone_Warnings
    Determining the local time zone for a system can be slow. If $ENV{TZ} is not set, it may involve reading a number of files in /etc or elsewhere. If you know that the local time zone won't change while your code is running, and you need to make many objects for the local time zone, it is strongly recommended that you retrieve the local time zone once and cache it

    ...

    http://search.cpan.org/~drolsky/DateTime-0.70/lib/DateTime.pm#Constructors
    The time_zone parameter can be either a scalar or a DateTime::TimeZone object. A string will simply be passed to the DateTime::TimeZone->new method as its "name" parameter. This string may be an Olson DB time zone name ("America/Chicago"), an offset string ("+0630"), or the words "floating" or "local".

    From the above you have some choices to test while keeping the exact same functionality, while potentially avoiding a system-based timezone problem.

    1. Check /etc/localtime (BSD); I'm not sure of the linux equivalent.
    2. Set LC_TIME for your locale (checking your locale might help)
    3. Set TZ in your environment (prevent system lookup)
    4. Set $ENV{TZ} in your code (prevent system lookup)
    5. Cache the time zone (details in first link above)
    6. Try using a DateTime::TimeZone object (rather than a string) in your call to DateTime->new(). I'd put this in your eval as below.

    ...
    eval {
        $tz = DateTime::TimeZone->new( name => 'Europe/London' );
        $DT = DateTime->new( year => $year, time_zone => $tz );
    };
    return if $@;
    ...
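
    Options 4 and 5 from the list can be combined along these lines. Treat it as a sketch under assumptions: 'Europe/London' is just an example zone, make_dt is a helper name I made up, and I haven't run this against your old perl.

    ```perl
    use strict;
    use warnings;
    use DateTime;
    use DateTime::TimeZone;

    # Assumption: 'Europe/London' is the zone you want. Set it before any
    # lookup of the 'local' time zone takes place.
    $ENV{TZ} = 'Europe/London';

    # Resolve the local zone once and reuse the object for every DateTime.
    my $LOCAL_TZ = DateTime::TimeZone->new( name => 'local' );

    sub make_dt {
        my ($year) = @_;
        return DateTime->new( year => $year, time_zone => $LOCAL_TZ );
    }

    print make_dt($_)->time_zone->name, "\n" for 2010, 2011;
    ```

    With the zone object cached, later DateTime->new() calls no longer need to work out the local zone from the system.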

    I'm not familiar with timezone handling on linux, but on BSD, I'd check to make sure there's a link from /etc/localtime to the appropriate time zone file in the TZ database. A modification to the local TZ database without fixing /etc/localtime can cause a real mess.
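
    A small check for that, as a sketch using only built-in file tests. describe_localtime is a name I made up, and /etc/localtime is assumed to be the right path on your box.

    ```perl
    use strict;
    use warnings;

    # Describe what a localtime-style path is: a symlink (and its target),
    # a regular file, or missing.
    sub describe_localtime {
        my ($path) = @_;
        if ( -l $path ) {
            my $target = readlink $path;
            my $state  = -e $path ? 'ok' : 'DANGLING';
            return "symlink -> $target ($state)";
        }
        return -e $path ? 'regular file (a copy, not a link)' : 'missing';
    }

    my $path = '/etc/localtime';    # assumption: the usual location
    print "$path: ", describe_localtime($path), "\n";
    ```

    A dangling link, or a plain copy that predates a TZ database update, would both be worth chasing.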

    The root cause of the segfault is most likely in XS code not playing nicely with your system code. On earlier versions of Params::Validate (like the v0.91 I've got), there's code to exclude XS usage on earlier perls, but newer versions (like current) have changed this. More likely, something in XS code of DateTime or DateTime::TimeZone is making bad calls into your system time code, and/or timezone code/database. The first problem you have on 2010 where you get an error could be corrupting things, and the second on 2011 could be the last straw. If none of the above listed stuff fixes your problem, I'd try upgrading just the modules: DateTime, DateTime::TimeZone, Params::Validate.

    As for why this bug has surfaced on "previously running" code, it could be related to recent changes made in your linux distro time zone database (or code) due to the recent lawsuit.
    http://en.wikipedia.org/wiki/Tz_database#2011_lawsuit
    I'm betting you're in a "managed" environment with automated system updates at work changing your TIME/TZ stuff out from under you.