jkeenan1 has asked for the wisdom of the Perl Monks concerning the following question:

For the last two days and nights I have been struggling with a wierd bug in a development version of a module I maintain. There's probably something elementary that I'm missing, so I ask the monks' assistance. I've created a simplified version of the module that reproduces the error. (You can get a complete tarball at http://mysite.verizon.net/jkeen/perl/modules/misc/Woeisme-0.01.tar.gz.

I have a module whose constructor:

  1. Pulls in key-value pairs as arguments and stores them in %parameters.
  2. Then blesses a reference to %default_values.
  3. Then cycles through %parameters and substitutes its values on a key-by-key basis for those in the blessed hash.
  4. Does some elementary validation on some of the values with verify_values(), a subroutine which is designed to accumulate error messages as needed, then print them all out while croaking.
  5. And then returns the reference to the object.

Here is the complete package:

package Woeisme; use strict; local $^W = 1; use vars qw ($VERSION ); $VERSION = 0.01; use Carp; my %default_values = ( SUBJECT => 'Subject (<= 44 characters) goes here', AUTHOR => { NAME => 'A. U. Thor', CPANID => 'AUTHOR', WEBSITE => 'http://a.galaxy.far.far.away/modules', EMAIL => 'a.u.thor@a.galaxy.far.far.away', }, ); sub new { my ($class, %parameters) = @_; my $self = ref($class) ? bless( \%default_values, ref($class) ) : bless( \%default_values, $class ); foreach my $param ( keys %parameters ) { if ( ref( $parameters{$param} ) eq 'HASH' ) { foreach ( keys( %{ $parameters{$param} } ) ) { $self->{$param}{$_} = $parameters{$param}{$_}; } } else { $self->{$param} = $parameters{$param}; } } $self->verify_values(); return ($self); } sub verify_values { my $self = shift; my (@errors); push( @errors, 'NAME is required' ) unless ( $self->{NAME} ); push( @errors, 'Module NAME contains illegal characters' ) unless ( $self->{NAME} and $self->{NAME} =~ m/^[\w:]+$/ ); push( @errors, 'SUBJECTs are limited to 44 characters' ) if ( length( $self->{SUBJECT} ) > 44 ); push( @errors, 'CPAN IDs are 3-9 characters' ) if ( $self->{AUTHOR}{CPANID} !~ m/^\w{3,9}$/ ); push( @errors, 'EMAIL addresses need to have an at sign' ) if ( $self->{AUTHOR}{EMAIL} !~ m/.*\@.*/ ); push( @errors, 'WEBSITEs should start with an "http:" or "https:"' + ) if ( $self->{AUTHOR}{WEBSITE} !~ m/https?:\/\/.*/ ); if (@errors) { croak( join "\n", @errors, '', $!); } } 1;

I'm interested in testing for the cases where bad data is passed to the constructor. Do the proper error messages then appear? Since the real-world module displaying this problem creates directories and files, I try to test each distinct error in its own environment: a temporary directory set up with File::Temp. I have written failsafe() as a wrapper around the creation of the tempdir, the call to the constructor and the test. Here is the test file:

# -*- perl -*- # t/001_load.t - check module loading and create testing directory use Test::More qw{no_plan}; BEGIN { use_ok( 'Woeisme' ); } BEGIN { use_ok( 'File::Temp', qw| tempdir |); } BEGIN { use_ok( 'Cwd' ); } my $object = Woeisme->new ( 'NAME' => 'ABC::XYZ', ); isa_ok ($object, 'Woeisme'); failsafe( [ 'NAME' => 'ABC::XYZ', 'SUBJECT' => '123456789012345678901234567890123456789012345', ], "^SUBJECTs are limited to 44 characters", "Constructor correctly failed due to SUBJECT > 44 characters"); failsafe( [ 'NAME' => 'GHI::DEF', 'AUTHOR' => { NAME => 'James E Keenan', CPANID => 'ABCDEFGHIJ', }, ], "^CPAN IDs are 3-9 characters", "Constructor correctly failed due to CPANID > 9 characters"); sub failsafe { my ($argslistref, $pattern, $message) = @_; my $odir = cwd(); my ($tdir, $mod); $tdir = tempdir( CLEANUP => 1); ok(chdir $tdir, 'changed to temp directory for testing'); local $@ = undef; eval { $mod = Woeisme->new (@$argslistref); }; like($@, qr/$pattern/, $message); local $@ = undef; ok(chdir $odir, 'changed back to original directory after testing') +; }

Since there's only a single test file, once I have maked the package I can test it with either make test or prove -vb t/001_load.t.

Now, here's the problem. If I comment out one or the other of the two failsafe blocks so as to test just one feature at a time, there is no problem. But if I test the two in succession, the like test in the second block fails. It fails only because the wrong message is generated. Instead of getting an error message which begins

CPAN IDs are 3-9 characters

... I'm getting one that first repeats the error message from the previous call to failsafe(), and only then displays the expected error message:

SUBJECTs are limited to 44 characters CPAN IDs are 3-9 characters

I'm getting an error message about the SUBJECT even though I haven't declared any new SUBJECT (and, hence, the subject from %default_values should still prevail -- and that one was less than 44 characters in length). If you sprinkled some print statements around, you'd see that the value of SUBJECT in the second set of tests is 123456789012345678901234567890123456789012345 -- but shouldn't that value have been wiped out (a) because the first constructor failed and (b) because the first set of tests was properly scoped.

Given that each set of tests is limited to the scope of failsafe() and that each set is taking place in its own tempdir, I cannot see any reason why, when I run both test blocks in succession, the value of SUBJECT in the first lingers into the second and eventually causes the wrong error message to print out, causing like to generate a false value.

Any ideas? Thanks in advance.

Replies are listed 'Best First'.
Re: Wrong Error Message When Testing for Failure of Constructor
by davidrw (Prior) on Jul 22, 2005 at 02:31 UTC
    The root cause becomes apparent with these debugging statements in the constructor:
    warn "default_values: " . \%default_values; my $self = ref($class) ? bless( \%default_values, ref($class) ) : bless( \%default_values, $class ); warn "self: " . $self;
    The constructor isn't copying the %default_values hash but is simply referencing it, and then modifying it. So when the second obejct is created and the constructor run again, it's pointing at the same hash, the contents of which are now different due to the modifications from the first run of the constructor. To fix, simply do this to make a copy first (i tested it and your tests will all pass):
    my %h = %default_values; my $self = ref($class) ? bless( \%h, ref($class) ) : bless( \%h, $class );
    Also, as a side note, note that this is identical logic w/o the repeated code:
    my %h = %default_values; my $self = bless( \%h, ref($class) || $class );
    Update: Combining w/tlm's solution that does away w/the need for the temp var %h it can just be:
    my $self = bless( +{ %default_values }, ref($class) || $class );
      davidrw wrote:

      The root cause becomes apparent with these debugging statements in the constructor:

      warn "default_values: " . \%default_values; my $self = ref($class) ? bless( \%default_values, ref($class) ) : bless( \%default_values, $class ); warn "self: " . $self;

      Thanks for reminding me about a trick I knew several years ago but had forgotten!

Re: Wrong Error Message When Testing for Failure of Constructor
by tlm (Prior) on Jul 22, 2005 at 02:31 UTC

    I think the problem is that you are taking a reference to the %default_values hash, instead of making a copy of it. Try this:

    my $self = ref($class) ? bless( +{ %default_values }, ref($class) +) : bless( +{ %default_values }, $class );
    In general,
    +{ %hash }
    makes a (shallow) copy of %hash and returns a reference to the copy. (The leading + is not strictly necessary, but there are situations in which perl misconstrues the curlies as a block instead of an anonymous hash(ref) constructor, so I've gotten in the habit of disambiguating such expressions with a unary +.)

    Likewise,

    [ @array ]
    makes a (shallow) copy of @array and returns a reference to the copy.

    the lowliest monk

      I think the problem is that you are taking a reference to the %default_values hash, instead of making a copy of it.

      The point made by both davidrw and tlm is well taken. I tried it out by rewriting the relevant part of the constructor as follows:

      warn "DV: " . \%default_values; my $self = ref($class) ? bless( +{ %default_values }, ref($class) +) : bless( +{ %default_values }, $class ); warn "self: " . $self;

      It worked ...

      ... at first.

      Then it didn't.

      To keep my original posting to a reasonable length, I only included two test blocks in t/001_load.t. With the suggested revisions to the constructor, both passed.

      However, the real-world module of which Woeisme is a distillation included additional test block both before and after the two listed originally. I first added those tests which, in the real-world test suite, occurred before the two cited.

      All tests passed. I added one more test similar to the last one.

      All tests still passed. But when I started to add more tests testing values for the other keys of the AUTHOR hash, I started to get exactly the same type of error as I reported yesterday. Adding one test:

      failsafe( [ 'NAME' => 'ABC::Alpha', 'AUTHOR' => { NAME => 'James E Keenan', EMAIL => 'jkeenancpan.org', }, ], "^EMAIL addresses need to have an at sign", "Constructor correctly failed; e-mail must have '\@' sign");

      ... led to this error:

      not ok 22 - Constructor correctly failed; e-mail must have '@' sign # Failed test (t\001_load.t at line 88) # 'CPAN IDs are 3-9 characters # EMAIL addresses need to have an at sign # # No such file or directory at t\001_load.t line 87 # ' # doesn't match '(?-xism:^EMAIL addresses need to have an at sign) +'

      Note that in the last test block, I was not testing the CPANID key; only the NAME key (which should have PASSed) and the EMAIL key (which should have been the sole FAIL). There shouldn't have been any reference at all to the CPANID key.

      So it appears we have not yet found the bug! Thanks for those who took the time to look at this. Any other ideas?

        I suspect the reason is fundamentally the same. I missed that you have references among the values of %default_values, meaning that you need a deep copy (not the shallow copy originally proposed). Try this quick fix (just to confirm the hypothesis):

        my %default = %default_values; # shallow copy $default{ AUTHOR } = +{ %{ $default_values{ AUTHOR } } }; # copy next level my $self = ref($class) ? bless( \%default, ref($class) ) : bless( \%default, $class );
        Ultimately, my preferred code for this would be something like
        my $self = bless deep_copy( \%default_values ), $class;
        where deep_copy takes a reference as argument and returns a reference to a deep copy of it.

        I know that there are CPAN modules for doing deep copying, but I have no direct experience with any of them, so I can't recommend any one in particular. Maybe some other monk can.

        the lowliest monk