inconistent test results when using is_deeply to test an array returned from a function

APGRMF has asked for the wisdom of the Perl Monks concerning the following question:

I've written a unit test script to test various functions in a process. All seems OK - if I run the unit test script all tests pass. However, if I run it a number of times it ends up failing.

The behavior is inconsistent and I don't understand why. I haven't written that many perl scripts or tests so I'm sure my understanding is at fault somewhere.

Anyway, I've narrowed the issue down to a single unit test that uses is_deeply to test an array returned by the function "get_files_in_watched_directory" against an array populated with the expected result.

The error message states that the "The structures begin differing at filename line 85" which is the end of the "get_files_in_watched_directory" subtest. If I print the expected and actual array contents to the screen they appear to be identical.

I'm clearly missing something.

Any thoughts would be greatly appreciated.

unit test code


use strict;
use warnings;

use Test::More tests => 2; #'no_plan'; 

use Test::Exception;
use Test::Builder;
use Test::Directory;
use File::Temp qw( tempfile tempdir );
use File::Basename;
use File::Path;


my $builder = Test::More->builder->output('results.txt');


# Call the perl script under test
note "Check perl script under test loaded OK";
ok( require( '../apg-problem-perl.pl' ), 'loaded file okay' ) or exit;


note "\nTest function: get_files_in_watched_directory";
subtest 'get_files_in_watched_directory' => sub {
  my @got_files;
  my @expected_files;
  my $project_name = 'default';

  # create a temporary directory  
  my $temp_dir = tempdir( CLEANUP => 1 );

  # create some temporary files in the directory 
  my ($temp_file1_handle, $temp_file1)  = tempfile(DIR => $temp_dir);
  my ($temp_file2_handle, $temp_file2)  = tempfile(DIR => $temp_dir);
  
  # create an array populated with the names of the temporary files th
+at we expectd to see
  push (@expected_files, basename($temp_file1));
  push (@expected_files, basename($temp_file2));
  
  # now pass the temporary directory to the the function
  # the function should return an array @got_files with the names of t
+he files in the directory 
  @got_files = get_files_in_watched_directory($temp_dir, $project_name
+);
  
  message_log ("\n\nGOT", "high");
  foreach my $this_file (sort @got_files) {
    message_log ("\n  $this_file", "high");
  }

  message_log ("\n\nEXPECTED", "high");
  foreach my $this_file (sort @expected_files) {
    message_log ("\n  $this_file", "high");
  }
  
  #compare the two arrays
  is_deeply(\@got_files, \@expected_files, 'Function should return a l
+ist of directory contents that will be processed');

};


done_testing();
[download]

Code under test

use strict;
use warnings;
use Getopt::Std;
use FileHandle;
use File::Find;
use File::Path;
use File::Copy;
use File::Basename;

  use Archive::Zip qw( :ERROR_CODES :CONSTANTS);

use XML::LibXML;
use XML::Simple;
use XML::Parser;
use Data::Dumper;
use DateTime;
use Net::FTP;

######################################################################
+#################################
# disable output buffering (I think)
$|=1;

######################################################################
+#################################
#
# GLOBAL VARIABLES
#
######################################################################
+#################################



my $path_to_process ;
my $ftp_target_folder;
my $project_name;
my $process_root_dir;
my $target_ftp_site;


my $in_progress_dir;                 # location of zipped content to b
+e sent via FTP
my $archived_dir;                    # location of zipped content alre
+ady sent via FTP
my $archived_success_dir;            # root location of processed file
+s
my $archived_failed_dir;             # root location of processed file
+s
my $archived_not_attempted_dir;

my $Fail_Message_Prefix;             # Standardised message following 
+a script failure

my $log_dir;                         # location of logs files 
my $log_file;                        # log file name
my $log_fh;                          # log file handle


my $lock_fh;                         # "in progress" file handle 

my $environment;                     # wkuk environment type: live, qa
+, test, test-local
my $server;                          # wkuk server type: ukkinhmrc01, 
+productionbuild, localhost
my $process_root_path;               # process root path
my $dctm_export_path;                # documentum smart content export
+ path
my $debug_default_mode = "normal";   # debug mode
my $debug_mode;                      # debug mode

my $datestamp;


######################################################################
+#################################
#
# INITIALISE VARIABLES
#
######################################################################
+#################################

$Fail_Message_Prefix = "\n\nScript Failure:\n";


######################################################################
+#################################
#
# START 
#
######################################################################
+#################################
# call main subroutine eveything is kicked off from there
main( @ARGV ) unless caller();


######################################################################
+#################################
#
# Main subroutine
#
######################################################################
+#################################
sub main {
}

######################################################################
+##################################
sub get_files_in_watched_directory {  
  my ($directory_to_check, $project_name) = @_;
  my @problem_files;
  my @content_to_process;
  my $this_file;

  message_log ("\n\nCalled Function: get_files_in_watched_directory", 
+"low");
  message_log ("\n  Checking $directory_to_check", "low");

  chdir ($directory_to_check) || die "cannot chdir to directory $direc
+tory_to_check: $!";
 
  # Get everything - files and directories - and decide what to do wit
+h each one
  opendir(DIR,$directory_to_check) || die "cannot open directory $dire
+ctory_to_check: $!";
  while (defined($this_file = readdir(DIR))) {  
    my $full_file_name = $directory_to_check . '/' . $this_file;

     # ignore the special directories "." and '.."
    if ($this_file =~ /^\.\.?$/) {
      next 
    }

    elsif ($project_name eq 'topics') {
      if (is_valid_topic_directory($full_file_name)) {
        push (@content_to_process, $this_file)
      }
    }

    elsif ($project_name eq 'resources') {
      if (is_valid_resources_directory($full_file_name)) {
        push (@content_to_process, $this_file)
      }
    }

    # wite anything else (file or directory) to an array to be process
+ed later
    else {
      message_log ("\n    found: $this_file", "normal");
      push (@content_to_process, $this_file)
    }
  } 
  
  return @content_to_process;
}



######################################################################
+##################################
sub is_valid_topic_directory {  
  # Check is this is valid topic content.
  # To be valid the content must:
  # - be a directory
  # - be a recognised directory name
  # - contain a complete file to show that the export from documentum 
+has completed
  my ($content_dir) = shift;
  message_log ("\nCalled function: is_valid_topic_directory", "normal"
+);

  # must be a directory and contain a complete file
  if (-d $content_dir && -e $content_dir . '/' . 'complete') {
    
    # must be valid name for topic cntent
    # this can include topic, training module, pubdb and orgdb content
    if (basename($content_dir) =~ /^mini-[^-]+?-topics-\d+$/ ||
        basename($content_dir) =~ /^mini-(:?orgdb|pubdb)-smart-databas
+e+$/) {
      message_log ("\n apg 001: basename($content_dir)", "normal");
      return 1;
    }
  }
  
  # fail by default
  return 0;
}


############################################
sub is_valid_resources_directory {  
  # Check is this is valid news and resources content.
  # To be valid the content must:
  # - be a directory
  # - be a recognised directory name
  # - contain a complete file to show that the export from documentum 
+has completed
  my ($content_dir) = shift;
  message_log ("\nCalled function: is_valid_resources_directory", "nor
+mal");

  message_log ("\n apg 001: basename($content_dir)", "normal");
  # must be a directory and contain a complete file
  if (-d $content_dir && -e $content_dir . '/' . 'complete') {
    
    # must be valid name for news and resources cntent
    # this can include full or delta exports, training module, pubdb a
+nd orgdb content
    # for example: recent-briefings,   recent-news
    #              all-gold-briefings, all-gold-news
    if (basename($content_dir) =~ /^(:?recent|all-gold)-(:?briefings|c
+asereports|features|legtrackers|news|questions|wnupdates)$/) {
      message_log ("\n apg 002: basename($content_dir)", "normal");
      return 1;
    }
  }
  
  # fail by default
  return 0;
}

######################################################################
+#################################
sub message_log {
  # write a message to the screen and the log
  # what gets written depends on the script debug mode and message imp
+ortance 
  # -message importance: low, normal, high
  # -script mode: normal, silent, debug
  #What to write and when:
  # - silent mode : write message only if it's high importance
  # - normal mode : write message if it's high or normal importance
  # - debug mode  : write everything
  # tis needs some work.....
  my ($message, $importance) = @_;


  #Important information
  #- always print to stdout
  #- print to the log is it's open
  if ($importance eq lc"high") {

    print "$message";    
    if (defined $log_fh) {
      print $log_fh "$message";
    }
  }
  
  #Debugging information
  #- never print to stdout
  #- print to the log if it's open and the script is in debug mode    
  elsif ($importance eq lc"low") {
    if (defined $log_fh) {
      if (defined $debug_mode && $debug_mode eq lc"debug") {
        print $log_fh "$message";
      }
    }
  }
  # Anything else
  # -print to the log if it's open 
  # -otherwise print to stdout because the meassage will otherwise be 
+lost
  else {
    if (defined $log_fh) {
      print $log_fh "$message";
    }
    else {
      print "$message";    
    }
  }
}



###########################################
#
# END 
#
###########################################
# return true
1;
[download]

Comment on inconistent test results when using is_deeply to test an array returned from a function Select or Download Code

Replies are listed 'Best First'.
Re: inconistent test results when using is_deeply to test an array returned from a function by kennethk (Abbot) on Sep 24, 2014 at 16:29 UTC
I note that in `get_files_in_watched_directory` you have the line `opendir(DIR,$directory_to_check) \|\| die "cannot open directory $direct +ory_to_check: $!";` [download] without a trailing closedir. What happens when you add that, or swap to an indirect/lexical directory handle, like `opendir(my $dir,$directory_to_check) \|\| die "cannot open directory $di +rectory_to_check: $!"; while (defined($this_file = readdir($dir))) {` [download] I'm just guessing you might have trouble because of the global handle. #11929 First ask yourself `How would I do this without a computer?' Then have the computer do it the same way.	[reply] [d/l] [select]
Re^2: inconistent test results when using is_deeply to test an array returned from a function by APGRMF (Novice) on Sep 25, 2014 at 10:00 UTC
I've reduced the code a bit further (below) and ran the following two scenarios First, I increased the number of temporary files being created in the test from 2 to 5. The function, which is otherwise unchanged, now fails every time. A print to the screen of the two arrays shows that they have the same content but in a different order. In the second scenario, I amended the function so that the "expected" array is now passed in as a reference and then returned unchanged. The test which compares the expected array with the actual array then passes every time as expected. A print to the screen of the two arrays shows that they have the same content in the same order as would be expected. It appears as though is_deeply is failing the test in the first scenario because of the array order. Any thoughts? Thanks Amended test code note "\nTest function: get_files_in_watched_directory"; subtest 'get_files_in_watched_directory' => sub { my @got_files; my @expected_files; my $project_name = 'default'; # create a temporary directory my $temp_dir = tempdir( CLEANUP => 1 ); # create some temporary files in the directory my ($temp_file1_handle, $temp_file1) = tempfile(DIR => $temp_dir); my ($temp_file2_handle, $temp_file2) = tempfile(DIR => $temp_dir); my ($temp_file3_handle, $temp_file3) = tempfile(DIR => $temp_dir); my ($temp_file4_handle, $temp_file4) = tempfile(DIR => $temp_dir); my ($temp_file5_handle, $temp_file5) = tempfile(DIR => $temp_dir); # create an array populated with the names of the temporary files th +at we expectd to see push (@expected_files, basename($temp_file1)); push (@expected_files, basename($temp_file2)); push (@expected_files, basename($temp_file3)); push (@expected_files, basename($temp_file4)); push (@expected_files, basename($temp_file5)); # now pass the temporary directory to the the function # the function should return an array @got_files with the names of t +he files in the directory @got_files = get_files_in_watched_directory($temp_dir, $project_name +, \@expected_files); print "\n\nGOT\n"; print Dumper(@got_files); print "\n"; print "\n\nEXPECTED\n"; print Dumper(@expected_files); print "\n"; #compare the two arrays is_deeply(\@got_files, \@expected_files, 'Function should return a l +ist of directory contents that will be processed'); }; [download] Amemded script under test ###################################################################### +################################## sub get_files_in_watched_directory { my ($directory_to_check, $project_name, $original_content_ref) = @_; my @problem_files; my @content_to_process; my $this_file; print "\n\nCalled Function: get_files_in_watched_directory"; print "\n Checking $directory_to_check"; chdir ($directory_to_check) \|\| die "cannot chdir to directory $direc +tory_to_check: $!"; # Get everything - files and directories - and decide what to do wit +h each one opendir(DIR,$directory_to_check) \|\| die "cannot open directory $dire +ctory_to_check: $!"; while (defined($this_file = readdir(DIR))) { my $full_file_name = $directory_to_check . '/' . $this_file; # ignore the special directories "." and '.." if ($this_file =~ /^\.\.?$/) { next } # wite anything else (file or directory) to an array to be process +ed later else { print "\n found: $this_file"; push (@content_to_process, $this_file) } } closedir(DIR); print "\n\nabout to retun"; #return @content_to_process; return (@{$original_content_ref}); } [download]	[reply] [d/l] [select]
Re^3: inconistent test results when using is_deeply to test an array returned from a function by APGRMF (Novice) on Sep 25, 2014 at 10:21 UTC
Having read/investigated a bit more, I think my original question/title was misleading - I think the issue comes down to: how do I compare two arrays for equality. Note: the equality test needs to sort the arrays first or ignore array order	[reply]
Re^4: inconistent test results when using is_deeply to test an array returned from a function by Anonymous Monk on Sep 25, 2014 at 10:54 UTC
Re^5: inconistent test results when using is_deeply to test an array returned from a function by APGRMF (Novice) on Sep 25, 2014 at 11:19 UTC
Some notes below your chosen depth have not been shown here
Re^2: inconistent test results when using is_deeply to test an array returned from a function by APGRMF (Novice) on Sep 25, 2014 at 07:38 UTC
Thanks. Good spot. I've applied a closedir as suggested. However, first indications are that the test is still returning inconsistent results.	[reply]