Summary
I wanted to run a Test::More script that would ultimately execute over 10_000_000 tests. It died, however, after about 8_000_000 because it ran out of memory. After some investigation, I found that Test::Builder retains a record for every test run, and this is likely why my test died.
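The retained records can be seen without digging into internals. A minimal sketch, assuming only the documented `details` method of Test::Builder (the count of 100 is arbitrary and chosen just to keep the output short):

```perl
use strict;
use warnings;
use Test::More;

# Run a batch of trivial tests.
pass("test $_") for 1 .. 100;

# details() returns one hash reference per test run so far,
# so this list grows with every call to ok(), is(), pass(), etc.
my @details = Test::More->builder->details;
diag( scalar(@details) . ' result records retained' );

done_testing();
```

At ten million tests, that is ten million small hashes kept alive until the script exits.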
In this meditation I look at a few solutions to this problem.
Background (What I was really trying to do.)
At $work, we have a multi-terabyte NFS mounted storage pool with millions of files, each with a record in the application's database. I wrote a few audit tools to confirm (1) that each file in storage has a record in the database, (2) that each record in the database has a file in storage, and (3) that the MD5 hash for the file in storage matches the one in the database. They also do some other sanity checking.
I thought it would be a good idea to (ab)use standard testing tools to write these. They could output TAP and run under Test::Harness, which would make it easier to automate a "quick" day-long sanity check.
From the perspective of the testing framework, there are multiple tests per file. Each test verifies the correctness of some property of the files and their relationship to the database.
Planning a lot of tests (The opening of hostilities.)
This is actually pretty easy. I open the database and ask it how many files there are supposed to be. Then I use that for my plan.
    use Test::More;
    use File::Find;
    use DBI;

    my $dbh = DBI->connect( ... );
    my ($file_count) = $dbh->selectrow_array( 'SELECT count(*) FROM t' );

    plan 'tests' => $tests_per_file * $file_count;

    find( { wanted => \&verify, follow_fast => 1 }, $storage_dir );

    diag( "It's normal to run more tests than planned because files have been created since the records were counted" );
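The `verify` callback is not shown above. The following is only a sketch of what such a callback might look like, not the real audit code: the `%md5_for` hash is a hypothetical stand-in for the database, the two `.dat` files are made up so the example runs on its own, and only two of the checks are covered.

```perl
use strict;
use warnings;
use Test::More;
use File::Find;
use File::Temp qw( tempdir );
use Digest::MD5;

# Hypothetical stand-in for the database: path => expected MD5 hex digest.
my %md5_for;

# Build a tiny "storage pool" so the sketch is self-contained.
my $storage_dir = tempdir( CLEANUP => 1 );
for my $name (qw( a.dat b.dat )) {
    my $path    = "$storage_dir/$name";
    my $content = "contents of $name\n";
    open my $out, '>', $path or die "Can't write '$path': $!";
    print {$out} $content;
    close $out or die "Can't close '$path': $!";
    $md5_for{$path} = Digest::MD5::md5_hex($content);
}

my $tests_per_file = 2;
plan tests => $tests_per_file * scalar keys %md5_for;

sub verify {
    return if !-f $_;    # only plain files are audited
    my $path = $File::Find::name;

    # Check 1: the file in storage has a record.
    ok( exists $md5_for{$path}, "record exists for $path" );

    # Check 3: the digest on disk matches the recorded one.
    open my $in, '<', $_ or die "Can't read '$path': $!";
    binmode $in;
    my $digest = Digest::MD5->new->addfile($in)->hexdigest;
    close $in;
    is( $digest, $md5_for{$path}, "md5 matches for $path" );
}

# no_chdir keeps $_ equal to $File::Find::name inside verify().
find( { wanted => \&verify, no_chdir => 1 }, $storage_dir );
```

Each file costs a fixed number of tests, which is what makes the `$tests_per_file * $file_count` plan possible.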
Method 1: Change Test::Builder (Plead for mercy.)
I filed a change request, but my expectations are pretty low. Having looked into the code a little, I think this change is easier said than done.
Method 2: Use the disk. (tie to DBM::Deep.)
I didn't actually try this, but I'm pretty sure it would work.
    use DBM::Deep;

    # Before testing: start from a fresh results file.
    # (Test -e first; "unlink $f && -e $f" parses as unlink( $f && -e $f ).)
    my $results_db = 'test_results.db';
    if ( -e $results_db && !unlink $results_db ) {
        die "Can't unlink existing results db '$results_db': $!";
    }

    my $db = tie my @test_results, 'DBM::Deep', $results_db;
    Test::More->builder->{Test_Results} = \@test_results;
This should cause the test results to go to the test_results.db file on disk instead of hogging memory. When testing is over, you'll want to unlink that file.
The elements of Test::More->builder->{Test_Results} are hash references, so my first choice of Tie::File wouldn't work.
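To illustrate why Tie::File is a poor fit, here is a small sketch (the temporary file and the sample hash are made up for the demonstration). Tie::File stores each element as a line of text, so a hash reference is flattened to its string form on the way in:

```perl
use strict;
use warnings;
use Tie::File;
use File::Temp qw( tempfile );

my ( $fh, $filename ) = tempfile( UNLINK => 1 );
close $fh;

tie my @lines, 'Tie::File', $filename
    or die "Can't tie '$filename': $!";

# Store something shaped like a test result.
$lines[0] = { ok => 1, name => 'example' };

# What comes back is the stringified reference, not a hash.
my $stored = $lines[0];
print "stored: $stored\n";
print 'still a reference? ', ( ref $stored ? 'yes' : 'no' ), "\n";

untie @lines;
```

The original hash is gone at that point, so Test::Builder would have nothing useful to read back.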
Method 3: Delete test results (Lie to the framework.)
Out of millions of tests, I expect maybe a few hundred failures. All the successes are more or less the same to me. So maybe I can make an array where every success is the same success. Let there be only one success and let every subsequent success be merely a reference to that one.
    package Tie::StdArray::TestResults;

    use Tie::Array;
    @Tie::StdArray::TestResults::ISA = ( 'Tie::StdArray' );
    use List::Util qw( first );

    # Plain store, as Tie::StdArray would do it.
    sub default_STORE { $_[0]->[$_[1]] = $_[2] }

    sub STORE {
        my ( $self, $index, $val ) = @_;

        # Anything that isn't a passing result is stored untouched.
        # (&default_STORE with no parentheses passes the current @_ along.)
        return &default_STORE if ref $val ne ref {};
        return &default_STORE if ! $val->{ok};

        # The first success is stored as-is; later ones reuse it.
        my $first_ok = first { ref $_ eq ref {} and $_->{ok} } @{ $self };
        return &default_STORE if ! $first_ok;

        return $self->default_STORE( $index, $first_ok );
    }

    package main;

    use Test::More;

    tie my @test_results, 'Tie::StdArray::TestResults';
    Test::More->builder->{Test_Results} = \@test_results;
Careful application of Data::Dumper shows an array with one hash ref and other elements that reference the same hash. This gives me confidence that the DBM::Deep method would work also, even though I haven't tried it.
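The same check can be made without reading a dump by eye. This sketch uses a condensed variant of the class above (same behaviour as far as I can tell, but written to fit in one block) and feeds it hand-made result hashes instead of going through Test::Builder. The sample names and the use of `refaddr` are my additions for the demonstration:

```perl
package Tie::StdArray::TestResults;
use strict;
use warnings;
use Tie::Array;
our @ISA = ('Tie::StdArray');
use List::Util qw( first );

sub STORE {
    my ( $self, $index, $val ) = @_;

    # Replace a passing result with the first pass already stored, if any.
    if ( ref $val eq 'HASH' && $val->{ok} ) {
        my $first_ok = first { ref $_ eq 'HASH' && $_->{ok} } @{$self};
        $val = $first_ok if $first_ok;
    }
    $self->[$index] = $val;
}

package main;
use Scalar::Util qw( refaddr );

tie my @results, 'Tie::StdArray::TestResults';

# Assign by index so that STORE is what handles each element.
$results[0] = { ok => 1, name => 'first pass' };
$results[1] = { ok => 0, name => 'a failure' };
$results[2] = { ok => 1, name => 'second pass' };

print "passes share one hash\n"
    if refaddr( $results[0] ) == refaddr( $results[2] );
print "failure kept its own record: $results[1]{name}\n";
```

The price is that every success after the first reports the first one's name, which is the lie the section title admits to.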
Conclusion
It can hardly be denied that tie can both cure and cause a multitude of sins.
In reply to More tests than you shake a memory stick at by kyle