The DBI docs say that fetchall_arrayref can get records in batches, as a compromise between fetching one-by-one and all-at-once.

The example from the manual goes like this.

my $rows = []; # cache for batches of rows while( my $row = ( shift(@$rows) || # get row from cache, or reload +cache: shift(@{$rows=$sth->fetchall_arrayref(undef,10_00 +0)||[]}) ) ) { ... }

This idiom makes sense, because it can fetch records at high speed, as shown in Speeding up the DBI.

A few days ago, brother bradcathey asked my help for a case that was similar to the idiom in the manual, with the addition of a slice, so that fetchall_arrayref produces a AoH instead of a AoA. This case led to an error, as shown in the following code.

#!/usr/bin/perl use strict; use warnings; use Data::Dumper; use DBI; my $driver = shift || 'mysql'; my $db = 'test'; my $user = undef; my $pass = undef; my $dbh; $driver = lc $driver; if ($driver eq 'mysql' ) { $dbh = DBI->connect('DBI:mysql:'.$db . ";mysql_read_default_file=$ENV{HOME}/.my.cnf" , $user, $pass, {RaiseError => 1}) or die "Can't connect: $DBI::errstr\n"; printf "DBI: %s - DBD::mysql: %s\n", $DBI::VERSION, $DBD::mysql::V +ERSION; } elsif ($driver eq 'sqlite') { $db = 'test.db'; if ( -f $db) { unlink $db; } $dbh = DBI->connect('DBI:SQLite:'.$db, $user, $pass, {RaiseError => + 1}) or die "Can't connect: $DBI::errstr\n"; printf "DBI: %s - DBD::SQLite: %s\n", $DBI::VERSION, $DBD::SQLite:: +VERSION; } else { die "driver $driver not supported in this test\n"; } my $max_fields = 3; my $field_size = 9; my $max_records = 10; my $max_commit = 5; my $text = 'abcdefghi'; my $inserted = 0; my $create_query = qq{CREATE TABLE testdbi ( } . (join q{,}, map { qq{id$_ char($field_size) not null} } ( 1..$ma +x_fields )) . qq{, primary key (id1))}; { local $dbh->{PrintError} = 0; eval { $dbh->do(qq{DROP TABLE testdbi}) }; } $dbh->do(qq{begin}); $dbh->do($create_query); my $sth = $dbh->prepare(qq{INSERT INTO testdbi VALUES (} . (join q{,}, map {q{?}} (1..$max_fields)) .q{)} ); for (1..$max_records) { $inserted += $sth->execute( map {$text++} (1..$max_fields) ); if (($inserted % $max_commit) == 0) { $dbh->do(qq{commit}); print "$inserted\n"; $dbh->do(qq{begin}); } } $dbh->do(qq{commit}); print "inserted $inserted records\n"; my $max_rows = 3; #must leave a remainder for the error #multiples of 2 and 5 work print "without slice\n"; my $rowcache = []; my $select_query = qq{SELECT id1, id2 FROM testdbi ORDER BY id1}; $sth = $dbh->prepare($select_query); $sth->execute; my $count = 0; while (my $aref = ( shift(@{ $rowcache } ) || shift (@{$rowcache=$sth->fetchall_arrayref(undef, $max_ro +ws) || [] } ) ) ) { my $rec = Data::Dumper->Dump([$aref],['rec']); $rec =~ s/\s+/ /g; $count++; print "$count $rec \n"; }; #---- HERE STARTS THE FAILING CODE ------ print "with slice\n"; $rowcache = []; $sth = $dbh->prepare($select_query); $sth->execute; $count = 0; while (my $aref = ( shift(@{ $rowcache }) || shift (@{$rowcache=$sth->fetchall_arrayref({}, $max_rows) || [] } ) ) ) { my $rec = Data::Dumper->Dump([$aref],['rec']); $rec =~ s/\s+/ /g; $count++; print "$count $rec \n"; }; __END__ $ perl test_dbi.pl mysql DBI: 1.49 - DBD::mysql: 3.0002 5 10 inserted 10 records without slice 1 $rec = [ 'abcdefghi', 'abcdefghj' ]; 2 $rec = [ 'abcdefghl', 'abcdefghm' ]; 3 $rec = [ 'abcdefgho', 'abcdefghp' ]; 4 $rec = [ 'abcdefghr', 'abcdefghs' ]; 5 $rec = [ 'abcdefghu', 'abcdefghv' ]; 6 $rec = [ 'abcdefghx', 'abcdefghy' ]; 7 $rec = [ 'abcdefgia', 'abcdefgib' ]; 8 $rec = [ 'abcdefgid', 'abcdefgie' ]; 9 $rec = [ 'abcdefgig', 'abcdefgih' ]; 10 $rec = [ 'abcdefgij', 'abcdefgik' ]; with slice 1 $rec = { 'id1' => 'abcdefghi', 'id2' => 'abcdefghj' }; 2 $rec = { 'id1' => 'abcdefghl', 'id2' => 'abcdefghm' }; 3 $rec = { 'id1' => 'abcdefgho', 'id2' => 'abcdefghp' }; 4 $rec = { 'id1' => 'abcdefghr', 'id2' => 'abcdefghs' }; 5 $rec = { 'id1' => 'abcdefghu', 'id2' => 'abcdefghv' }; 6 $rec = { 'id1' => 'abcdefghx', 'id2' => 'abcdefghy' }; 7 $rec = { 'id1' => 'abcdefgia', 'id2' => 'abcdefgib' }; 8 $rec = { 'id1' => 'abcdefgid', 'id2' => 'abcdefgie' }; 9 $rec = { 'id1' => 'abcdefgig', 'id2' => 'abcdefgih' }; 10 $rec = { 'id1' => 'abcdefgij', 'id2' => 'abcdefgik' }; DBD::mysql::st fetchall_arrayref failed: fetch() without execute() at +test_dbi.pl line 106. DBD::mysql::st fetchall_arrayref failed: fetch() without execute() at +test_dbi.pl line 106. $ perl test_dbi.pl sqlite DBI: 1.49 - DBD::SQLite: 1.09 5 10 inserted 10 records without slice 1 $rec = [ 'abcdefghi', 'abcdefghj' ]; 2 $rec = [ 'abcdefghl', 'abcdefghm' ]; 3 $rec = [ 'abcdefgho', 'abcdefghp' ]; 4 $rec = [ 'abcdefghr', 'abcdefghs' ]; 5 $rec = [ 'abcdefghu', 'abcdefghv' ]; 6 $rec = [ 'abcdefghx', 'abcdefghy' ]; 7 $rec = [ 'abcdefgia', 'abcdefgib' ]; 8 $rec = [ 'abcdefgid', 'abcdefgie' ]; 9 $rec = [ 'abcdefgig', 'abcdefgih' ]; 10 $rec = [ 'abcdefgij', 'abcdefgik' ]; with slice 1 $rec = { 'id1' => 'abcdefghi', 'id2' => 'abcdefghj' }; 2 $rec = { 'id1' => 'abcdefghl', 'id2' => 'abcdefghm' }; 3 $rec = { 'id1' => 'abcdefgho', 'id2' => 'abcdefghp' }; 4 $rec = { 'id1' => 'abcdefghr', 'id2' => 'abcdefghs' }; 5 $rec = { 'id1' => 'abcdefghu', 'id2' => 'abcdefghv' }; 6 $rec = { 'id1' => 'abcdefghx', 'id2' => 'abcdefghy' }; 7 $rec = { 'id1' => 'abcdefgia', 'id2' => 'abcdefgib' }; 8 $rec = { 'id1' => 'abcdefgid', 'id2' => 'abcdefgie' }; 9 $rec = { 'id1' => 'abcdefgig', 'id2' => 'abcdefgih' }; 10 $rec = { 'id1' => 'abcdefgij', 'id2' => 'abcdefgik' };

As you can see from the sample output, the example fails when using both a slice and a row count with DBD::mysql, but it does not fail with DBD::SQLite. However, changing $max_records to 1,000, the script fails with SQLite2 (segmentation fault).

While initially I was thinking that perhaps there was a bug in DBD::mysql, this latter fact made me think that perhaps using fetchall_arrayref with a slice and row count together was never an intended behavior. Why? Because the row count device is intended for speed, while the slice (leading to a hashref per record) is intended for clarity, but has a speed penalty that does not agree with the row count option.

Thus, I advised bradcathey to use the recipe described at DBI recipes /binding a hash, i.e. using bind_col with a static hash. This way, there can be clarity without suffering too much speed penalty.

So, what do you think? It is a bug, or wrong usage?

 _  _ _  _  
(_|| | |(_|><
 _|   

In reply to fetchall_arrayref with slice and row count by gmax

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post, it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.