in reply to Re^2: latency and round-trips
in thread DBI Queries take half an hour

Both Corion and Techcode are true, but you can choose something between...

If you cannot, for some reasons, filter the data by SQL, you can use repeatedly $sth->fetchall_arrayref($slice, $max_rows) It allows to fetch, for instance, maximally 10000 rows at the same time and to not exhaust memory on the client side, when unpredicted amount of rows has to be returned.

Replies are listed 'Best First'.
Re^4: latency and round-trips
by joec_ (Scribe) on Dec 17, 2008 at 23:55 UTC
    Ok, i have got my DBA to filter the SQL now. I have a table with 5000 rows (clobs).

    Is this the correct way to dereference an array reference? $struct=@$aref[1] or should i do $struct=@$aref->[1]

    As it currently stands my code above takes 7 minutes, with the occasional "not of type ocilobptr" warning, which i assume just means one of my fields is null, which i can check for later. Is 7 minutes something that sounds reasonable? According to the docs, fetchall_arrayref isnt necessarily faster than fetchrow_array.

    I would appreciate your opinions on this - which would you say is better for 5000 clobs? can you suggest a suitable rowcachesize for fetchrow_arrayref?

    I know i have bugged many of you but i really do appreciate your help. Thanks.

      Unluckily I have no experience with combination of DBI and Oracle, specially with clobs.

      But, naturally, I do not know the expected size of your 5000 clobs. If, for instance, 1 clob is maximally 64kB and 350MB of memory is acceptable for your Perl process, let you pull 5000 rows at once... When I implemented data replication between MySQL, Postgres and MSSQL servers, I have chosen 1000 maxrows.

      I use the result mostly in this manner:

      my $aref = $sth->fetchall_arrayref(); foreach (@$aref) { #now $_ is ref to an array holding one result row for (my $i = 0; $i < @$_; $i++) { print "$i: $_->[$i]\t"; } print "\n"; }
      If $aref in your question is the return of fetchall_arrayref, it is reference to an array, which items are references to an array too. So,

      $$aref[0] or $aref->[0] is the first item of the "upper" array, i.e. reference to the array containing first data row,

      $$aref[$rowindex][$colindex] or $aref->[$rowindex]->[$colindex] is the $colindex'th item of $rowindex'th data row.

      Ad 7 minutes...

      Primarily, you should know if it sounds reasonably :>) It depends on many factors, typically, size of the data, network latency and bandwith and speed of the SQL query (i.e. query optimization, Oracle configuration, frequency of processors, amount of memory, ...) If it is 100MB of the data, 7 min could be O.K. in my opinion.

      Note If you use fetchall_arrayref with maxrows or other method which does NOT pull all data at once, do not forget to pull all remaining data using, for instance, fetchall_arrayref repeatedly, or call $sth->finish(). I do not know Oracle internalities, but, it is very honest to say "Thank you, the server, I do not need remaining data."