Re: More on selecting rows with DBI...
by btrott (Parson) on May 02, 2000 at 08:52 UTC
|
If your query is going to return a lot of rows, don't
use that method. Yes: you asked for *all* of the rows
(50_000 or whatever, using fetchall_arrayref), so DBI
is going to fetch all of the rows. So now you've sent
50_000 rows between MySQL and yourself, and you've got a lot
of data stored in memory. :)
Which isn't too good a thing, meaning that, when you can,
you should use the LIMIT offset,rows technique. This way,
you're only ever sending rows rows between MySQL and
your script. | [reply] |
Re: More on selecting rows with DBI...
by perlmonkey (Hermit) on May 02, 2000 at 09:07 UTC
|
I would guess that DBI would process all 50000 records
reguardless of what you do with the data. I am not posivite
about that though. The safest bet for database performance
is to always try to increase your query performance.
(i.e. if you only need 20 rows, then
only retrieve 20 rows). That way DBI will not have to do
all the extraneous processing.
In sybase there is a command "set rowcount NUMBER"
which will only retrieve the first NUMBER rows. There may
be something similar for the database you are using.
So if you know you are only going to use the first
20 rows of a 50000 row dataset, then you might be able to
limit it in your query:
my $sql =<<EOF;
set rowcount 20
SELECT * from total_huge_table
EOF
my $sth = $dbh->prepare($sql);
$sth->execute();
... more code here ...
If this kind of query restriction is available via a
perl module, I would guess that it would be mentioned
in the individual DBD module perldoc pages. I dont think
this
could be a DBI feature since it is probably really database
dependent.
I just noticed that Mysql has a way to limit in the SQL also:
SELECT * from huge_table LIMIT 1, 20This will
return the 1st through 20th rows. (untested, but that is what
the O'Reilly mysql book says: MySQL & mSQL by Yarger, Reese, & King)
Well I dont think I answered you exactly, but maybe
some of this will help. | [reply] [d/l] [select] |
|
|
Actually, I'm sorta new to MySQL...
I've used Oracle for the past years and there are a number of ways that this can be implemented on the database side (alot harder than LIMIT, BTW) by using PL/SQL. Now that I've been forced to use other DBMSs (MySQL being one of them), I'm trying to cut down to the least common denominator of SQL in order to maintain a certain degree of portability between the systems.
DBI has been great as far as portability, but I have found that my Oracle vices have prooven to be rather disappointing at times, since DBD::Oracle allows you to do fancy Oracle proprietary stuff (like selects within selects for instance)...
Oh, if only the DBMS vendors could agree on ONE PLAIN FLAVOR OF SQL it would make my life sooo much easier...
| [reply] |
|
|
Oh, if only the DBMS vendors could agree on ONE PLAIN FLAVOR OF SQL it would make my life sooo much easier...
I think this will happen about the same time Microsoft voluntarily open sources and GPLs Windows
along with porting Office to Linux.
It is too bad though. Standard SQL would make life a lot easier for us all.
In the mean time maybe you can keep all the database
specific code
in one module, and dynamically require that library
depending on which DBD you are using. So that way you
should
really be able to utilize the power of each database, but
still maintain a some-what portable app. Then you
can create a
database module for each database the app will be using.
| [reply] |
RE: More on selecting rows with DBI...
by Jonathan (Curate) on May 02, 2000 at 14:06 UTC
|
Just selecting the first n rows of a query is really a meaningless concept to a relational database. If you have to process 50,000 records the best way is use a cursor, so you access them a row at a time. Use $db_handle->prepare
and $db_handle->fetchrow_arrayref to loop through the dataset. See perldoc DBI.pm | [reply] |
Re: More on selecting rows with DBI...
by athomason (Curate) on May 02, 2000 at 13:33 UTC
|
Glancing through the cheetah book, there's three ways to get data from DBI: row methods (each time you call the method, it returns the next row), atomic methods (for returning only one row, doesn't help you), and batch methods (fetchall_arrayref(), selectall_arrayref(), like chromatic mentioned). The batch methods can indeed knock a system on its behind if the result set is big; it's certainly better to prepare and execute your SELECT, roughly make sure you can handle the results, then use fetchall_arrayref rather than selectall_arrayref. If you don't use all the data, though, this can be wasteful. If you're grabbing sequential sections of the result set (which it looked like from your question), I don't see a problem with just doing atomic fetches 20 times a pop when you need the next set.
If you're using MySQL (I haven't used any other rdbms), LIMIT is a good option, like others have said. From the manual: "mysql> select * from table LIMIT 5,10; # Retrieve rows 6-15" looks like what you need. But unless the offset parameter of LIMIT can be a bound value (anyone know?), I think you'd take a performance hit for having the DB redo the query plan each time.
A side note that bit me: if you want to use LIMIT with UPDATE, you'll need MySQL 3.23. | [reply] |
|
|
Actually, I don't think the binding of the value is that
much of an issue with MySQL. According to the Appendix in
the DBI book, MySQL doesn't actually support bound
variables--the use of placeholders is merely emulated
by DBI. Can anyone who actually has the DBI book confirm
this? :)
| [reply] |
|
|
The MySQL book (Paul DuBois) explicitly states that queries aren't cached, in the section on placeholders.
He also says that you might as well use them, because if you port to another database, you might get the benefit there. I'd add that they make your queries easier to read and they do quoting for you automatically. Save yourself some trouble.
| [reply] |
|
|
Re: More on selecting rows with DBI...
by Anonymous Monk on May 02, 2000 at 11:17 UTC
|
I think what you need is 'limit':
select fields from table where (something) limit 0,20;
That will only send the first 20 fields to the script. Then you can say "limit 20,40" for the next fields.
| [reply] |
|
|
I am afraid that you are mistaken. The syntax that you have
just provided:
select <something> from <somewhere> limit n, x
where n is 20 and x is 40
will NOT give you rows 20 through 40... if you look at
http://www.mysql.com/Manual_chapter/manual_Reference.html#SELECT
in the mysql reference you will see that it
states that the limit mechanism is:
limit offset, rows
in order to receive rows 20 through 40 the proper limit
syntax would have been:
limit 20, 20
thank you. | [reply] |