Re: Using Devel::Profile output to know what to do next
by perrin (Chancellor) on Jun 08, 2005 at 01:50 UTC
What options did you use with dprofpp? You need to tell it to sort by wall time, not CPU time. Database and other I/O work usually takes lots of wall time and very little CPU.
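For example (assuming the profile data is in the tmon.out file that Devel::DProf writes by default):

    dprofpp -r tmon.out   # sort subs by elapsed real (wall-clock) time
    dprofpp -u tmon.out   # sort by user CPU time instead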
First time using profiling. Can you say more?
All I did was
perl -d:Profile test.pl
using Devel::Profile. What would be better?
If instead of Devel::Profile (a non-core CPAN module) you used Devel::DProf (a core module included with every perl installation), you would add one extra step after running your CDBI app: running the separate utility dprofpp (also included with every perl installation) to see the pretty listing of summary statistics for all the sub calls. It's when you run dprofpp ("pretty-print") that you can use command-line options to specify how to sort the entries.
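That two-step workflow, using the test.pl from above, looks roughly like this:

    perl -d:DProf test.pl   # run the app; the profile lands in tmon.out
    dprofpp -r tmon.out     # pretty-print the sub-call stats, sorted by real time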
Anyway, you say it's a big app with CDBI all through it, but you are considering a straight DBI approach to see if that will speed things up. Looking at your numbers, my initial guess is that going to straight DBI, and managing the layout of your tables with your own data structures (rather than relying on the CDBI OO machinery to do this for you), is likely to cut down noticeably on the overall runtime...
(... unless of course you happen to do a really poor job of factoring out the CDBI stuff.)
I've never used CDBI myself, so I don't have a feel for its relative runtime-overhead vs. programmer-efficiency trade-offs. Maybe it's the kind of thing that makes great sense as a prototyping tool, or as a production solution for jobs that don't involve really heavy loads (e.g. having fewer rows and columns involved than you do).
Still, it's up to you to figure out whether you think a rewrite without CDBI is going to be worthwhile, because we don't know how complicated your app is, or what other constraints there are.
If you need some empirical evidence before committing to a major rewrite, a worthwhile test might be a very simple app that goes through the same amount of database content, with one or more queries that at least come close to what the real app is doing, but that doesn't do much else. It just has to be simple and quick to write both with and without CDBI, so you can benchmark the two approaches.
(I'll bet that with queries returning 200K rows and 10 columns per row from your database, you'll see a big difference when you take away the 2 million calls to that bunch of CDBI functions.)
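Something along these lines, perhaps; the DSN, credentials, widget table, and My::Widget CDBI class are all hypothetical stand-ins for your real schema:

    use strict;
    use warnings;
    use Benchmark qw(timethese);
    use DBI;

    # Hypothetical connection details -- substitute whatever your app uses.
    my $dbh = DBI->connect('dbi:mysql:testdb', 'user', 'pass',
                           { RaiseError => 1 });

    timethese(5, {
        raw_dbi => sub {
            my $sth = $dbh->prepare('SELECT * FROM widget');
            $sth->execute;
            while (my $row = $sth->fetchrow_hashref) { }   # walk every row
        },
        cdbi    => sub {
            my @rows = My::Widget->retrieve_all;   # hypothetical CDBI class for the same table
            $_->name for @rows;                    # force some accessor calls
        },
    });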
It looks like you picked the wrong profiler. You should use either Devel::DProf or Devel::Profiler, not Devel::Profile. The problem with Devel::Profile is that it measures CPU time, not the "real" time that has elapsed. Waiting hours for a database query might not take much CPU time at all. If you use a profiler like the two I mentioned, which let you sort by wall time ("wall clock", i.e. real elapsed time), you will see how much time was really spent waiting for your queries to execute.
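As I understand it, Devel::Profiler loads as an ordinary module rather than through the debugger hook, and writes the same tmon.out format as Devel::DProf, so the same dprofpp post-processing applies (check its docs to be sure):

    perl -MDevel::Profiler test.pl   # writes tmon.out, like Devel::DProf
    dprofpp -r tmon.out              # sort the results by real elapsed time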
Re: Using Devel::Profile output to know what to do next
by hv (Prior) on Jun 08, 2005 at 08:38 UTC
I've never used Class::DBI, but a quick look shows that the code for Class::DBI::Column is very simple. It might be worth hacking it to cache the lower-cased version of the column name, to see if that makes a difference:
use overload
    # stringify via the cached lower-cased name instead of calling lc() each time
    '""' => sub { shift->{'name_lc'} },
    fallback => 1;

[...]

sub new {
    my ($class, $name) = @_;
    return $class->SUPER::new(
        {
            name        => $name,
            name_lc     => lc($name),   # compute the lower-cased name once
            _groups     => {},
            placeholder => '?',
        }
    );
}

sub name_lc { shift->{'name_lc'} }
That should save quite a chunk from the first two lines of your profile output:

    %Time     Sec.   #calls  sec/call  F name
    14.67  55.4872  2174052  0.000026    <anon>:...5.8.2/Class/DBI/Column.pm:37
     9.55  36.1474  2178574  0.000017    Class::DBI::Column::name_lc
Be careful though if any other classes are inheriting from CDBI::Column - you'd need to check that this hack is compatible with their expectations. (I'd expect it to be ok though, as long as inheritors call their SUPER::new() to construct.)
Hugo
Hang on, cowboy. You need to profile first to see if that method is taking up any real time or not. Your work could be totally pointless if the lowercase calls don't take a significant part of the actual wall time.
Well, those two routines are taking 91 seconds of CPU time, so unless it's a multi-processor machine or a threaded application, they won't be taking less than 91 seconds of wall clock time.
The numbers suggest a total CPU time of around 6.5 minutes; it would be useful to hear from the OP what the actual runtime is, but I'd expect my proposed hack to take more than a minute off it, replacing the 26+17 μs for 2.2 million calls with something more like the 10 μs of the accessor: roughly (26+17-10) μs across 2.2 million calls is about 73 seconds saved.
But it is true - I'm used to profiling on a busy server, where useful wall clock times are hard-to-impossible to get, so mostly I concentrate on profiling to optimise CPU requirements, and use a combination of two other techniques for optimising the database side - the ever-popular "finger in the air" and "it hurts when I do this" techniques.
Hugo
Re: Using Devel::Profile output to know what to do next
by samtregar (Abbot) on Jun 08, 2005 at 23:14 UTC
If you've got a good feeling that your problem is database related (which you might get from watching top while something slow happens and seeing mysqld pegging the CPU, for example) then you'll get more mileage from DBI's built-in profilers than from something that only looks at the Perl side. Check out DBI::ProfileDumper for the big guns. If your app is simple enough you could also try DBI::Profile.
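If memory serves, the easiest way to switch these on is the DBI_PROFILE environment variable; treat the exact values and the dbiprof options as things to check against your DBI version:

    # simple built-in profiling; a summary is printed to STDERR at exit
    DBI_PROFILE=2 perl test.pl

    # the big guns: DBI::ProfileDumper writes a dbi.prof file for later analysis
    DBI_PROFILE='!Statement/DBI::ProfileDumper' perl test.pl
    dbiprof --number 15   # report the 15 most expensive statements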
-sam
Re: Using Devel::Profile output to know what to do next
by TedPride (Priest) on Jun 08, 2005 at 02:49 UTC
In almost all cases, the database is what needs optimizing, not your script. Remove dead records and optimize your tables often; set up indexes for large, high volume queries; and make sure that you're not using a series of inefficient queries when one query will do. Also, use the smallest field size possible for each piece of data, and stay away from variable length fields unless you really have to use them. If you do use them, always put them at the end of the record, and be sure to optimize your tables even more often to remove dead space.
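To make the "one query will do" point concrete, a minimal DBI sketch (the widget table, its columns, and $dbh are hypothetical stand-ins for your own schema):

    # index a column that large, high-volume queries filter or group on
    $dbh->do('CREATE INDEX idx_widget_owner ON widget (owner_id)');

    # one aggregate query instead of pulling rows back to sum them in Perl
    my $sums = $dbh->selectall_arrayref(
        'SELECT owner_id, SUM(amount) FROM widget GROUP BY owner_id'
    );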
Don't guess: profile.
Re: Using Devel::Profile output to know what to do next
by crusty_collins (Friar) on Jun 08, 2005 at 04:38 UTC
My two cents. I know from experience that ignoring case or changing case can take up a lot of time.
asking the same question again, perhaps more clearly
by water (Deacon) on Jun 09, 2005 at 00:57 UTC
water here again.
clearly, the database dominates the time used.
assuming the algorithm driving the db isn't dumb (say, pulling rows back to sum them in perl when the db could have done the sum quicker, or, even better, when an agg table in the db could have precomputed and cached it),
then (and here's my question again): is it foolish to worry about the speed of the perl side?
thanks all for the good insights
water