water has asked for the wisdom of the Perl Monks concerning the following question:

I have a big app that uses Class::DBI heavily. I'd like to speed it up. Before profiling, my assumption was that the DB was the limiting factor. Below are some profiling results. All the big time eaters are CDBI (which means, I think, the DB itself).

How should I interpret these results?

For example, does this tell me that the perl stuff isn't taking any time (assuming the algorithm isn't dumb), and that to gain speedup I should pay attention to the database side: indexes, summary tables, and so forth?

I wouldn't gain any speed by dropping CDBI and using DBI directly, would I?

Is there any point optimizing the perl, when all the time seems to be in CDBI (i.e. the DB)?

Should I care that it seems to be spending a lot of time lowercasing column names in Class::DBI::Column::name_lc?

Thanks for any advice.

water

%Time   Sec.     #calls   sec/call  F  name
14.67  55.4872  2174052   0.000026     <anon>:...5.8.2/Class/DBI/Column.pm:37
 9.55  36.1474  2178574   0.000017     Class::DBI::Column::name_lc
 7.01  26.5282  2627968   0.000010     Class::Accessor::get
 6.52  24.6642   220518   0.000112  ?  <anon>:..._perl/5.8.2/Class/DBI.pm:723
 5.20  19.6819   264665   0.000074  ?  Class::DBI::_init
 4.97  18.7836   884369   0.000021     <anon>:...perl/5.8.2/Class/DBI.pm:1039
 4.51  17.0602   661572   0.000026  ?  <anon>:...rl/5.8.2/Class/Trigger.pm:51
 4.41  16.6713   583607   0.000029     Class::DBI::primary_column
 3.62  13.6949   661572   0.000021     <anon>:.../DBI/Relationship/HasA.pm:38
 3.01  11.3699   583837   0.000019     Class::DBI::ColumnGrouper::primary
 2.82  10.6654   266945   0.000040     Class::DBI::_attribute_store
 2.59   9.8148   220528   0.000045     Class::DBI::meta_info
 2.55   9.6606   220524   0.000044  ?  Class::DBI::_simple_bless
 2.19   8.2928   361192   0.000023     Class::DBI::_attrs
 1.92   7.2692   630109   0.000012     Class::DBI::ColumnGrouper::group_cols
 1.39   5.2500   220524   0.000024     <anon>:.../DBI/Relationship/HasA.pm:46
 1.36   5.1261   220524   0.000023     <anon>:.../DBI/Relationship/HasA.pm:37
 1.35   5.1138   220523   0.000023     <anon>:.../DBI/Relationship/HasA.pm:70
 1.25   4.7459       21   0.225996     DBI::st::execute
 1.13   4.2896    88172   0.000049     Class::DBI::get
 1.08   4.1012   264665   0.000015     <anon>:..._perl/5.8.2/Class/DBI.pm:518
 1.00   3.7662   176564   0.000021     <anon>:..._perl/5.8.2/Class/DBI.pm:722

Replies are listed 'Best First'.
Re: Using Devel::Profile output to know what to do next
by perrin (Chancellor) on Jun 08, 2005 at 01:50 UTC
    What options did you use with dprofpp? You need to tell it to sort by wall time, not CPU time. Database and other I/O stuff usually takes lots of wall time and very little CPU.
      This is my first time profiling. Can you say more? All I did was
      perl -d:Profile test.pl
      using Devel::Profile. What would be better?
        If instead of Devel::Profile (which is a non-core CPAN module) you used Devel::DProf (a core module included with every perl installation), you would add an extra step after running your CDBI app: running the separate utility dprofpp (also included with every perl installation) to see the pretty listing of summary statistics for all the sub calls. It's when you run dprofpp ("pretty-print") that you can use command-line options to specify how to sort the entries.
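
        For example, a minimal session looks like this (dprofpp's -r switch sorts by elapsed wall-clock time rather than CPU time):

            # profile under Devel::DProf; this writes tmon.out in the current directory
            perl -d:DProf test.pl

            # pretty-print the results, sorted by real (wall-clock) time
            dprofpp -r tmon.out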

        Anyway, you say it's a big app with CDBI all through it, but you are considering looking at a straight DBI approach to see if that will speed things up. Looking at your numbers, my initial guess is that going to straight DBI, and managing the layout of your tables with your own data structures (rather than relying on the CDBI OO stuff to do this for you) is likely to cut down noticeably on the overall runtime...

        (... unless of course you happen to do a really poor job of factoring out the CDBI stuff.)

        I've never used CDBI myself, so I don't have a feel for its relative runtime-overhead vs. programmer-efficiency trade-offs. Maybe it's the kind of thing that makes great sense as a prototyping tool, or as a production solution for jobs that don't involve really heavy loads (e.g. having fewer rows and columns involved than you do).

        Still, it's up to you to figure out whether you think a rewrite without CDBI is going to be worthwhile, because we don't know how complicated your app is, or what other constraints there are.

        If you need some empirical evidence before committing to a major rewrite, a worthwhile test might be a very simple app that goes through the same amount of database content, with one or more queries that at least come close to what the real app is doing, but doesn't do much else. It just has to be simple and quick to write both with and without CDBI, so you can benchmark the two approaches -- something like the sketch below.
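
        A rough sketch of such a benchmark, assuming a hypothetical My::CDBI::Widget class mapped to a widget table with an amount column (substitute your own CDBI class and connection details):

            use strict;
            use warnings;
            use Benchmark qw(timethese);
            use DBI;
            use My::CDBI::Widget;    # hypothetical Class::DBI class for the table

            my $dbh = DBI->connect('dbi:mysql:test', 'user', 'pass',
                                   { RaiseError => 1 });

            timethese(10, {
                # fetch and sum via Class::DBI objects
                cdbi => sub {
                    my $sum = 0;
                    $sum += $_->amount for My::CDBI::Widget->retrieve_all;
                },
                # the same work via plain DBI
                raw_dbi => sub {
                    my $sth = $dbh->prepare('SELECT amount FROM widget');
                    $sth->execute;
                    my $sum = 0;
                    while (my ($amount) = $sth->fetchrow_array) {
                        $sum += $amount;
                    }
                },
            });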

        (I'll bet that with queries returning 200K rows and 10 columns per row from your database, you'll see a big difference when you take away the 2 million calls to that bunch of CDBI functions.)

        It looks like you picked the wrong profiler. You should use either Devel::DProf or Devel::Profiler, not Devel::Profile. The problem with Devel::Profile is that it measures CPU time, not the "real" (elapsed) time. Waiting hours for a database query might not take much CPU time at all. If you use a profiler like the two I mentioned, which let you sort by wall-clock time, you will see how much time was really spent waiting for your queries to execute.
Re: Using Devel::Profile output to know what to do next
by hv (Prior) on Jun 08, 2005 at 08:38 UTC

    I've never used Class::DBI, but a quick look shows that the code for Class::DBI::Column is very simple. It might be worth hacking it to cache the lower-case version of the column name, to see if that makes a difference:

    use overload
        '""'     => sub { shift->{'name_lc'} },    # stringification now returns the cached value
        fallback => 1;

    [...]

    sub new {
        my ($class, $name) = @_;
        return $class->SUPER::new({
            name        => $name,
            name_lc     => lc($name),    # lower-case once, at construction time
            _groups     => {},
            placeholder => '?',
        });
    }

    sub name_lc { shift->{'name_lc'} }    # now a plain hash lookup

    That should save quite a chunk from the first two lines of your profile output:

    %Time   Sec.     #calls   sec/call  F  name
    14.67  55.4872  2174052  0.000026      <anon>:...5.8.2/Class/DBI/Column.pm:37
     9.55  36.1474  2178574  0.000017      Class::DBI::Column::name_lc

    Be careful though if any other classes are inheriting from CDBI::Column - you'd need to check that this hack is compatible with their expectations. (I'd expect it to be ok though, as long as inheritors call their SUPER::new() to construct.)

    Hugo

      Hang on, cowboy. You need to profile first to see whether that method is taking up any real time. Your work could be totally pointless if the lowercase calls don't account for a significant part of the actual wall time.

        Well, those two routines are taking 91 seconds of CPU time, so unless it's a multi-processor machine or a threaded application, they won't be taking less than 91 seconds of wall clock time.

        The numbers suggest total CPU time around 6.5 minutes; it would be useful to hear from the OP what the actual runtime is, but I'd expect my proposed hack to take more than a minute off it, replacing the 26+17 μs for 2.2 million calls with something more like the 10 μs of the accessor.
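
        As a quick sanity check of that estimate, a cmpthese() sketch along these lines (the hash fields mirror the hack above and are otherwise invented) compares re-lowercasing on every call against the cached lookup:

            use Benchmark qw(cmpthese);

            my $col = { name => 'SomeColumnName', name_lc => lc 'SomeColumnName' };

            cmpthese(-1, {
                # what name_lc effectively costs now: an lc() on every call
                lc_each_call => sub { my $x = lc $col->{name} },
                # what it costs after the hack: a plain hash lookup
                cached       => sub { my $x = $col->{name_lc} },
            });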

        But it is true - I'm used to profiling on a busy server, where useful wall clock times are hard to impossible to get, so mostly I concentrate on profiling to optimise CPU requirements, and use a combination of two other techniques for optimising the database side: the ever-popular "finger in the air" and "it hurts when I do this".

        Hugo

Re: Using Devel::Profile output to know what to do next
by samtregar (Abbot) on Jun 08, 2005 at 23:14 UTC
    If you've got a good feeling that your problem is database related (which you might get from watching top while something slow happens and seeing mysqld pegging the CPU, for example) then you'll get more mileage from DBI's built-in profilers than from something that only looks at the Perl side. Check out DBI::ProfileDumper for the big guns. If your app is simple enough you could also try DBI::Profile.
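
    As a rough sketch of switching that on (the connection details are placeholders), DBI profiling can be enabled from the shell or on a handle:

        # from the shell, without touching the code: profile every statement
        DBI_PROFILE=2 perl test.pl

        # or inside the app, on the database handle:
        $dbh->{Profile} = '!Statement';           # in-memory, keyed by statement text

        # for a big app, DBI::ProfileDumper writes dbi.prof for later
        # analysis with the dbiprof utility:
        $dbh->{Profile} = 'DBI::ProfileDumper';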

    -sam

Re: Using Devel::Profile output to know what to do next
by jplindstrom (Monsignor) on Jun 08, 2005 at 14:01 UTC
    I did the same thing you did and found out that unrolling the name_lc call bought some performance gain.

    The adventure is described in my journal.

    /J

Re: Using Devel::Profile output to know what to do next
by TedPride (Priest) on Jun 08, 2005 at 02:49 UTC
    In almost all cases, the database is what needs optimizing, not your script. Remove dead records and optimize your tables often; set up indexes for large, high-volume queries; and make sure that you're not using a series of inefficient queries when one query will do (see the sketch below). Also, use the smallest field size possible for each piece of data, and stay away from variable-length fields unless you really have to use them. If you do use them, always put them at the end of the record, and be sure to optimize your tables even more often to remove dead space.
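
    As a hedged illustration of the indexing and query-consolidation points (the orders table, its columns, and the index name are all invented):

        # support a frequent, large query with an index
        $dbh->do('CREATE INDEX idx_orders_customer ON orders (customer_id)');

        # instead of one query per customer ...
        for my $cust (@customer_ids) {
            my ($total) = $dbh->selectrow_array(
                'SELECT SUM(amount) FROM orders WHERE customer_id = ?',
                undef, $cust);
        }

        # ... let a single grouped query do the work
        my $totals = $dbh->selectall_arrayref(
            'SELECT customer_id, SUM(amount) FROM orders GROUP BY customer_id');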

      Don't guess -- profile.


Re: Using Devel::Profile output to know what to do next
by crusty_collins (Friar) on Jun 08, 2005 at 04:38 UTC
    My two cents.
    I know from experience that ignoring case or changing case can take up a lot of time.
asking the same question again, perhaps more clearly
by water (Deacon) on Jun 09, 2005 at 00:57 UTC
    water here again.

    clearly, the database dominates the time used.

    assuming the algorithm driving the db isn't dumb (say, pulling rows back to sum them in perl would likely be dumb when the db could have done the sum quicker -- or, even better, an agg table in the db could have precomputed the sum and cached it; see the sketch below),

    then (and here's my question again)

    is it foolish to worry about the speed of the perl side?
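
    for concreteness, here's a sketch of the dumb version against the smarter ones (the sales and sales_totals tables are invented):

        # dumb: pull every row back and sum in perl
        my $sum = 0;
        my $sth = $dbh->prepare('SELECT amount FROM sales WHERE region = ?');
        $sth->execute($region);
        $sum += $_->[0] for @{ $sth->fetchall_arrayref };

        # smarter: let the db do the sum
        my ($total) = $dbh->selectrow_array(
            'SELECT SUM(amount) FROM sales WHERE region = ?', undef, $region);

        # even better: read a sum the db has already precomputed into an agg table
        my ($cached) = $dbh->selectrow_array(
            'SELECT total FROM sales_totals WHERE region = ?', undef, $region);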

    thanks all for the good insights

    water