Lori713 has asked for the wisdom of the Perl Monks concerning the following question:

Howdy,

I am seeing some issues when trying to render a web page that contains a large table of data. I chatted a bit yesterday evening about this in the chat box and did some checking to see where the most time is spent.

It appears the time hog is occurring when processing my foreach loop. I have simplified my code and it is below.

In summary, I have three pieces of SQL I run (see Lori713's scratchpad), and the total rows returned for all three is approx. 75,000. On smaller project numbers (say 500 rows), once I get through my foreach loop, the three datasets are nicely smooshed together to form one row for each project number, and it works beautifully, if a bit more slowly than I would like:

project ID        Bud amount       Act amount        Enc amount

I noticed that it is taking a very long time to process each row. Could this be because of the sort in the foreach? Does putting
 foreach $key ( sort keys %ctrl_tbl )
cause it to sort each time through the control table, or is the cost for the sort observed only the first time through?
#!/usr/local/bin/perl5_8
use strict;
use ncw_com_library;    # contains common subs (commify, timeout, etc.)
use HTML::Template;
use Time::Local;
use DBI;
use CGI ':standard';

my $CGI = CGI->new;

# Clear buffers and set up web page (required)
$|=1;
open STDERR, ">&STDOUT";

# SNIP there are ~25,000 unique keys in my control table and I have snipped
# out the SQL and database calls that generate my control table dataset for
# those since they are returning results in a timely fashion based on
# timestamps printed while fetching and executing SQL.

my ($key, %col_cbud, %col_encu, %col_fytd, %col_proj, %ctrl_tbl, @loop_data);

foreach $key ( sort keys %ctrl_tbl ) {
    my %row_data;

    # Gimme a timestamp so I can see how long each row takes
    my ( $sec1,$min1,$hour1,$mday1,$mon1,$year1,$wday1,$yday1,$isdst1 ) = localtime;
    print "during foreach loop hour min sec $hour1 $min1 $sec1 xxx key is $key <br>";

    $row_data{col_proj} = $key;
    $row_data{col_cbud} = $col_cbud{$key};
    $row_data{col_fytd} = $col_fytd{$key};
    $row_data{col_encu} = $col_encu{$key};

    # Add each hash row to loop for template
    push(@loop_data, \%row_data);
}

# Pass parameters and variable values from @loop arrays to template; print report
$template->param( passdata => \@loop_data, );
print $template->output();
So far, as I try to run the report, my timestamp prints look like this (the first 100 rows):
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxbefore foreach loop hour min sec 11 13 44
during foreach loop hour min sec 11 13 45 xxx key is 649001
during foreach loop hour min sec 11 14 43 xxx key is 649001-60426-F
SNIP
during foreach loop hour min sec 12 3 44 xxx key is 660062-05279
during foreach loop hour min sec 12 4 19 xxx key is 660062-15279
We are beginning to see more and more issues with pages timing out as our co-workers ask for more sophisticated programs (I work at a public university). I think we need to start looking into alternative ways to address this issue and come up with a coding method to incorporate into the applications we're writing.

I would appreciate any suggestions or ideas as to what might be causing such a large block of time being spent on each row, and for any ideas about future coding that can take into account rendering large data tables on the browser.

Thanks!

Replies are listed 'Best First'.
Re: Foreach loop takes a long time to process and my report times out before page is rendered
by GrandFather (Saint) on May 19, 2011 at 01:41 UTC

    As shown in your OP the for loop does nothing that takes any time at all (except maybe the diagnostic print). It is code you haven't shown us that is taking the time.
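To make that point concrete, here is a minimal sketch that times a loop of the same shape over 25,000 made-up keys and counts how often the sorted list is built. The key format, the sample amounts, and the helper name `sorted_keys` are assumptions for illustration and are not taken from the OP's data. In Perl the list given to `foreach` is evaluated once, before the first iteration, so the sort is not repeated per row.

```perl
use strict;
use warnings;
use Time::HiRes ();

# Synthetic stand-ins for the OP's hashes: 25,000 made-up keys.
my ( %ctrl_tbl, %col_cbud );
for my $n ( 1 .. 25_000 ) {
    my $key = sprintf 'P%06d', $n;
    $ctrl_tbl{$key} = 1;
    $col_cbud{$key} = $n * 10;
}

# Hypothetical wrapper that counts how many times the sorted list is built.
my $sort_calls = 0;
sub sorted_keys {
    my ($hash) = @_;
    $sort_calls++;
    my @keys = sort keys %$hash;
    return @keys;
}

my @loop_data;
my $start = Time::HiRes::time();
foreach my $key ( sorted_keys( \%ctrl_tbl ) ) {
    # Same kind of work as the OP's loop body: copy values into a row hash.
    push @loop_data, { col_proj => $key, col_cbud => $col_cbud{$key} };
}
my $elapsed = Time::HiRes::time() - $start;

printf "%d rows, sorted list built %d time(s), %.3f seconds\n",
    scalar @loop_data, $sort_calls, $elapsed;
```

If a run like this finishes quickly on the same machine, the slow part is more likely in the snipped code than in the loop itself.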

    Don't hide code or information on your scratchpad that is pertinent to your question. That stuff won't be there in a couple of days, so your question becomes worthless to future perusers of SOPW. Maybe you don't care about that so long as you get the answer you need, but I suspect giving a little back to the community is something you'd be happy to do, so please ensure all pertinent information is available in the OP (inside readmore tags if it's large).

    True laziness is hard work
Re: Foreach loop takes a long time to process and my report times out before page is rendered
by Anonymous Monk on May 18, 2011 at 17:20 UTC
    Here are some thoughts I think apply.

    One problem: you're using templating, which requires that you build a very big data structure in memory just so you can print it out.

    The solution is to NOT build a giant data structure in memory, and print each row right after you retrieve it.
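A minimal sketch of that idea, assuming the rows arrive one at a time from some iterator. The sub name `stream_table` and the sample rows are hypothetical; with DBI the iterator could be `sub { $sth->fetchrow_arrayref }`. Values are assumed to be HTML-safe already, so no escaping is done here.

```perl
use strict;
use warnings;

# Print one table row per fetched record instead of collecting them first.
# $next_row: code ref returning the next row (array ref), or undef when done.
# $out:      filehandle to print to.
sub stream_table {
    my ( $next_row, $out ) = @_;
    my $count = 0;
    print {$out} "<table>\n";
    while ( my $row = $next_row->() ) {
        # Missing values become empty cells.
        print {$out} '<tr>',
            ( map { '<td>' . ( defined $_ ? $_ : '' ) . '</td>' } @$row ),
            "</tr>\n";
        $count++;
    }
    print {$out} "</table>\n";
    return $count;
}

# Usage with made-up rows standing in for a statement handle.
my @rows = ( [ '649001', 100, 40, 10 ], [ '660062-05279', 250, undef, 5 ] );
my $i = 0;
stream_table( sub { $rows[ $i++ ] }, \*STDOUT );
```

Only one row is held in memory at a time, and the browser starts receiving output as soon as the first row is fetched.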

    All of these variables seem redundant

    my ($key, %col_cbud, %col_encu, %col_fytd, %col_proj, %ctrl_tbl, @loop_data);
    and I can tell without even seeing your template.

    You might say but I need all those to smoosh the datasets together.

    Well, that's what views are for. If you underutilize your database, you end up replicating a lot of its functionality. Databases are usually very good (well, good enough, depending on the data) at smooshing a bunch of datasets together.

    In a recent thread, before I remembered about a special view called a pivot table, I ended up replicating the feature in perl, see Re: Open multiple file handles?. It took a while to write, a lot longer than it would to use a pivot table.

    Leverage your database: study the SQL and optimize it; it's cheaper than reimplementing in Perl. I realize it's possible you may have good reasons for doing things in Perl :)

    Now the real trick to speeding up slow, long running programs, CGI or otherwise, is to refuse to run more than one (schedule only one) and cache the results.

    For a technique see Watching long processes through CGI (Aug 02)
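The "run only one and cache the results" idea above can be sketched roughly as follows. This is not the technique from the linked node, just one possible shape: the helper names `cached_report` and `read_file`, the `.lock` and `.tmp` file suffixes, and the 60-second freshness window are all assumptions.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);

# Read a whole file into a string.
sub read_file {
    my ($path) = @_;
    open my $in, '<', $path or die "read $path: $!";
    local $/;
    my $content = <$in>;
    close $in;
    return $content;
}

# Return the cached report if it is fresh enough; otherwise rebuild it
# while holding an exclusive lock so only one process does the slow work.
# Returns undef when another process is already rebuilding.
sub cached_report {
    my ( $cache_file, $max_age_secs, $build ) = @_;

    if ( -e $cache_file ) {
        my $mtime = ( stat $cache_file )[9];
        return read_file($cache_file) if time() - $mtime <= $max_age_secs;
    }

    open my $lock, '>>', "$cache_file.lock" or die "lock: $!";
    unless ( flock $lock, LOCK_EX | LOCK_NB ) {
        close $lock;
        return;    # caller can show a "report is being generated" page
    }

    my $report = $build->();

    # Write to a temp file and rename so readers never see a partial cache.
    open my $out, '>', "$cache_file.tmp" or die "write: $!";
    print {$out} $report;
    close $out or die "close: $!";
    rename "$cache_file.tmp", $cache_file or die "rename: $!";

    close $lock;    # closing the handle releases the lock
    return $report;
}
```

A CGI script would call `cached_report($file, 60, sub { ... slow report code ... })` and print either the returned page or a short "still working, reload shortly" notice when it gets undef back.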

Re: Foreach loop takes a long time to process and my report times out before page is rendered
by Neighbour (Friar) on May 19, 2011 at 07:21 UTC

    Another tip:

    • Do not use sort in foreach $key ( sort keys %ctrl_tbl ). If you want to process your query results sorted in some order, use ORDER BY in your query and fetch the results using selectall_arrayref or fetchall_arrayref. Then loop through the resulting array.
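A minimal sketch of looping over rows that the database has already ordered. The column order, the sub name `rows_to_loop_data`, and the SQL in the comment (table `project_totals` and its columns) are hypothetical; only the `col_*` template keys come from the OP's code.

```perl
use strict;
use warnings;

# Convert rows that are already sorted (array ref of array refs, the shape
# selectall_arrayref returns by default) into the array of hashes that an
# HTML::Template loop expects. No sort and no per-key hash lookups needed.
sub rows_to_loop_data {
    my ($rows) = @_;
    my @loop_data;
    for my $row (@$rows) {
        my ( $proj, $cbud, $fytd, $encu ) = @$row;
        push @loop_data, {
            col_proj => $proj,
            col_cbud => $cbud,
            col_fytd => $fytd,
            col_encu => $encu,
        };
    }
    return \@loop_data;
}

# With DBI the rows might come from something like (hypothetical query):
#   my $rows = $dbh->selectall_arrayref(
#       'SELECT proj_id, bud_amt, act_amt, enc_amt
#          FROM project_totals ORDER BY proj_id' );
my $rows = [ [ '649001', 100, 40, 10 ], [ '660062-05279', 250, 0, 5 ] ];
my $loop_data = rows_to_loop_data($rows);
print "$_->{col_proj}\n" for @$loop_data;
```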

Re: Foreach loop takes a long time to process and my report times out before page is rendered
by jacaril (Beadle) on May 18, 2011 at 16:51 UTC

    A few options mostly from what I'm seeing here.

    It looks like you are precaching the entire table to a set of hashes and then sorting them and looking up each key individually. If you saved off the data to an array you could save time on the sort call and hash lookups.

    An option I do use a bit is Ajax. Set up a back-end process which creates the report, and have the Ajax calls populate a div when the report is complete.