I am not sure whether you are set on the process you have right now (running the program in the background and then emailing the result to the user). But I have done some work with online reports hitting a MySQL database (not a ton, I am still a newb, so all of you more experienced folks please correct me if you feel I am wrong), and have found that a couple thousand records should not be a major problem to run on the fly and return to the screen (or to an Excel, PDF, or similar file). I have found that the longest part is usually rendering the information on the user's screen.
A couple suggestions that may or may not help.
1. Make the SQL statement as fast as possible (you are probably thinking no S#$@). What I mean is that Perl can do a lot of the things you may be trying to do in SQL (like substr, decode in some cases, uppercase, etc).
a) It may also be faster to actually break your statement into multiple statements. So, instead of joining 10 tables, do multiple selects...
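To illustrate point 1, here is a minimal sketch of doing substr/decode/uppercase work in Perl on already-fetched rows instead of in the SQL. The row layout and the region lookup are hypothetical, just to show the pattern:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical rows as they might come back from $sth->fetchrow_arrayref:
# [ upc, description, region_code ]
my @rows = (
    [ '000000015232', 'organic milk',      'SW' ],
    [ '000000098765', 'whole wheat bread', 'NE' ],
);

# A DECODE-style lookup done in Perl instead of in the SQL
my %region_name = ( SW => 'Southwest', NE => 'Northeast' );

for my $row (@rows) {
    my ( $upc, $desc, $region ) = @$row;
    my $short_upc = substr( $upc, -12, 12 );    # SQL: substr(upc, -12, 12)
    my $desc_uc   = uc $desc;                   # SQL: upper(description)
    my $reg_name  = $region_name{$region} || 'Unknown';   # SQL: decode(...)
    print "$short_upc $desc_uc $reg_name\n";
}
```

The database then only ships raw columns, and the string manipulation costs you Perl time instead of database time.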
2. Take a look at Excel::Template (which does not appear to be on CPAN at the moment). I am not sure whether it is an OK alternative to your tab-delimited files or not...
We are currently set up with CGI::Application, HTML::Template, PDF::Template, and Excel::Template for our reports (with some other stuff mixed in).
OK, that is all for now. I am not sure whether this is even close to an option for you, but it may at least start some ideas for an alternate direction to go.
StrebenMonch:
Thanks for the ideas.
Optimizing the SQL is what I have been thinking about.
I attached the SQL here. The only thing I could see to eliminate is the calculation of the cost, where I used CASE to check for a null value and then used the SUM function again. Please correct me if you think any other parts could be done in Perl.
After bind_columns, should I then use a loop and an array to do the summation?
select
mvmnt_data.region
,store_name
,nvl(i_a.nat_upc,i_b.nat_upc)
,nvl(i_a.long_description,i_b.long_description)
,nvl(i_a.item_size,i_b.item_size)
,nvl(i_a.item_uom,i_b.item_uom)
,vsi.vend_item_num
,salesUnits
,salesDollars
,case_size
,cost
,ps.ps_prod_subteam
from
(select wm.upc,sum(wm.mvmnt_units) salesUnits
,sum(wm.mvmnt_dollars) salesDollars
,wm.store
,wm.pos_dept
,wm.case_size
,sum(round (case when wm.case_size is null or wm.case_cost=0
then null
else (wm.case_cost/wm.case_size) * wm.mvmnt_units end, +3)) cost
,s.store_name
,s.region
from stella.mvmnt_wkly wm,vim.store@vim s
where wm.mvmnt_date between to_date('10/01/2004', 'mm/dd/yyyy') and to_date('10/03/2004', 'mm/dd/yyyy')
and wm.store in ('10036')
and wm.store=s.ps_bu
--and wm.upc='000000015232'
and wm.pos_dept='0110'
group by wm.case_size
,s.region
,wm.pos_dept,
wm.upc
,wm.store
,s.store_name) mvmnt_data,
vim.item_region@vim ir,
vim.item_master@vim i_a,
vim.item_master@vim i_b,
vim.reg_pos_dept_ref@vim rpdr
,vim.ps_team_ref@vim ps
,vim.vendor_store_item@vim vsi
where mvmnt_data.region=ir.region(+)
and mvmnt_data.upc=substr(ir.upc(+),-12,12)
and substr(ir.nat_upc, 12, 12)=substr(i_a.nat_upc(+),-12,12)
and mvmnt_data.upc=substr(i_b.nat_upc(+),-12,12)
and mvmnt_data.region=rpdr.region(+)
and mvmnt_data.pos_dept=trim(leading 0 from rpdr.pos_dept(+))
and rpdr.pos_dept=ps.pos_dept
and rpdr.region=ps.region
and mvmnt_data.store=ps.ps_bu
and mvmnt_data.upc=substr(vsi.upc,-12,12)
and vsi.region=ps.region
and vsi.ps_bu=mvmnt_data.store
and vsi.v_auth_status='P'
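On the bind_columns question above: yes, the SUM/CASE work can be pulled out of the SQL and accumulated in a Perl hash while fetching. A minimal sketch, with simulated rows standing in for what a bind_columns/fetch loop would hand you (column layout is an assumption based on the query):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Simulated detail rows: [ upc, mvmnt_units, mvmnt_dollars, case_size, case_cost ]
my @rows = (
    [ '000000015232', 10, 25.50, 12,    24.00 ],
    [ '000000015232',  5, 12.75, 12,    24.00 ],
    [ '000000098765',  3,  9.00, undef, 18.00 ],   # null case_size
);

my %totals;
for my $row (@rows) {
    my ( $upc, $units, $dollars, $case_size, $case_cost ) = @$row;
    $totals{$upc}{units}   += $units;
    $totals{$upc}{dollars} += $dollars;

    # Same guard as the SQL CASE: skip when case_size is null/zero or
    # case_cost is 0.  Like the SQL, round each term to 3 places first.
    if ( defined $case_size && $case_size != 0 && $case_cost != 0 ) {
        $totals{$upc}{cost} +=
            sprintf( '%.3f', ( $case_cost / $case_size ) * $units );
    }
}

for my $upc ( sort keys %totals ) {
    my $t = $totals{$upc};
    my $cost = defined $t->{cost} ? sprintf( '%.3f', $t->{cost} ) : 'NULL';
    printf "%s units=%d dollars=%.2f cost=%s\n",
        $upc, $t->{units}, $t->{dollars}, $cost;
}
```

This removes the GROUP BY/SUM work from Oracle, at the price of fetching every detail row over the wire, so it is worth benchmarking both ways.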
Jewang
This is where I might try to break this into two selects (more experienced SQL people, feel free to correct me).
Query 1
===========================================================
SELECT ir.region
,ir.upc
,i_b.nat_upc
,rpdr.region
,rpdr.pos_dept
,ps.pos_dept
,ps.ps_bu
,vsi.upc
,vsi.ps_bu
,store_name
,NVL(i_a.nat_upc,i_b.nat_upc)
,NVL(i_a.long_description,i_b.long_description)
,NVL(i_a.item_size,i_b.item_size)
,NVL(i_a.item_uom,i_b.item_uom)
,vsi.vend_item_num
,ps.ps_prod_subteam
FROM vim.item_region@vim ir
,vim.item_master@vim i_a
,vim.item_master@vim i_b
,vim.reg_pos_dept_ref@vim rpdr
,vim.ps_team_ref@vim ps
,vim.vendor_store_item@vim vsi
WHERE substr(ir.nat_upc, 12, 12)=substr(i_a.nat_upc(+),-12,12)
AND rpdr.pos_dept = ps.pos_dept
AND rpdr.region = ps.region
AND vsi.region = ps.region
AND vsi.v_auth_status = 'P'
===========================================================
The first 9 columns in the select are new and will be used in the second select.
Query 2
===========================================================
SELECT s.region
,wm.upc
,sum(wm.mvmnt_units) AS salesUnits
,sum(wm.mvmnt_dollars) AS salesDollars
,wm.store
,wm.pos_dept
,wm.case_size
,SUM(ROUND(CASE WHEN wm.case_size IS NULL OR wm.case_cost=0 THEN NULL
ELSE (wm.case_cost/wm.case_size) * wm.mvmnt_units END, 3)) AS cost
,s.store_name
FROM stella.mvmnt_wkly wm
,vim.store@vim s
WHERE wm.upc = ?
AND s.region = ?
AND wm.pos_dept = ?
AND wm.store = ?
AND wm.mvmnt_date between to_date('10/01/2004', 'mm/dd/yyyy') and to_date('10/03/2004', 'mm/dd/yyyy')
AND wm.store in ('10036')
AND wm.store = s.ps_bu
AND wm.pos_dept = '0110'
GROUP BY wm.case_size
,s.region
,wm.pos_dept
,wm.upc
,wm.store
,s.store_name
===========================================================
So what I would do is call the first query, and while looping through it do the proper substr on the data in Perl; then call query 2 with bind values taken from each row.
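The "do the join in Perl" idea might look roughly like this. The rows are simulated here (the real code would fetch them via DBI), and the column layout is an assumption; the point is normalizing the UPC with Perl's substr and using a hash as the lookup instead of an outer join:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Simulated query-1 (item) rows: [ raw_upc, description, subteam ]
my @item_rows = (
    [ 'XX000000015232', 'MILK 1GAL',  'DAIRY' ],
    [ 'XX000000098765', 'BREAD LOAF', 'BAKERY' ],
);

# Build a lookup keyed on the normalized UPC, doing the SQL
# substr(upc, -12, 12) in Perl instead:
my %item;
for my $row (@item_rows) {
    my ( $raw_upc, $desc, $subteam ) = @$row;
    my $key = substr( $raw_upc, -12 );           # last 12 characters
    $item{$key} = { desc => $desc, subteam => $subteam };
}

# Simulated query-2 (movement) rows: [ upc, salesUnits, salesDollars ]
my @mvmnt_rows = (
    [ '000000015232', 15, 38.25 ],
    [ '000000098765',  3,  9.00 ],
);

# "Join" in Perl: look each movement row up in the item hash
for my $row (@mvmnt_rows) {
    my ( $upc, $units, $dollars ) = @$row;
    my $i = $item{$upc} or next;                 # no item match: skip the row
    printf "%s %-10s %-6s units=%d dollars=%.2f\n",
        $upc, $i->{desc}, $i->{subteam}, $units, $dollars;
}
```

With DBI you would prepare query 2 once with placeholders and re-execute it inside the loop, which avoids re-parsing the statement each time.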
I would try to optimize query 2 as much as possible by creating indexes if need be and looking at an execution plan. For example, this is what I use with Oracle (EXPLAIN PLAN and plan_table are Oracle features):
deleteplan.sql
===========================================================
delete from plan_table
where statement_id = 'TEST'
/
commit
/
===========================================================
test.sql
===========================================================
@deleteplan
EXPLAIN PLAN set statement_id = 'TEST' for
SQL STATEMENT GOES HERE
/
@getplan
===========================================================
getplan.sql (there is some other information you can get from the plan_table, but this is all I need most of the time)
===========================================================
select operation, options, object_name, cost, cardinality, TO_CHAR(timestamp, 'HH:MI:SS')
from plan_table
where statement_id = 'TEST'
start with id = 0
connect by prior id=parent_id and prior statement_id = statement_id
/
===========================================================
So I just put my SQL in test.sql and run it to get my execution plan. From there you can see where your problems are.
Make the on-line script just log the request and return immediately, with a message like "your query is being processed - you will receive the result in your mail box."
Have another process do the database/mail processing, possibly using a queue.
| [reply] |
fglock:
Thanks for the response. But what do you mean by "log the request" and "using a queue"?
This is how I get the request in the cgi script:
my $beginDate = $query->param('beginDate') ? $query->param('beginDate') : "";
my $endDate = $query->param('endDate') ? $query->param('endDate') : "" ;
my $region = $query->param('region') ? $query->param('region') : "";
my $frmdept = $query->param('dept') ? $query->param('dept') : "";
my $frmstorelist=$query->param('storelist') ? $query->param('storelist') :"";
Then I pass them to "cost_movement_query.pl":
my $command="/inetpub/dev/wangj/Vim_Stella/cost_movement_query.pl $region $frmstorelist $frmdept $beginDate $endDate";
exec("$command");
In "Cost_movement_query.pl", I run the SQL and send the email.
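As an aside on the snippet above: interpolating raw CGI parameters into a single string for exec goes through the shell, so a crafted parameter could run arbitrary commands. Validating each value and using the list form of exec/system avoids that. A sketch (the patterns and the hardcoded sample values are assumptions, not from the thread):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical validator: die unless the value matches an expected shape.
sub clean {
    my ( $value, $pattern ) = @_;
    die "bad parameter: $value\n" unless $value =~ $pattern;
    return $value;
}

# Sample values standing in for $query->param(...) results
my $beginDate = clean( '10/01/2004', qr{^\d{2}/\d{2}/\d{4}$} );
my $endDate   = clean( '10/03/2004', qr{^\d{2}/\d{2}/\d{4}$} );
my $region    = clean( 'SW',         qr{^\w+$} );
my $dept      = clean( '0110',       qr{^\d+$} );
my $stores    = clean( '10036',      qr{^[\d,]+$} );

# List form: no shell is involved, so metacharacters in a parameter
# cannot be reinterpreted as shell syntax.
my @command = (
    '/inetpub/dev/wangj/Vim_Stella/cost_movement_query.pl',
    $region, $stores, $dept, $beginDate, $endDate,
);
print "would run: @command\n";
# exec @command or die "exec failed: $!";
```

Taint mode (`perl -T`) would force this kind of validation on all CGI input.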
How can I use a queue?
thanks.
Jewang
"log the request" - write the query to a file.
"using a queue" - read the queries back and process them, one at a time. You can even use "cron" to coordinate the calls to "Cost_movement_query.pl".
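A minimal sketch of that log-and-queue idea, with a temp file standing in for the real spool file location (the field layout is an assumption):

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Fcntl qw(:flock);
use File::Temp qw(tempfile);

# Hypothetical spool file standing in for the real queue location
my ( $tmp_fh, $spool ) = tempfile();
close $tmp_fh;

# --- CGI side: "log the request" (append one line per request) ---
sub log_request {
    my ( $file, @params ) = @_;
    open my $fh, '>>', $file or die "open $file: $!";
    flock $fh, LOCK_EX or die "flock: $!";          # serialize appends
    print {$fh} join( "\t", @params ), "\n";
    close $fh;
}

log_request( $spool, 'SW', '10036', '0110', '10/01/2004', '10/03/2004' );

# --- cron side: "use a queue" (read requests back, one at a time) ---
open my $in, '<', $spool or die "open $spool: $!";
while ( my $line = <$in> ) {
    chomp $line;
    my ( $region, $stores, $dept, $begin, $end ) = split /\t/, $line;
    # Here the cron job would run cost_movement_query.pl and mail the result.
    print "processing: region=$region stores=$stores dept=$dept\n";
}
close $in;
unlink $spool;
```

The CGI script then returns instantly, and the cron-driven reader works through the file at its own pace.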
There are a lot of issues you have to deal with in your scenario, but here is one tip you may find useful for dealing with long-running CGI processes.
#!/usr/bin/perl
use warnings;
use strict;
#Benjamin Goldberg
#The Apache process does NOT wait for the CGI process to exit, it waits
#for the pipes to be closed. (Note that if the CGI is a NPH script, then
#it does waitpid() for the script to exit).
#Thus, you do not really need a child process -- you can have your CGI
#process transmute itself into whatever you would have the child process
#do, by doing something like:
$| = 1; # need either this or to explicitly flush stdout, etc.
print "Content-type: text/plain\n\n";
print "Going to start the fork now\n";
open( STDIN, "</dev/null");
open( STDOUT, ">>/dev/null");
open( STDERR, ">>/path/to/logfile");
fork and exit;
exec "program that would be the child process";
#Apache waits for all children to close the pipes to apache. It
#does not wait for them to actually exit -- think ... how in the world
#could it *possibly* know that those processes were really spawned from
#the CGI process? Answer: It can't. It can only know that *something*
#still has an open filedescriptor which points to one of the pipes that
#apache created to capture the CGI's stdout and stderr. As long as one
#of these pipes is open, then apache waits.
Hi,
I just checked with the folks at my company; we are running IIS 6.0 (Microsoft Internet Information Services), and Windows boxes don't seem to have a "/dev/null". :-)
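For portability, Perl's core File::Spec module knows the right null device per platform ('/dev/null' on Unix, 'nul' on Windows), so the redirect trick above can be written without hardcoding either path. A sketch:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Spec;

# File::Spec->devnull returns the platform's null device, so the
# filehandle redirects work on Unix and Windows alike.
my $null = File::Spec->devnull;
print "null device here is: $null\n";

open STDIN, '<', $null or die "reopen STDIN: $!";
# STDOUT and STDERR would be redirected the same way in the real script:
# open STDOUT, '>>', $null                or die "reopen STDOUT: $!";
# open STDERR, '>>', '/path/to/logfile'   or die "reopen STDERR: $!";
```

Note that the fork-and-exit detach trick itself also behaves differently under IIS than under Apache, so it would still need testing on your setup.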
Thanks. You guys are all great.
Jewang