Hello,
I have a performance tuning problem. I've written a program that appears to run correctly, but very slowly. I'd like to profile where the time is being spent, but I'm unsure how to profile a script of this kind. The basic processing loop is:
- select two elements at random from a 70k item array
- check that this pair has not yet been processed
- query a DB (via DBI) to get a list of properties for each of the two random elements
- next() if either of the elements has 0 properties
- foreach property-pair between the two elements, do a hash look-up to get a "property-pair score"
- find (and keep) the min property-pair score
- write the results to a file
- loop until we have found a certain number of element-pairs with min(property-pair score) above some threshold
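To make the loop concrete, here is a minimal sketch of its shape. It is not my actual code: the data is a small in-memory stand-in (the names `%properties`, `%pair_score`, and `find_pairs` are hypothetical), the DBI queries are replaced by hash look-ups, and only pairs above the threshold are recorded rather than written to a file.

```perl
use strict;
use warnings;
use List::Util qw(min);

# Hypothetical stand-in data. The real script has ~70k elements, fetches
# property lists through DBI, and writes its results to a file.
my @props    = qw(a b c d);
my @elements = map { "e$_" } 1 .. 20;
my (%properties, %pair_score);
for my $e (@elements) {
    $properties{$e} = [ grep { rand() < 0.5 } @props ];
}
for my $p (@props) {
    $pair_score{"$p|$_"} = rand() for @props;
}

# Returns the element pairs whose minimum property-pair score beats
# $threshold, stopping after $wanted hits or when every pair has been tried.
sub find_pairs {
    my ($threshold, $wanted) = @_;
    my $max_pairs = @elements * (@elements - 1) / 2;
    my (%seen, @kept);

    while (@kept < $wanted && keys(%seen) < $max_pairs) {
        # select two distinct elements at random
        my $i = int(rand(@elements));
        my $j = int(rand(@elements));
        next if $i == $j;
        ($i, $j) = ($j, $i) if $i > $j;

        # skip pairs that were already processed
        next if $seen{"$i,$j"}++;

        # stand-in for the two DBI queries
        my @pi = @{ $properties{ $elements[$i] } };
        my @pj = @{ $properties{ $elements[$j] } };
        next unless @pi && @pj;

        # minimum score over every property pair
        my $min = min(
            map { my $p = $_; map { $pair_score{"$p|$_"} } @pj } @pi
        );

        # stand-in for "write the results to a file"
        push @kept, [ $elements[$i], $elements[$j], $min ]
            if $min > $threshold;
    }
    return @kept;
}

my @hits = find_pairs(0.2, 5);
printf "%s %s %.3f\n", @$_ for @hits;
```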
I'll note that estimates suggest the program will need to do some 100 million iterations to hit our stopping criterion, and I'm okay with that taking several weeks of CPU.
There are some obvious avenues for tuning this. For example, I could do an initial screen on my 70k element array to remove elements that have no properties. I could also slurp the entire set of property-lists into memory (I'm not sure it will fit, but I could consider it). But before doing anything, I really want to find out where the code is slow.
What I'd really like to do is let it run for a small number of iterations, say 10k, and profile across all those iterations how much time was spent in each of the steps I listed above. Is there some way to do this kind of thing? Any other profiling tricks/advice I might benefit from?
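To show the kind of per-step accounting I mean, here is a rough sketch done by hand with the core Time::HiRes module. The `timed` helper and the stand-in subs `fetch_properties` and `min_pair_score` are hypothetical and only illustrate the idea; they are not taken from my script, and the wall-clock numbers include the overhead of the extra closure calls.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

my %elapsed;    # step label => accumulated wall-clock seconds
my %calls;      # step label => number of times the step ran

# Run $code, add its wall-clock time to the bucket named $label,
# and pass the return value(s) through.
sub timed {
    my ($label, $code) = @_;
    my $start  = time();
    my @result = $code->();
    $elapsed{$label} += time() - $start;
    $calls{$label}++;
    return wantarray ? @result : $result[0];
}

# Hypothetical stand-ins for the real steps (DBI query, hash look-ups).
sub fetch_properties { return [ 1 .. 3 ] }

sub min_pair_score {
    my ($px, $py) = @_;
    my $min;
    for my $p (@$px) {
        for my $q (@$py) {
            my $s = abs($p - $q);    # stand-in for the score look-up
            $min = $s if !defined $min || $s < $min;
        }
    }
    return $min;
}

for my $iter (1 .. 10_000) {
    my ($x, $y) = timed(pick => sub { (int(rand(70_000)), int(rand(70_000))) });
    my $px  = timed(query => sub { fetch_properties($x) });
    my $py  = timed(query => sub { fetch_properties($y) });
    my $min = timed(score => sub { min_pair_score($px, $py) });
}

# Report the buckets, slowest first.
for my $label (sort { $elapsed{$b} <=> $elapsed{$a} } keys %elapsed) {
    printf "%-6s %8d calls %10.4f s\n",
        $label, $calls{$label}, $elapsed{$label};
}
```

This gets tedious quickly, which is why I'm hoping there is a proper tool for it.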