Hi Monks
I am writing a program that creates normalized "keys" for database values, which are better for searching and sorting (not that relevant for my question).
It is a fairly complex program, which is part of a large infrastructure of Perl modules. This specific development uses about 5 modules, and runs from the server.
My program (I think) is pretty fast. I can get normalized "keys" for about 40 values per second. This means that I can normalize 1,000 values in 25-30 seconds (depending on whether I need to update existing keys or add new ones).
The problem is that the table that I need to normalize is very large (more than a million entries), and therefore doing all the normalizations from scratch takes about 8 hours.
My question, which is more a general question about Perl than about my specific development, is: where am I losing time?
What parts of Perl are known to be a bit slower or less efficient than others? And what parts of Perl are super fast and should be used more?
I am using Perl 5.10, and am calling the DB about 5 times per run (avg DB call is about 0.0005 seconds).
Any ideas or advice will be most welcome.
Thanks
Guy Naamati
UPDATE: After using NYTProf I found out that I am dynamically creating a new instance of one of the modules I use each time I normalize. Hopefully, by creating the instance once at the start of the program, I can make my program run about 10% faster. Thanks for the advice, and any other (ideas|tips) will be welcome (this has turned into a bit of an interesting discussion).
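For illustration, here is a minimal sketch of that change. The class name My::Normalizer, its constructor, and its normalize() method are hypothetical stand-ins (the real module is not shown here), and the toy normalization only lowercases and strips whitespace.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the real normalization module; the name,
# constructor, and normalize() method are assumptions for illustration.
package My::Normalizer;

our $instances = 0;    # counts how many objects were constructed

sub new {
    my ($class) = @_;
    $instances++;      # a real constructor may do costly setup here
    return bless {}, $class;
}

sub normalize {
    my ( $self, $value ) = @_;
    # Toy normalization: lowercase, then strip all whitespace.
    ( my $key = lc $value ) =~ s/\s+//g;
    return $key;
}

package main;

my @values = ( 'Foo Bar', 'BAZ  qux', 'Hello World' );

# Before: a new object was built for every value, roughly like this:
#   my @keys = map { My::Normalizer->new->normalize($_) } @values;

# After: build the object once and reuse it for every value.
my $normalizer = My::Normalizer->new;
my @keys       = map { $normalizer->normalize($_) } @values;

print "$_\n" for @keys;
print "instances created: $My::Normalizer::instances\n";
```

How much this saves depends entirely on how expensive the real constructor is, so it is worth re-running the profiler after the change.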
UPDATE 2: After analyzing the profiler output, it seems that the actual act of normalization is the main "hotspot". When I use hard-coded values instead of normalized ones, the program runs about 6 times faster. Since the normalization is done by another program, which probably can't be changed, there isn't much I can do except create the normalization module once instead of on every call (as I stated above).
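A comparison like this can also be sketched with the core Benchmark module. The normalize() routine below is a hypothetical stand-in for the external normalization, and the sample data is made up, so the ratio it reports says nothing about the real code.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical stand-in for the external normalization routine.
sub normalize {
    my ($value) = @_;
    # Toy normalization: lowercase, then drop everything but a-z and 0-9.
    ( my $key = lc $value ) =~ s/[^a-z0-9]+//g;
    return $key;
}

# Made-up sample data standing in for database values.
my @values = map { "Some Value $_" } 1 .. 100;

# Run each variant 2000 times and print a comparison table of rates.
cmpthese(
    2000,
    {
        normalized => sub { my @keys = map { normalize($_) } @values },
        hard_coded => sub { my @keys = ('fixedkey') x @values },
    }
);
```

Benchmark only times small isolated snippets like this; for finding where a whole program spends its time, a profiler such as NYTProf remains the better tool.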
Thanks for everybody's help!
I want to see people using Perl to glue things together creatively, not just technically but also socially
----Larry Wall