Re: Making my Perl program a bit faster
by moritz (Cardinal) on Jul 08, 2009 at 12:37 UTC
The obvious advice is "profile your program, find hotspots, optimize those". Devel::NYTProf is a great profiler; I can really recommend it.
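For reference, NYTProf is driven from the command line; a minimal sketch, assuming your script is called process_keys.pl (the name is made up):

    # Run the script under the profiler, collecting timing data:
    perl -d:NYTProf process_keys.pl

    # Turn the raw data into an HTML report (written to ./nytprof/):
    nytprofhtml

The per-line timings in nytprof/index.html usually make the hotspots obvious.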
My second piece of advice is to try to let the database do as much work for you as possible.
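For instance, filtering and sorting in SQL avoids shipping rows to Perl only to throw them away. A minimal DBI sketch; the DSN, table, and column names are all invented:

    use strict;
    use warnings;
    use DBI;

    my $dbh = DBI->connect('dbi:mysql:dbname=mydb', 'user', 'pass',
                           { RaiseError => 1 });

    # Let the database do the filtering and ordering, instead of
    # fetching every row and sifting through them in Perl:
    my $rows = $dbh->selectall_arrayref(
        'SELECT id, value FROM entries WHERE status = ? ORDER BY id',
        {}, 'pending',
    );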
If your program processes one database entry at a time, you could also try to parallelize it.
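If the entries are independent, a module like Parallel::ForkManager makes that straightforward. A minimal sketch; the worker count, @chunks, and process_entries() are all illustrative:

    use strict;
    use warnings;
    use Parallel::ForkManager;

    my $pm = Parallel::ForkManager->new(4);   # at most 4 worker processes

    for my $chunk (@chunks) {    # @chunks = your entries, pre-split
        $pm->start and next;     # forks; parent goes on to the next chunk
        process_entries($chunk); # child does the real work
        $pm->finish;             # child exits
    }
    $pm->wait_all_children;

Note that each child needs its own database connection; a DBI handle should not be shared across a fork.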
Update: There's also a document about performance in the newest (not yet released) perl version: pod/perlperf.pod in perl.git.
Re: Making my Perl program a bit faster
by marto (Cardinal) on Jul 08, 2009 at 12:32 UTC
Not yet, but I will look into it.
Re: Making my Perl program a bit faster
by jethro (Monsignor) on Jul 08, 2009 at 12:49 UTC
Your question is not only general, it is too general. For example, a regex can be fast, but if you introduce lots of backtracking into it, it can become very slow.
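The classic illustration is a pattern with nested quantifiers (my example, not something from your code):

    use strict;
    use warnings;

    my $str = ('a' x 25) . 'b';

    # On a failing match the engine tries an exponential number of ways
    # to split the a's between the two '+' quantifiers, so this line
    # can run for a very long time:
    print "matched\n" if $str =~ /^(a+)+$/;

    # The unambiguous equivalent fails instantly:
    print "matched\n" if $str =~ /^a+$/;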
There is a chapter on efficiency in the official Perl book ("Programming Perl" by Larry Wall et al.; page 537 in my edition), which might have some useful tips for you.
I would suggest profiling your program. The "recommended by perlmonks" module of the moment seems to be Devel::NYTProf.
Re: Making my Perl program a bit faster
by graff (Chancellor) on Jul 08, 2009 at 16:31 UTC
You said:
I ... am calling the DB about 5 times per run...
Does "one run" mean one command-line execution of the script? How many keys do you generate in one run?
If you are doing lots of runs, part of the slowness will be at the level of the shell, having to build up and tear down a process for each run. Ideally, one run (lasting up to 8 hours or whatever) should minimize this sort of processing overhead.
Apart from that, it's probably more a question of algorithm, and you haven't given us any clues on this. How complicated is the procedure to come up with "normalized keys"? How big does the script really need to be to do this?
And how often do you have to come up with "normalized keys" for a million entries? (Does this need to be done repeatedly? If not, just run it for 8 hours and be done with it -- why worry about timing?)
Does "one run" mean one command-line execution of the script?
Each time I create a new "key" for a value is a run (perhaps I should have used a different word). Therefore I am doing over a million runs.
Apart from that, it's probably more a question of algorithm
Like I said, it's a fairly complex program that uses several different modules. Trying to explain the algorithm is pretty complicated. My intent was to get some clues and tips on how to run the program more efficiently, which I did, and will hopefully get some more.
And how often do you have to come up with "normalized keys" for a million entries?
Once per customer, since we're doing a version upgrade and the "normalized keys" are a new feature. We have quite a few customers, so it's fairly important.
Each time I create a new "key" for a value is a run (perhaps I should have used a different word). Therefore I am doing over a million runs.
It's not clear whether you answered my question. How many times does the shell load and execute your multi-module perl script? Once, or over a million times?
If the latter, then I would strongly suggest that you refactor things so that you can generate a large quantity of keys in a single shell-command-line run -- running millions of processes in sequence (each one a presumably bulky script with a database connection and 5 queries) is bound to be costing you a lot of time. By doing a large quantity of keys in a single process, you might save a lot on DB connection, statement preparation, etc., in addition to OS process-management overhead.
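Roughly, the refactored shape might look like this; the DSN, query, read_input() and make_normalized_key() are all placeholders for whatever your script actually does:

    use strict;
    use warnings;
    use DBI;

    # Pay the connection and statement-preparation cost once,
    # not once per key:
    my $dbh = DBI->connect('dbi:mysql:dbname=mydb', 'user', 'pass',
                           { RaiseError => 1 });
    my $sth = $dbh->prepare('SELECT data FROM entries WHERE id = ?');

    while (defined(my $id = read_input())) {      # hypothetical input source
        $sth->execute($id);
        my ($data) = $sth->fetchrow_array;
        print make_normalized_key($data), "\n";   # hypothetical key routine
    }
    $dbh->disconnect;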
And/or maybe you can speed things up a bit by running multiple instances in parallel? (But then you have to make sure they don't interfere with each other, or overlap, causing redundant runs.)
Once per customer, since we're doing a version upgrade, and the "normalized keys" is a new feature. We have quite a few customers...
And I suppose each customer has their own particular data requiring their own distinct set of normalized keys? Obviously, any amount of commonality across customers should be exploited (keep results from one to use on another, if at all possible).
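Even a simple in-memory cache helps within a single run; a sketch, where compute_normalized_key() stands in for the real routine:

    my %key_for;    # input value => normalized key, computed at most once

    sub cached_key {
        my ($value) = @_;
        $key_for{$value} = compute_normalized_key($value)
            unless exists $key_for{$value};
        return $key_for{$value};
    }

To carry results over from one customer's run to the next, the hash could be saved and reloaded with Storable's store and retrieve.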
Re: Making my Perl program a bit faster
by salva (Canon) on Jul 08, 2009 at 23:20 UTC
Generic questions just get generic answers! If you want to get useful help, post your code or at least a description of the process you follow to normalize the keys.
Without context it is difficult to say, but 40 operations/second doesn't look very impressive unless you are performing quite complex operations.
Profilers (such as Devel::NYTProf) can be very helpful for finding hot spots in your program, but they tend to make you focus on small scopes that will only give you relatively small speed increases (typically < 30%).
If you want to improve the speed of your program by orders of magnitude your first action should be to examine the algorithms you are using and to try to replace them with better ones.
Re: Making my Perl program a bit faster
by JavaFan (Canon) on Jul 08, 2009 at 12:55 UTC
What parts of Perl are known to be a bit slower or less efficient than others? And what parts of Perl are super fast and should be used more?
What a silly question to ask. One doesn't use parts of Perl because they are fast; one uses the parts of Perl that solve one's problem. It's like taking the train: the most important factor in deciding which train to take isn't the speed of the train, but whether it brings me to where I want to be.
BTW, Perl doesn't have things that are "not fast" for the sake of being "not fast". They may be "not fast" because they do a lot of stuff. It would be silly to avoid them if the stuff they do is what you want to be done.
...um...er... not quite, or at least, not always:
    s/(the most important factor in deciding which train to take isn't the speed of the train, but whether it brings me to where I want to be)/\1 when I want to get there./
Sometimes, the [bicycle|car|plane] is the right choice (cf. moritz's advice re letting the database do some of the work).
Update: AnomalousMonk (thanks!) notes that the replacement should be s/(...)/$1 when I want to get there./ rather than s/(...)/\1 when I want to get there./ : the capture variable is preferable to a backreference in string interpolation (the backreference generates a warning).
Hi JavaFan,
First of all, if one's workplace uses Perl, then one would use Perl all the time, and indeed try to program it as efficiently as possible.
Also, I am not a huge expert in this (the reason why I am asking this question), but like any programming language, there are things Perl is very good at (e.g. regexes and parsing files) and not so good at (hardcore mathematical computing if I'm not mistaken).
The purpose of my question was to get a bit more info about this issue, and to gather a few tips (e.g. use a profiler).
I think the issue of Perl and efficiency is an interesting one (in fact it might be a good idea for my next Perl Mongers lecture after I understand it a bit more), and not as silly as you might think.
Cheers
mrguy123