in reply to Making my Perl program a bit faster

You said:
I ... am calling the DB about 5 times per run...

Does "one run" mean one command-line execution of the script? How many keys do you generate in one run?

If you are doing lots of runs, part of the slowness will be at the level of the shell, having to build up and tear down a process for each run. Ideally, one run (lasting up to 8 hours or whatever) should minimize this sort of processing overhead.

Apart from that, it's probably more a question of algorithm, and you haven't given us any clues on this. How complicated is the procedure to come up with "normalized keys"? How big does the script really need to be to do this?

And how often do you have to come up with "normalized keys" for a million entries? (Does this need to be done repeatedly? If not, just run it for 8 hours and be done with it -- why worry about timing?)

Re^2: Making my Perl program a bit faster
by mrguy123 (Hermit) on Jul 08, 2009 at 20:07 UTC
    Does "one run" mean one command-line execution of the script?

    Each time I create a new "key" for a value counts as one run (perhaps I should have used a different word), so I am doing over a million runs.

    Apart from that, it's probably more a question of algorithm

    Like I said, it's a fairly complex program that uses several different modules, so explaining the algorithm here would be pretty complicated. My intent was to get some clues and tips on how to run the program more efficiently, which I did, and I hope to get some more.

    And how often do you have to come up with "normalized keys" for a million entries?

    Once per customer, since we're doing a version upgrade and the "normalized keys" are a new feature. We have quite a few customers, so it's fairly important.
      Each time I create a new "key" for a value counts as one run (perhaps I should have used a different word), so I am doing over a million runs.

      It's not clear whether you answered my question. How many times does the shell load and execute your multi-module perl script? Once, or over a million times?

      If the latter, then I would strongly suggest that you refactor things so that you can generate a large quantity of keys in a single shell-command-line run -- running millions of processes in sequence (each one a presumably bulky script with a database connection and 5 queries) is bound to be costing you a lot of time. By doing a large quantity of keys in a single process, you might save a lot on DB connection, statement preparation, etc, in addition to OS process-management overhead.
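
      Roughly the shape I have in mind (just a sketch -- the DSN, table and column names, and normalize_key() are placeholders, since we haven't seen your code): connect and prepare once, then loop over all the entries inside the same process.

          #!/usr/bin/perl
          use strict;
          use warnings;
          use DBI;

          # Connect once for the whole batch instead of once per key.
          my $dbh = DBI->connect( 'dbi:Oracle:somedb', 'user', 'pass',
                                  { RaiseError => 1, AutoCommit => 0 } );

          # Prepare the statements once; each key then only pays for execute(),
          # not for re-parsing the SQL.
          my $fetch  = $dbh->prepare('SELECT id, raw_value FROM entries');
          my $update = $dbh->prepare('UPDATE entries SET norm_key = ? WHERE id = ?');

          $fetch->execute;
          while ( my ( $id, $raw ) = $fetch->fetchrow_array ) {
              my $key = normalize_key($raw);    # whatever your modules do now
              $update->execute( $key, $id );
          }

          $dbh->commit;
          $dbh->disconnect;

          sub normalize_key { my ($raw) = @_; return lc $raw }    # stand-in only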

      And/or maybe you can speed things up a bit by running multiple instances in parallel? (But then you have to make sure they don't interfere with each other, or overlap, causing redundant runs.)
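
      For instance (again only a sketch, and Parallel::ForkManager is just one of several ways to do it), you could split the work into disjoint slices by taking the entry id modulo the number of workers, so no two processes ever touch the same row:

          use strict;
          use warnings;
          use Parallel::ForkManager;

          my $workers = 4;                            # tune to your CPUs and DB
          my $pm      = Parallel::ForkManager->new($workers);

          for my $slot ( 0 .. $workers - 1 ) {
              $pm->start and next;    # parent keeps looping; child continues below

              # Each child opens its own DB connection -- DBI handles should
              # not be shared across a fork.  work_on_slice() is a stand-in for
              # "normalize every entry whose id % $workers == $slot".
              work_on_slice( $slot, $workers );

              $pm->finish;
          }
          $pm->wait_all_children;

          sub work_on_slice { }    # stub for illustration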

      Once per customer, since we're doing a version upgrade and the "normalized keys" are a new feature. We have quite a few customers...

      And I suppose each customer has their own particular data requiring their own distinct set of normalized keys? Obviously, any amount of commonality across customers should be exploited (keep results from one to use on another, if at all possible).
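
      Even a simple hash cache can help if the same raw values turn up for more than one customer -- a sketch, where normalize_key() stands in for your real routine and %seen could be persisted between runs with Storable:

          my %seen;    # raw value => already-computed normalized key

          sub key_for {
              my ($value) = @_;
              $seen{$value} = normalize_key($value)
                  unless exists $seen{$value};    # compute only the first time
              return $seen{$value};
          }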

      Does that mean you start a Perl program for each key?

      If yes: don't. Do them all in one program; that way you avoid the cost of starting up the interpreter a few million times.

        No. I call the "new" function once, and then call another function (e.g. create keys) each time I create a new key.