in reply to Perl concurrent access

A bit more explanation... The script pulls a number of rows from the database and loops through them in a while loop. Within this loop it uses the data to add information to a PDF page, created using PDF::API2, and at the end it closes the database connection and saves the PDF. The time-consuming bit is actually creating the PDF. My thoughts at the moment, following on from suggestions here, are that the script is hogging the database connection, not allowing another instance of the script to access the database until the connection is closed by the first script. Secondly, perhaps the way PDF::API2 works requires some other resource that is being locked by the system, so each script has to wait. Or is the PDF creation simply so CPU-intensive that it has no choice but to take more time?
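
For context, a minimal sketch of the flow described above, assuming a simple one-column query (the DSN, query, field names, and layout values are placeholders, not the actual code):

    use strict;
    use warnings;
    use DBI;
    use PDF::API2;

    my $dbh = DBI->connect('dbi:mysql:dbname', 'user', 'pass',
                           { RaiseError => 1 });
    my $sth = $dbh->prepare('SELECT line_text FROM report_rows');
    $sth->execute;

    my $pdf  = PDF::API2->new;
    my $page = $pdf->page;
    my $text = $page->text;
    $text->font($pdf->corefont('Helvetica'), 10);

    my $y = 750;
    while (my ($line) = $sth->fetchrow_array) {
        $text->translate(50, $y);   # position the cursor for this row
        $text->text($line);         # draw the row's data
        $y -= 12;
    }

    $dbh->disconnect;               # release the connection promptly
    $pdf->saveas('output.pdf');     # save the finished document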

Re^2: Perl concurrent access
by leocharre (Priest) on Sep 10, 2009 at 14:58 UTC
    My friend, working with PDF docs is time-consuming. There's no way around this. Get more hardware or make peace with it.

    Here.. try this.. As was suggested, you can check the machine and see what's up with # ps, so..

    Open a CLI and use the # top command. This shows you all the PIDs on the machine. Now run your program... and you'll see it running.. and you'll see what the memory consumption is. And you'll see your CPU at 3%, and then as your little script is called the CPU will jump to 80% and then 95%..
    If you have a dual core, it will jump to 49%, and then as a second script instance comes in, you'll see 99% or so again...

      Many thanks for this reply; that stops me wasting time optimising code, etc.! I may try to work out a 'ticketing' system, so when people want to create a PDF they can 'take a ticket' and wait in line for a slot. That way I can limit the number of PDF-creation scripts in memory at any one time to two or three. I can then throw some more power at it, by upgrading the server from dual core to 8 or 16 cores, and throwing more RAM at it.. then hopefully, after my wallet recovers.. it should be OK!
        Now, I'm not sure about the specifics of your situation.
        But, if you are generating some kind of report.. that is.. maybe one person requests PDF output once a day or so.. you could do things like..

        Autogenerate the PDF output once a day (this is imagining you already know what the output is going to/should be).

        Maybe instead of letting them download it.. when they request the PDF, it can be sent to them via email instead. You would have to keep track of it, so that a user making a request does not already have a request in progress.
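
        A hypothetical sketch of that 'one request per user' guard (the DSN, table, and column names are placeholders, not a real schema):

            use strict;
            use warnings;
            use DBI;

            my $dbh = DBI->connect('dbi:mysql:dbname', 'user', 'pass',
                                   { RaiseError => 1 });

            # Allow a new ticket only if this user has nothing pending.
            sub ticket_allowed {
                my ($user_id) = @_;
                my ($pending) = $dbh->selectrow_array(
                    'SELECT COUNT(*) FROM pdf_tickets WHERE user_id = ?',
                    undef, $user_id);
                return !$pending;
            }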

        Another thing I would look into.. (possibly.. again.. if it's worth the human time (yours)!) is named pipes (FIFOs), maybe in conjunction with daemonizing a process (in essence, the request is put in a queue).
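
        A rough sketch of the FIFO idea (the queue path and worker sub are assumptions):

            use strict;
            use warnings;
            use POSIX qw(mkfifo);

            my $fifo = '/tmp/pdf_requests';   # hypothetical queue path
            mkfifo($fifo, 0700) or die "mkfifo failed: $!" unless -p $fifo;

            # Daemonized reader: open() blocks until a writer appears, and
            # the read loop sees EOF once all writers close, so we reopen
            # and wait for the next batch of requests.
            while (1) {
                open my $fh, '<', $fifo or die "Cannot open $fifo: $!";
                while (my $request = <$fh>) {
                    chomp $request;
                    # build_pdf($request);   # hypothetical worker sub
                }
                close $fh;
            }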

        Unfortunately pre-generating the PDF is not an option. The whole system is essentially a web-to-print solution: users log in to the system, build how they want the PDF to look using an HTML/JS interface, and when they are happy with how it looks, they hit 'build'. At the moment, with only a few users, it is relatively uncommon for more than 2 requests to happen at any one time, but due to new clients I am having to make the system ready for a much larger scale. Depending on how much the system needs to be scaled, I am thinking of a system whereby when a user wants to build a PDF, it puts an entry into a database; this will be their 'ticket'. I will then run the script as a daemon, which will check for an entry in the database, process it, create the PDF, then remove the entry and wait until another 'ticket' is collected.
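
        A sketch of that daemon loop, assuming a pdf_tickets table (the table, columns, and DSN are placeholders):

            use strict;
            use warnings;
            use DBI;

            my $dbh = DBI->connect('dbi:mysql:dbname', 'user', 'pass',
                                   { RaiseError => 1, AutoCommit => 1 });

            while (1) {
                # Oldest ticket first.
                my $ticket = $dbh->selectrow_hashref(
                    'SELECT id, user_id FROM pdf_tickets ORDER BY id LIMIT 1');
                if ($ticket) {
                    # build_pdf_for($ticket->{user_id});   # hypothetical worker
                    $dbh->do('DELETE FROM pdf_tickets WHERE id = ?',
                             undef, $ticket->{id});
                }
                else {
                    sleep 5;   # nothing queued; poll again shortly
                }
            }
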
        Have you tried profiling the script? See Devel::DProf
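
        If it helps, Devel::DProf is typically run from the command line, with dprofpp summarizing the results (the script name here is just a placeholder):

            perl -d:DProf yourscript.pl   # writes profile data to ./tmon.out
            dprofpp tmon.out              # shows where the time is going

        That should confirm whether the time really is spent inside the PDF::API2 calls.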

        I may try and work out a 'ticketing' system
        System V semaphores are useful for that. See IPC::SysV (although there are other ways to do it).
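
        A minimal sketch with IPC::Semaphore, limiting concurrent builds to two (the key is an arbitrary value all instances must agree on):

            use strict;
            use warnings;
            use IPC::SysV qw(IPC_CREAT IPC_EXCL S_IRUSR S_IWUSR);
            use IPC::Semaphore;

            my $key = 0xC0FFEE;   # hypothetical agreed-upon key

            # Try to be the creator; whoever succeeds initializes the
            # counter. (There is a small race between create and setval;
            # good enough for a sketch.)
            my $sem = IPC::Semaphore->new($key, 1,
                          S_IRUSR | S_IWUSR | IPC_CREAT | IPC_EXCL);
            if ($sem) {
                $sem->setval(0, 2);   # two concurrent PDF builds allowed
            }
            else {
                $sem = IPC::Semaphore->new($key, 1, S_IRUSR | S_IWUSR)
                    or die "Cannot attach to semaphore: $!";
            }

            $sem->op(0, -1, 0);   # take a slot (blocks until one is free)
            # ... build the PDF here ...
            $sem->op(0, 1, 0);    # give the slot back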