in reply to 3 weeks wasted? - will threads help?

Looking at your code, you appear to be processing one, the other, or both of the directories 'in/*' & 'out/*' relative to the path "/var/spool/wt400/gateways/" . $ARGV[0].

Presumably each of your 20 copies of the script is processing a different subtree of /var/spool/wt400/gateways/?

In which case, you could do your initial chdir to /var/spool/wt400/gateways/ and do your globbing as <*/in/*> etc., and process all the files from the 20 subdirs in one loop. I notice that you have a sleep 3 in your main loop, which probably means that you're not utilizing much of the CPU as it stands, so you should easily have enough processor to cope with the 20 dirs in the main loop. You might need to change that sleep to

sleep 3 - $time_spent_last_pass;
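A minimal sketch of that adjusted sleep, assuming Time::HiRes is available (the processing step is a placeholder):

```perl
use strict;
use warnings;
use Time::HiRes qw(time sleep);

# Each pass measures how long the work took and sleeps only for the
# remainder of the 3-second budget, so passes start on schedule.
my $interval = 3;
my $start = time;
# ... glob and process the files here (omitted) ...
my $elapsed   = time - $start;
my $remaining = $interval - $elapsed;
sleep $remaining if $remaining > 0;
```

In the real script this body would sit inside the existing while loop in place of the fixed sleep 3.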

I realise that the traplist file is different for each subtree, but the <*/in/*> form of the glob returns the filenames in the form subdir/in/file, so you can split on the /'s, extract the subdir, and use it as the first key in your %traps hash to select the appropriate set of trap information for the file.
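A rough sketch of that single-loop approach; the %traps contents and the gateway names (gw1, gw2) are hypothetical stand-ins for whatever the traplist files actually contain:

```perl
use strict;
use warnings;

# %traps keyed by subdir: an assumption based on the suggestion above.
my %traps = (
    gw1 => { some_trap => 1 },   # hypothetical per-gateway trap data
    gw2 => { some_trap => 2 },
);

# One glob across all gateway subtrees; each path comes back as
# "subdir/in/file", so splitting on '/' recovers the subdir.
for my $path (glob '*/in/*') {
    my ($subdir, $dir, $file) = split m{/}, $path;
    my $trapset = $traps{$subdir};   # per-subtree trap selection
    # ... process "$subdir/$dir/$file" using $trapset ...
}
```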

It means reworking your code somewhat, but probably less work than moving to either use threads; or fork.


Examine what is said, not who speaks.

The 7th Rule of perl club is -- pearl clubs are easily damaged. Use a diamond club instead.

Replies are listed 'Best First'.
Re: Re: 3 weeks wasted? - will threads help?
by Limbic~Region (Chancellor) on Jan 27, 2003 at 23:42 UTC
    It's just not possible, BrowserUk.

    The transient files are created with a naming syntax such that the glob automatically lists them oldest to newest. It can take up to two seconds to process a single directory, but once I am done, I have a fair amount of assurance (I will change the sleep statement if it is too much) that I can wait 3 seconds before I parse the directory again. I can't wait over half a minute, which could happen if I parse all the directories at one time.

    I really need to parse each directory as if it were the only one.

    Cheers - L~R

      Why not glob, then fork and process the globbed data in the child while sleeping 2 seconds in the parent and looping all over again? Also, to answer your question about forking above, check out your system's man page for fork(2). I am pretty sure HP-UX has used copy-on-write (it only copies the page stack and changed memory locations) since 10.x.

      -Waswas
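A hedged sketch of this glob-then-fork pattern; the processing step is a placeholder, and error handling is minimal:

```perl
use strict;
use warnings;

# Parent globs the batch, then forks. Copy-on-write means the child
# shares the parent's pages until it writes to them; the child exits
# after processing, returning its memory to the system.
my @files = glob '*/in/*';
my $pid = fork;
die "fork failed: $!" unless defined $pid;
if ($pid == 0) {
    # child: process the globbed batch, then exit
    # process_file($_) for @files;
    exit 0;
}
sleep 2;             # parent: wait before the next pass
waitpid $pid, 0;     # reap the child before looping again
```

In real use this would be wrapped in the script's outer while loop, one fork per pass.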
        BrowserUK's solution was not to fork each globbed directory, but to create one large glob of all the directories. This is what I am claiming is not feasible.

        I admit that I thought the enormous SZ from ps in each child process I forked came from its own instance of the Perl interpreter, but I never claimed that HP-UX didn't use copy-on-write.

        My problem is that I have no way of profiling it: how can I tell how much memory is really being used and how much is shared?

        I have thought of a few more ways to optimize speed and memory allocation of the original code, but that won't get rid of the overhead I mentioned by my example of just a simple script.

        #!/usr/bin/perl -w
        use strict;

        while (1) {
            print "I am only printing and sleeping\n";
            sleep 1;
        }

        That tiny program shows up in ps with a sz comparable to my full blown script.

        If I can't tell how much of that is shared when fork'ing another process, I have no idea if the project is viable or if it should be scrapped.
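As an illustrative aside (not applicable to HP-UX 10.x, which lacks this interface): on systems with a Linux-style /proc, the resident vs. shared split is visible in /proc/&lt;pid&gt;/statm, in pages. This sketch only shows what "shared" means here:

```perl
use strict;
use warnings;

# /proc/<pid>/statm fields (Linux): total resident shared ...
# Falls back gracefully where /proc/statm does not exist.
if (open my $fh, '<', "/proc/$$/statm") {
    my ($total, $resident, $shared) = split ' ', scalar <$fh>;
    print "resident=$resident pages, shared=$shared pages\n";
    close $fh;
}
else {
    print "no /proc statm on this system\n";
}
```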

        Now, your proposal is a tad different from the others: your forked children die, returning all memory to the system, and are spawned anew each iteration, which means the memory is MORE available (during the sleep) to the system. And since all the variables will be pretty stagnant once the child is forked, it won't start getting dirty before it's dead. This is food for thought.

        Thanks and cheers - L~R