http://qs1969.pair.com?node_id=925488

Limbic~Region has asked for the wisdom of the Perl Monks concerning the following question:

All,
I am working on a project on Windows XP 32 bit using ActiveState Perl 5.12. The project involves converting PDF files to text using an external application that doesn't have a command line variant. For this reason, I am using Win32::GuiTest++.

The code screams along until it has converted 64 PDFs then fails. It fails rather silently (just doesn't open the application). It took me quite a while to discover that it was failing after the same number of PDFs each time but once I did, I began to wonder - why the limit? I am closing the 3rd party conversion application using alt+f4 if that matters.

I intend to work around the limit by not closing the application. I didn't do this originally because the application provides no menu or shortcut keys and needs to be automated by moving the mouse (absolute pixel positions). I was just wondering - is this limit documented somewhere? Also, is there a way to work around it?

Cheers - L~R

  • Comment on Win32 limit to number of calls to system()?

Re: Win32 limit to number of calls to system()?
by BrowserUk (Patriarch) on Sep 12, 2011 at 15:49 UTC

    The underlying cause is a Perl internal use of a system call WaitForMultipleObjects() which is limited to waiting on 64 objects at any given time.

    From what I remember, the limitation is 64 concurrent forks. Once one completes, you can initiate another.

    I forget all details, but if you'd show the basic layout of your code, I'd probably remember what you need to do to alleviate the limit.


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
      BrowserUk,
      The code looks something like this:
          for my $pdf (glob('*.pdf')) {
              my $txt = convert_pdf_to_text($pdf);
              next if ! interesting($txt);
              # ...
          }

          sub convert_pdf_to_text {
              my ($file) = @_;
              my $abs_file = rel2abs(catfile(curdir(), $file));

              ## Start Simpo PDF To Text
              system(1, 'C:\Program Files\Simpo PDF to Text\PDF2Text.exe');

              # Locate the window
              my $wid = WaitWindow('Simpo PDF to Text', 5);
              die "Couldn't find 'Simpo PDF to Text' window" if ! defined $wid;

              ## Make sure it is on top
              SetForegroundWindow($wid);

              # Convert PDFs
              add_pdf($abs_file);
              convert();

              # Close the application
              SendKeys('%{F4}');

              my $txt_file = construct_txt_file($file);
              return '' if ! -r $txt_file;
              my $data = read_file($txt_file);
              unlink $txt_file or die $!;
              return $data;
          }
      As you can see, I use system(1, $app) to start the application and alt+f4 to close the application (before starting the new one). I have already worked around the problem by leaving the app open. Just not sure why this doesn't work as I would expect.

      Cheers - L~R

        You don't reap the child processes. Call waitpid($pid, 0), where $pid is the value returned by system(1, ...).

        Either of these will work:

        1. Use a synchronous system and the start command. This will synchronously run a copy of cmd.exe to run the start command, and it starts the program asynchronously.

          As cmd.exe returns immediately and the synchronous system gathers its exit code, it avoids the accumulation of zombies and the WaitForMultipleObjects() problem:

              for ( 1 .. 100 ) {
                  print "spawning job $_";
                  system 'start \\Windows\\system32\\notepad.exe';
                  my $wid = WaitWindow( 'Notepad', 1 );
                  SetForegroundWindow( $wid );
                  SendKeys( '%{F4}' );
              }

        2. Use the asynchronous system and obtain the pid of the started instance from the returned value.

          Use waitpid to gather the exit code thus avoiding the accumulation of the zombies:

              for ( 1 .. 100 ) {
                  print "spawning job $_";
                  my $pid = system 1, '/Windows/system32/notepad.exe';
                  my $wid = WaitWindow( 'Notepad', 1 );
                  SetForegroundWindow( $wid );
                  SendKeys( '%{F4}' );
                  waitpid $pid, 1;
              }
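
The reaping pattern from option 2 can be tried without any GUI automation. The following is only a minimal sketch under stated assumptions: `spawn_async` is a hypothetical helper that is not part of the thread, the child is just the running perl (`$^X`) exiting immediately, and on non-Windows systems a plain fork/exec stands in for `system(1, ...)`. It uses a blocking `waitpid($pid, 0)`, so each child is collected before the next one starts.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): start a command without
# waiting for it and return the child's pid.
sub spawn_async {
    my @cmd = @_;
    if ($^O eq 'MSWin32') {
        # Win32-only asynchronous form of system(), as used in the thread.
        return system(1, @cmd);
    }
    # Elsewhere, plain fork/exec stands in for system(1, ...).
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    if ($pid == 0) {
        exec { $cmd[0] } @cmd;
        die "exec failed: $!";
    }
    return $pid;
}

my $jobs   = 70;    # deliberately more than 64
my $reaped = 0;
for my $job (1 .. $jobs) {
    # $^X is the running perl; the child exits immediately.
    my $pid = spawn_async($^X, '-e', 'exit 0');
    # Blocking waitpid collects the exit status, so finished children
    # never accumulate.
    $reaped++ if waitpid($pid, 0) == $pid;
}
print "reaped $reaped of $jobs children\n";
```

Because every pid is waited on inside the loop, the number of unreaped children never exceeds one, whatever the total number of jobs.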

        Finally, 'Simpo PDF to Text' has a 'batch mode' which would be possible -- if awkward -- to drive programmatically, and it might be substantially more efficient.

        I guess you've already looked at command line driven alternatives?


Re: Win32 limit to number of calls to system()?
by chrestomanci (Priest) on Sep 12, 2011 at 15:48 UTC

    It appears there is a limit on the number of children perl can have under windows:

    http://code.activestate.com/lists/perl-win32-users/12064

    However that thread says the limit only applies to concurrent threads that perl can wait on. Perhaps your code is creating child processes to convert each PDF, and then not waiting on each process, but is leaving zombies.

Re: Win32 limit to number of calls to system()?
by Anonymous Monk on Sep 12, 2011 at 15:26 UTC
      Anonymous Monk,
      That certainly seems plausible but it would mean shame on Windows for not actually releasing resources when the program is shut down (unless using alt-f4 is a bad way to close the program). Perhaps before I write all the code to keep the app open, I will try closing it through the X button.

      Cheers - L~R

Re: Win32 limit to number of calls to system()?
by cdarke (Prior) on Sep 12, 2011 at 16:15 UTC
    Since it appears to be a per-process limit, the way I have got over the limit in the past is to split the workload over a number of other processes. For example, with 128 jobs to run, spawn 4 programs which just run 32 jobs each.

    Not elegant, I know.
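
The splitting idea above might look roughly like the sketch below. It is an illustration only: `chunk` is a hypothetical helper, the job names are made up, and the "worker" is a stand-in perl one-liner that merely checks how many jobs it was handed rather than converting anything. The workers run one after another here for simplicity; the point is that each worker is its own process with its own per-process limit.

```perl
use strict;
use warnings;

# Split a list into batches of at most $size items.
sub chunk {
    my ($size, @items) = @_;
    my @batches;
    push @batches, [ splice @items, 0, $size ] while @items;
    return @batches;
}

my @jobs    = map { "job$_.pdf" } 1 .. 128;   # hypothetical file names
my @batches = chunk(32, @jobs);

my $ok = 0;
for my $batch (@batches) {
    # Stand-in worker: a separate perl process that only checks how many
    # jobs it was handed. A real worker would run the conversions.
    my $status = system($^X, '-e', 'exit(@ARGV==32?0:1)', @$batch);
    $ok++ if $status == 0;
}
print scalar(@batches), " batches, $ok workers succeeded\n";
```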