in reply to Win32 limit to number of calls to system()?

The underlying cause is Perl's internal use of the Win32 API call WaitForMultipleObjects(), which can wait on at most 64 objects at any given time.

From what I remember, the limitation is 64 concurrent forks. Once one completes, you can initiate another.

I forget the details, but if you show the basic layout of your code, I can probably recall what you need to do to work around the limit.


Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

Re^2: Win32 limit to number of calls to system()?
by Limbic~Region (Chancellor) on Sep 13, 2011 at 03:20 UTC
    BrowserUk,
    The code looks something like this:
    for my $pdf (glob('*.pdf')) {
        my $txt = convert_pdf_to_text($pdf);
        next if ! interesting($txt);
        # ...
    }

    sub convert_pdf_to_text {
        my ($file) = @_;
        my $abs_file = rel2abs(catfile(curdir(), $file));

        ## Start Simpo PDF To Text
        system(1, 'C:\Program Files\Simpo PDF to Text\PDF2Text.exe');

        # Locate the window
        my $wid = WaitWindow('Simpo PDF to Text', 5);
        die "Couldn't find 'Simpo PDF to Text' window" if ! defined $wid;

        ## Make sure it is on top
        SetForegroundWindow($wid);

        # Convert PDFs
        add_pdf($abs_file);
        convert();

        # Close the application
        SendKeys('%{F4}');

        my $txt_file = construct_txt_file($file);
        return '' if ! -r $txt_file;
        my $data = read_file($txt_file);
        unlink $txt_file or die $!;
        return $data;
    }
    As you can see, I use system(1, $app) to start the application and Alt+F4 to close it before starting the next one. I have already worked around the problem by leaving the app open. I am just not sure why this doesn't work as I would expect.

    Cheers - L~R

      You don't reap the child processes. Use waitpid($pid, 0), where $pid is the value returned by system(1, ...).

      Either of these will work:

      1. Use a synchronous system and the start command. This will synchronously run a copy of cmd.exe to run the start command, and it starts the program asynchronously.

        As cmd.exe returns immediately and the synchronous system gathers its exit code, it avoids the accumulation of zombies and the WaitForMultipleObjects() problem:

        for ( 1 .. 100 ) {
            print "spawning job $_";
            system 'start \\Windows\\system32\\notepad.exe';
            my $wid = WaitWindow( 'Notepad', 1 );
            SetForegroundWindow( $wid );
            SendKeys( '%{F4}' );
        }
      2. Use the asynchronous system and obtain the pid of the started instance from the returned value.

        Use waitpid to gather the exit code thus avoiding the accumulation of the zombies:

        for ( 1 .. 100 ) {
            print "spawning job $_";
            my $pid = system 1, '/Windows/system32/notepad.exe';
            my $wid = WaitWindow( 'Notepad', 1 );
            SetForegroundWindow( $wid );
            SendKeys( '%{F4}' );
            waitpid $pid, 1;
        }
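
      For illustration only, here is a minimal sketch of the second option with the GUI calls taken out, so it can be run anywhere. The helper names spawn_async and run_jobs are made up for this example, the fork/exec fallback is only there so the sketch also runs off Windows, and it assumes a blocking wait (flags of 0) is acceptable, i.e. that the child is certain to exit. It starts 70 short-lived children, more than the 64-object limit, and reaps each one before starting the next:

```perl
use strict;
use warnings;

# Hypothetical helper: start a program without waiting and return its pid.
# system(1, LIST) is the Win32-specific "spawn and return immediately" form
# described in perlport; on other platforms this sketch uses fork/exec.
sub spawn_async {
    my @cmd = @_;
    if ( $^O eq 'MSWin32' ) {
        return system 1, @cmd;
    }
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    if ( $pid == 0 ) {
        exec @cmd or die "exec failed: $!";
    }
    return $pid;
}

# Hypothetical helper: run $count jobs one after another, reaping each
# child before the next is started, so unreaped children never pile up.
sub run_jobs {
    my ( $count, @cmd ) = @_;
    my $reaped = 0;
    for my $job ( 1 .. $count ) {
        my $pid = spawn_async(@cmd);

        # ... drive the application's window here ...

        # Block until this child has exited and collect its status.
        my $got = waitpid $pid, 0;
        $reaped++ if $got == $pid;
    }
    return $reaped;
}

# $^X is the running perl; each child just exits immediately.
print run_jobs( 70, $^X, '-e', '1' ), " jobs reaped\n";
```

      If the application might not close (say the Alt+F4 goes to the wrong window), a blocking wait would hang, so a polling loop with a timeout may be the safer choice there.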

      Finally, 'Simpo PDF to Text' has a 'batch mode' which would be possible -- if awkward -- to drive programmatically, and it might be substantially more efficient.

      I guess you've already looked at command line driven alternatives?


        BrowserUk,
        Finally, 'Simpo PDF to Text' has a 'batch mode' which would be possible

        That is one of the reasons I purchased it. Unfortunately, it is 'batch mode' through the UI and not the command line. For my purposes, I need to make a decision after each conversion before moving on to the next file, which makes the batch mode hard to use.

        I guess you've already looked at command line driven alternatives?

        Nothing I looked at did as good a job as Simpo's tool on the particular PDFs I was using. I have used a multitude of OCR tools in the past but I need to keep structure as well as text and it was the best I found for this particular data set.

        Cheers - L~R