in reply to Windows-specific: Limiting output of created files

I don't know of any Windows equivalent of ulimit -f (i.e. per-process file size limits).

Your limiting tee idea seems the simplest to implement, but be aware that unless the writing process checks for failed writes, it may not notice that the pipe has gone away.

E.g. when the second instance of perl below terminates, the first instance blithely continues to write to the pipe. It doesn't cause any memory growth, but it does use a substantial amount of CPU until it decides to stop writing of its own accord.

perl -E"say $_ for 1..100e6" | perl -E"while(<>){ $n += length; last if $n > 1024**2; warn qq[$n\n];}"

If the first instance were checking for the success of its writes, it would notice the pipe going away:

perl -E"say or die $^E for 1 .. 100e6" | perl -E"while(<>){ $n += length; last if $n > 1e5; warn qq[$n\n]; }"
...
99990
99996
The pipe is being closed at -e line 1.

But if you can't change the producer process, that's probably not useful info :(


Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
"Too many [] have been sedated by an oppressive environment of political correctness and risk aversion."

Re^2: Windows-specific: Limiting output of created files
by Anonymous Monk on Apr 03, 2009 at 17:01 UTC
    The tee could attempt to kill the sender after the limit is reached, or simply refuse to copy to output. This is antisocial behavior, but it might cause less mayhem if Windows tries "naive" error handling routines when you close the incoming handle. (OK, maybe I should say "brain-dead".)
      The tee could attempt to kill the sender after the limit is reached,

      The problem then is how tee finds out the pid of the producer process?

      or simply refuse to copy to output.

      It could, but then both processes might never end. Even if the producer process will eventually reach a natural end, it will take far longer, because of all the buffering and handshaking involved in the pipe.

      if Windows tries "naive" error handling routines when you close the incoming handle.

      If the producer application is just ignoring write failures, then there doesn't appear to be any "naive error handling" involved. Indeed, the CPU usage climbs after the pipe is severed because the producer application starts running much more quickly. With no buffering and handshaking involved, and in the absence of any other limiting factors, its write loop just runs much faster than it would if the writes were succeeding.

      The problem is it might never finish.

      The nice thing about the piped-open overseer process is that it covers pretty much every eventuality, save the possibility that the producer process stops writing, but neither closes the pipe nor terminates, before it reaches the limit specified in the overseer. Of course, that can also be handled using a thread or can_read.
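
For what it's worth, here is a rough sketch of what such a piped-open overseer might look like. The names copy_limited and run_limited are made up for illustration, and the sketch assumes that the pid returned by the piped open is the producer itself (with a string command, Perl may start a shell in between, in which case the kill hits the shell rather than the producer). It does not handle the "producer goes quiet" case.

```perl
use strict;
use warnings;

# Hypothetical helper: copy from $in to $out until $limit bytes have been
# written or the input ends. Returns the number of bytes copied.
sub copy_limited {
    my ($in, $out, $limit) = @_;
    my $total = 0;
    while ($total < $limit) {
        my $want = $limit - $total;
        $want = 65536 if $want > 65536;        # read in modest chunks
        my $got = read($in, my $buf, $want);
        die "read failed: $!" unless defined $got;
        last if $got == 0;                     # producer closed its end
        print {$out} $buf or die "write failed: $!";
        $total += $got;
    }
    return $total;
}

# Hypothetical overseer: run $cmd through a piped open, cap what lands in
# $file at $limit bytes, and kill the producer once the cap is reached.
sub run_limited {
    my ($cmd, $file, $limit) = @_;
    my $pid = open(my $in, '-|', $cmd) or die "cannot start '$cmd': $!";
    binmode $in;
    open(my $out, '>:raw', $file) or die "cannot open '$file': $!";
    my $copied = copy_limited($in, $out, $limit);
    close $out or die "close failed: $!";
    # Assumption: $pid is the producer, not an intermediate shell.
    kill 'KILL', $pid if $copied >= $limit;
    close $in;                                 # reaps the child; status in $?
    return $copied;
}
```

Something like run_limited('monitored_process.exe', 'out.txt', 1_000_000) would then be the whole of the overseer; because it started the producer, it never has to go looking for the pid.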



        The problem then is how tee finds out the pid of the producer process?

        The monitor could use a syntax similar to strace.

        limit_output 1_000_000 monitored_process.exe

        Having launched the monitored process puts limit_output in a position to monitor and control it.

        Update: Ah shoot. I think you already mentioned this.

        The problem then is how tee finds out the pid of the producer process?

        Hmmm... I'm currently using IPC::Run::run to invoke the command. Is there an asynchronous way to do this with IPC::Run (similar to using spawn, so that my main process continues to run and then explicitly waits for the subprocess to finish)? In that case, I see a possibility to pass the pid to my tee process. The problem seems to be that IPC::Run on Windows offers only part of the documented features.
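
One possible shape for the "spawn, keep running, wait later" part, sketched in core Perl only. IPC::Run does document start and finish for asynchronous use, but whether they behave on Windows is exactly the open question here, so this sketch avoids the module. The names spawn_async and wait_for are made up for illustration; the Win32 branch relies on the system(1, LIST) form described in perlport, and the check on its return value is an assumption about how failure shows up.

```perl
use strict;
use warnings;
use POSIX ();

# Hypothetical helper: start @cmd without waiting for it and return its pid.
sub spawn_async {
    my @cmd = @_;
    if ($^O eq 'MSWin32') {
        # perlport: on Win32, system(1, LIST) spawns the process and returns
        # its process id immediately instead of waiting for it.
        my $pid = system(1, @cmd);
        # Assumption: a failed spawn yields a value that is not a positive pid.
        die "cannot spawn '@cmd'" unless defined $pid && $pid > 0;
        return $pid;
    }
    # Elsewhere, fall back to the usual fork/exec pair.
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    if ($pid == 0) {
        # Child: replace ourselves with the command, bail out if that fails.
        exec { $cmd[0] } @cmd or POSIX::_exit(127);
    }
    return $pid;
}

# Hypothetical helper: block until $pid ends and return its exit code.
sub wait_for {
    my ($pid) = @_;
    waitpid($pid, 0);
    return $? >> 8;
}
```

The main process would call spawn_async, hand the returned pid to the tee, carry on with its own work, and call wait_for when it needs the subprocess to be finished.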

        -- 
        Ronald Fischer <ynnor@mm.st>