Tanktalus has asked for the wisdom of the Perl Monks concerning the following question:

Sometimes, in an automated process, we get surprised by some output that we, um, didn't properly trap. As in, "this could never fail, so..." Famous last words. So, as a self-diagnostic aid, I figured I'd just replace my script with a wrapper script that redirects both stdout and stderr to some file on the shared disk that everything else is put onto. That way, when I'm looking at everything else that gets generated, I can also see any error/warning messages that perl puts out, which may help explain why something I expected to happen didn't.

Then I realised that the output directory is extremely variable, and that I won't know it until well after the first text has been printed. "Simple," says I. "I'll just use another module that delays output until I tell it where the output should go, and stores it until then." Unfortunately, I can't seem to find IO::File::Delayed on CPAN. Perhaps this isn't such a common requirement after all.

So I'm thinking of creating something derived from IO::Handle where, if a file is given, we open and write to it. If no file is given, we just append any text to an internal scalar (or push it onto an internal array?). Once a file name is given, we open the file and write out all our buffered data. Maybe also a "pause" or "stop" method which closes the file, but allows us to continue writing to the filehandle.
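
To make that concrete, here's roughly the interface I'm picturing - a bare-bones sketch, not a proper IO::Handle subclass yet, and all the method names (print, set_file, pause) are made up:

    package IO::File::Delayed;   # the module name I went looking for
    use strict;
    use warnings;

    sub new { bless { buffer => '', fh => undef }, shift }

    sub print {
        my ( $self, @text ) = @_;
        if ( $self->{fh} ) {
            print { $self->{fh} } @text;          # file known: write through
        }
        else {
            $self->{buffer} .= join '', @text;    # no file yet: buffer it
        }
    }

    sub set_file {
        my ( $self, $path ) = @_;
        open $self->{fh}, '>>', $path or die "open $path: $!";
        print { $self->{fh} } $self->{buffer};    # flush everything buffered so far
        $self->{buffer} = '';
    }

    sub pause {
        my $self = shift;
        close $self->{fh} if $self->{fh};
        $self->{fh} = undef;    # back to buffering until set_file is called again
    }

    1;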

Two questions: first, did I miss something on CPAN or the like? Second, if not, does this sound reasonable?

Thanks,

Re: Delayed-write object/module
by BrowserUk (Patriarch) on Feb 14, 2006 at 05:13 UTC

    Using a ramfile (or maybe IO::String for pre-5.8) for your buffering saves a lot of complication. This only requires two methods and 10 lines of code:

    #! perl -sw

    package IO::Delayed;

    sub new {
        my $class = shift;

        ## Create an anonymous glob
        my $self = do{ local *GLOB; \*GLOB };

        ## Use it to hold both the filehandle
        ## and a ramfile
        open *$self, '>', \${ *$self } or die $!;

        ## bless and return it
        return bless $self, $class;
    }

    sub setPath {
        my( $self, $path ) = @_;

        ## Re-use the glob for the real file
        open $self, '>', $path or die $!;

        ## Output the buffered data
        print $self ${ *$self };

        ## Free up the storage
        undef ${ *$self };
        return;
    }

    return 1 if caller;

    package main;

    my $log = IO::Delayed->new;          ## create a rambuffered filehandle

    print $log "$_\n" for 1 .. 10;       ## log some stuff to it

    $log->setPath( 'mylog.out' );        ## Now redirect it to disk

    print $log "$_\n" for 100 .. 110;    ## and it becomes a normal filehandle

    __END__

    C:\test>junk4

    C:\test>type mylog.out
    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    100
    101
    102
    103
    104
    105
    106
    107
    108
    109
    110
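
    For a pre-5.8 perl, the same buffer-then-flush idea with IO::String might look something like this (just a sketch; here the switch to the real file is done by hand instead of reusing one glob):

    use IO::String;

    my $buf = '';
    my $fh  = IO::String->new($buf);    ## writes accumulate in $buf

    print $fh "early output\n";         ## buffered in memory

    ## once the real path is known, switch handles by hand:
    open my $real, '>', 'mylog.out' or die $!;
    print $real $buf;                   ## flush the buffered text
    $fh = $real;                        ## later prints go straight to disk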

Re: Delayed-write object/module
by davidrw (Prior) on Feb 14, 2006 at 02:01 UTC
    Can your wrapper just be something like this? That is, does it really need to figure out the final directory during the run, or can it wait until afterwards?
    #!/bin/bash
    out=/tmp/wrapper.$$
    perl some_script.pl > $out 2>&1  # run your script, trapping everything
    dest=`perl -x $0 $out`           # examine logfile, figure out right place
    mkdir -p ${dest%/*}              # make sure the directory exists
    mv $out $dest                    # move the log into place
    exit
    ###############################
    #!/usr/bin/perl
    use strict;
    use warnings;
    my $filename = $ARGV[0];
    my $output;
    # read in $filename, figure out where it should go; set $output
    print $output;
    (Note that the perl code could use File::Copy's move() as well, and that the bash script could check the exit code of the perl call.)

    Update: Hmm.. re-reading the OP, I see "some file on the shared disk that everything else is put into".. So I guess the mv should be an append, and I guess my main question gets re-phrased as "do you need to append as you go (interleaving with other logs), or can it append as one big chunk at the end?"

      I think you read it correctly the first time. We put everything into /nfsdisk/run-specific/path/here. The logs get put into /nfsdisk/run-specific/path/here/Logs. There are actually multiple logs. I'm just planning to put the stdout and stderr into /nfsdisk/run-specific/path/here/Logs/script-name.out.

      My backup plan was to do something mostly similar to what you have - just update the perl code to print "Logs directory: /nfsdisk/run-specific/path/here" somewhere in its output, grep and cut that out, and append the outfile name, then cp or mv the log over. However, I thought it'd be really neat if I could do away with some of my intermediate files, because then I could actually watch my code's progress as it is written onto the NFS disk - from another machine, while the code runs under someone else's account (I don't have permission to write to that disk, so someone else runs it for me).
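
      The perl side of that fallback would be roughly this (the marker text and file names are just illustrative, and File::Copy's copy() stands in for the cp):

      #!/usr/bin/perl
      use strict;
      use warnings;
      use File::Copy qw(copy);

      # scan the captured output for the marker line the script printed,
      # then copy the temp log into place
      my $out = shift;    # temp file from the wrapper
      my $logdir;
      open my $fh, '<', $out or die "open $out: $!";
      while (<$fh>) {
          if ( m{^Logs directory:\s*(\S+)} ) { $logdir = $1; last }
      }
      close $fh;
      die "no 'Logs directory:' line in $out\n" unless defined $logdir;
      copy( $out, "$logdir/script-name.out" ) or die "copy: $!";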

      This would also be somewhat handy for some other tracing and logging I'd like to kick off. Right now, I need to execute the equivalent of about 10,000 lines of code before I can turn on tracing or logging, and I obviously can't trace that execution. If I could have the tracing and logging write through IO::File::Delayed objects, then when I set the log path (which is where my tracing goes as well, if it's on), all the log and trace entries thus far would be written out immediately, and I'd have a full record of what is going on. As it is, I can't trace anything until I've parsed enough config files and command-line params to definitively locate where the trace goes; only then can I continue parsing config files and command-line params with trace enabled.
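
      With the hypothetical IO::File::Delayed sketched above, the tracing would work something like this ($logdir and the file names are placeholders):

      my $trace = IO::File::Delayed->new;    # buffers in memory for now

      $trace->print("parsing platform config...\n");    # long before the
      $trace->print("reading command line...\n");       # log path is known

      # ...thousands of executed lines later, the path is finally known:
      $trace->set_file("$logdir/script-name.trc");   # everything so far flushes
      $trace->print("tracing straight to disk now\n");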

      Now, before someone gasps at "10,000 lines", think of this: put your configuration into XML files, use Getopt::Long for your command-line parsing, and use platform detection to figure out which XML files to use and how to interpret them. To read, parse, and decode all that, you'll run through thousands of lines of code - just not all your own. And if you count going through the same line of code multiple times as multiple lines of code being executed, 10,000 shouldn't seem unreasonable. It's just an unorthodox way of counting lines: I'm talking about "executed" lines, not "unique" lines.