daverave has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks,

I am trying to execute bp_genbank2gff3.pl (BioPerl) from inside a Perl script. I just need it to convert a file between formats; I don't read the resulting files later in my script, but pass them to a Java program. This is why I think it's OK to execute this script as-is (generally, I guess it would be "prettier" to call some BioPerl subroutine that does the same job as the script does, but I couldn't find a simple one).

Anyway, I am using a general helper subroutine that I wrote (run_command), shown below.

use strict;
use warnings;
use Cwd;
use 5.010;

# run command and print output to stdout, file, both or none.
# $print_to_stdout is boolean, $output_filename is optional
sub run_command {
    my ( $command_string, $print_to_stdout, $output_filename ) = @_;
    say "### run_command (from ", getcwd(), " ): ", $command_string, " ###";

    # create output file if needed
    my $output_fh;
    if ( defined $output_filename ) {
        open $output_fh, '>', $output_filename
            or die "can't open $output_filename: $!";
        print $output_fh "### run_command: ", $command_string, " ###\n";
    }

    # execute command
    open( my $command_out, "-|", $command_string )
        or die "can't run $command_string: $!";
    if ( $print_to_stdout || defined $output_fh ) {
        while (<$command_out>) {
            print if $print_to_stdout;
            print $output_fh $_ if defined $output_fh;
        }
    }
    close $command_out;    # wait until command has finished
    close $output_fh if defined $output_fh;
}

And define:

my $command = "bp_genbank2gff3.pl -y -o out_dir some_genbank_filename";

(replace out_dir and some_genbank_filename with real names)

Now, any of these will work fine (the files will be created in the output dir):

run_command( $command, 1, undef );
run_command( $command, 0, "some_file_name" );
run_command( $command, 1, "some_file_name" );
but this will not work (no error but also no files created):
run_command( $command, 0, undef );
Why? It's as if the script knows whether I'm keeping track of its stdout.

Thank you, Dave

Replies are listed 'Best First'.
Re: Problems executing a (bioperl) script from another script
by jethro (Monsignor) on Aug 13, 2010 at 10:53 UTC
    if ( $print_to_stdout || defined $output_fh )

    Since both are false, you never read the output from the command. If there are a few lines of output before bp_genbank2gff3.pl generates any files, it might just wait for the (line-oriented) output buffer to empty before continuing (I don't know exactly without trying it out, but it seems a reasonable assumption). You can check that easily by just removing the if clause, i.e. just do this

    # if ( $print_to_stdout || defined $output_fh ) {
        while (<$command_out>) {
            print if ($print_to_stdout);
            print $output_fh $_ if ( defined $output_fh );
        }
    # }
      If both are false I indeed don't want to read the output from the command. In any case, the command should generate files. What it outputs to stdout are just log messages.
Re: Problems executing a (bioperl) script from another script
by daverave (Scribe) on Aug 13, 2010 at 12:42 UTC
    OK, I simplified things as much as I can to narrow down the problem. This works:
    open( my $command_out, "-|", $command );
    sleep 3;
    close $command_out;
    and this does not:
    open( my $command_out, "-|", $command );
    close $command_out;
    Why?
    Isn't `close` supposed to block until the command is done?
    Closing any piped filehandle causes the parent process to wait for the child to finish... (from open)
      This is the last paragraph in the man page that you get when you run "perldoc -f close":
      Prematurely closing the read end of a pipe (i.e. before the process writing to it at the other end has closed it) will result in a SIGPIPE being delivered to the writer. If the other end can’t handle that, be sure to read all the data before closing the pipe.

      I think this is relevant to your case. You don't seem to be reading anything from the pipe file handle after you open it, so maybe you don't really want to use open( $fh, "-|", $command ) in this case -- use a system call instead.
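      If the output really isn't needed at all, a minimal sketch of the system route might look like this (the /dev/null redirection assumes a Unix-like shell; run_command_quiet is a hypothetical name, not part of the original sub):

```perl
use strict;
use warnings;

# Run a command without capturing its output. Redirecting to /dev/null
# (assumes a Unix-like system) discards the log messages instead of
# leaving an unread pipe behind, so no SIGPIPE can hit the child.
sub run_command_quiet {
    my ($command_string) = @_;
    system("$command_string > /dev/null 2>&1") == 0
        or die "command failed: $command_string (status $?)";
}
```

      system() blocks until the child exits, so the files are guaranteed to exist by the time the sub returns.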

      If you actually do need to read output from the command, just read from the pipe file handle with the usual  while (<$fh>) {...} idiom, or use "slurp" mode on it. In either case, EOF will be detected when the sub-process finishes -- just don't close the handle before that happens.
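      A sketch of that idea, draining the pipe unconditionally and then checking the child's exit status after close (drain_command is a hypothetical helper, not from the original post):

```perl
use strict;
use warnings;

# Read (and here simply discard) everything the child writes, then
# close the handle. close() on a piped handle waits for the child
# process to finish and puts its status in $?.
sub drain_command {
    my ($command_string) = @_;
    open( my $command_out, "-|", $command_string )
        or die "can't run $command_string: $!";
    while (<$command_out>) {
        # log messages ignored; print or save them here if wanted
    }
    close $command_out;
    return $? >> 8;    # child's exit status
}
```

      Usage would be e.g. my $status = drain_command($command); with $status being 0 on success.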