karden has asked for the wisdom of the Perl Monks concerning the following question:

Hi all,

I am using IPC::Open2 to interact with an external program. My Perl file is executed once for every document in a large collection. Each time it runs, a new connection is made to the external program (via open2), initialization commands are sent to the program, and then the connection is closed. Each of these steps takes little time on its own, but the cost adds up over the whole collection.

What I am hoping for is: my file calls another file which creates (only if not created before) and returns the read/write handles with an open2 command.

Am I looking for something like PHP's require_once?

Re: batch processing via single open
by daxim (Curate) on Jul 15, 2007 at 23:22 UTC
    You're talking quite in the abstract here. What prevents you from implementing exactly what you've said in the second paragraph? Do you know about subroutines? There's hardly a reason to have a perl file (program) call another one in a new process.
      Because I don't know how to do it. How can a piece of code understand if there already exists a connection to an external program so that it can re-use it without opening a new one thus saving time?
        our $ALREADY_CONNECTED = 0; # this var's purpose is called a semaphore
        DOCUMENT: for my $document (@collection) {
            if (!$ALREADY_CONNECTED) {
                open2(...); # and save the handles for later usage
                $ALREADY_CONNECTED = 1; # whoa! magic!
                redo DOCUMENT;
            } else {
                # blah blah process $document here,
                # the in/out handles exist.
            };
        };
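        A runnable variant of the same idea might look like the sketch below. It is only an illustration: `cat` stands in for the external program (it echoes each line back), and the names connection, process_document, disconnect and $connect_count are made up for this example. It also assumes the program answers every input line with exactly one output line.

```perl
use strict;
use warnings;
use IPC::Open2;

my ($pid, $from_prog, $to_prog);   # stay undefined until the first call
my $connect_count = 0;             # only here to show how often open2 ran

# Return the read/write handles, opening the connection on first use only.
sub connection {
    unless (defined $pid) {
        # ASSUMPTION: 'cat' is a stand-in for the real external program.
        $pid = open2($from_prog, $to_prog, 'cat');
        $connect_count++;
    }
    return ($from_prog, $to_prog);
}

# Send one line for a document and read one line back.
sub process_document {
    my ($document) = @_;
    my ($in, $out) = connection();
    print {$out} "$document\n";
    my $reply = <$in>;
    chomp $reply if defined $reply;
    return $reply;
}

# Close both handles and reap the child process.
sub disconnect {
    return unless defined $pid;
    close $to_prog;
    close $from_prog;
    waitpid $pid, 0;
    undef $pid;
}
```

        Calling process_document() for each entry of @collection and disconnect() once at the end keeps a single child process alive for the whole loop. This only helps while everything runs inside one perl process.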
Re: batch processing via single open
by shmem (Chancellor) on Jul 16, 2007 at 07:43 UTC
    Am I looking for something like PHP's require_once?
    You are looking for require. If a file is successfully loaded via require, it won't be loaded again during the program's lifetime. On subsequent calls, require'ing the same file just returns 1 (or some other bizarre but "true" value).
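    A small sketch of that behaviour, assuming a throwaway file written to a temporary directory as a stand-in for a real db.pl (the counter $LOAD_COUNT is made up for the demonstration):

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;

our $LOAD_COUNT = 0;   # incremented by the body of the required file

# Write a throwaway stand-in for db.pl that counts how often its body runs.
my $dir  = tempdir(CLEANUP => 1);
my $path = File::Spec->catfile($dir, 'db.pl');
open my $fh, '>', $path or die "cannot write $path: $!";
print {$fh} "\$main::LOAD_COUNT++;\n1;\n";
close $fh or die "cannot close $path: $!";

# Require the same file three times; only the first call executes it.
for my $attempt (1 .. 3) {
    require $path;
}

print "file body ran $LOAD_COUNT time(s)\n";
print "path is recorded in %INC\n" if exists $INC{$path};
```

    The bookkeeping lives in %INC, which belongs to the running perl process. A fresh perl process starts with an empty record and will execute the file again.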

    From your post the calling semantics are not clear to me. "My perl file is executed for every document in a large collection" - how is this perl file invoked? From a shell, or from perl itself? If you are invoking perl anew for each file in the set (and exiting perl when done), you won't gain much with require.

    If you showed some code it would be easier to help you. See How (Not) To Ask A Question.

    --shmem

    _($_=" "x(1<<5)."?\n".q·/)Oo.  G°\        /
                                  /\_¯/(q    /
    ----------------------------  \__(m.====·.(_("always off the crowd"))."·
    ");sub _{s./.($e="'Itrs `mnsgdq Gdbj O`qkdq")=~y/"-y/#-z/;$e.e && print}
      Thank you shmem.

      I am to run this perl file from shell for every file in a large corpus (I have to, because this is only a small part of a large project. Execution continues with various other languages and scripts and so on afterwards), not from another perl file. So daxim's highly sophisticated code is unfortunately useless to me.

      As you said, "require" would not bring me much advantage, because when execution finishes the connection will be closed automatically even if I do not close() it. And when the next file is processed, the same perl code will be executed, and the "require"d file will be executed again as well.

      Roughly, I want to have the following:
      db.pl

      open2 (\*INP, \*OUTP, 'path_to_external_program');
      print "included";
      1;

      maincode.pl
      require "db.pl";
      $file = $ARGV[0]; #the next file to process
      print OUTP some_command_using_$file; #exec some cmds on external prog
      #more lines follow afterwards

      I want maincode.pl to run for every file, but db.pl to print "included" only once, so that for the whole collection of thousands of documents I have only a single connection throughout the batch execution. Possible in a way?
        Possible in a way?

        Of course, yes! That's why I asked about calling semantics. Which part is gathering the collection of files to iterate over? The shell script? Do you want perl to gather the files? What system are you on?

        <handwaving style="amount: lots"> If you are gathering the file via a shell script, you could do something along these lines

        (
          while read command param; do   # whatever method
            files=`command param`        # to gather the files
            for file in $files; do       # process files
              echo $file
            done
          done
        ) | maincode.pl

        and in maincode.pl

        #!/usr/bin/perl
        use IPC::Open2;
        my $cmd = 'whatever'; # really, I have no idea what you are doing
        # one connection for the whole run; open2 dies by itself on failure
        my $pid = open2(\*CHLD_OUT, \*CHLD_IN, $cmd);
        while(<STDIN>) {
            chomp;
            my $file = $_; # now do whatever with $file.
            open(I,'<', $file) or die "can't open $file: $!\n";
            while (my $line = <I>) {
                ... # do whatever with each line in the file
            }
            close(I);
        }

        but since I still don't know what you're up to, I can't give proper advice. See I know what I mean. Why don't you?

        Maybe you really want some client/server stuff, or the perl code to act as a stream filter and dispatch its output to somewhere else. Each usage has different semantics; how can I know which are required without a bit more explanation from you?
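        For the client/server variant, a rough sketch under several assumptions: a long-running perl process holds the single open2 connection and listens on a Unix-domain socket, and the per-file script only connects, sends one line, and reads one line back. The names serve and ask, the socket path, the one-line-in/one-line-out protocol, and the $max_requests limit are all invented for illustration.

```perl
use strict;
use warnings;
use IPC::Open2;
use IO::Socket::UNIX;

# Server side: keep one open2 connection and answer one line per client.
# $max_requests only bounds this demo; a real daemon would loop until stopped.
sub serve {
    my ($socket_path, $max_requests, @command) = @_;
    my $pid = open2(my $from_prog, my $to_prog, @command);

    unlink $socket_path;   # remove a stale socket from an earlier run
    my $server = IO::Socket::UNIX->new(
        Type   => SOCK_STREAM(),
        Local  => $socket_path,
        Listen => 5,
    ) or die "cannot listen on $socket_path: $!";

    for my $n (1 .. $max_requests) {
        my $client = $server->accept or die "accept failed: $!";
        my $request = <$client>;
        if (defined $request) {
            print {$to_prog} $request;        # forward to the external program
            my $reply = <$from_prog>;         # ASSUMPTION: one line comes back
            print {$client} $reply if defined $reply;
        }
        close $client;
    }

    close $to_prog;
    close $from_prog;
    waitpid $pid, 0;
    unlink $socket_path;
}

# Client side: what the script invoked once per file would do.
sub ask {
    my ($socket_path, $line) = @_;
    my $socket = IO::Socket::UNIX->new(
        Type => SOCK_STREAM(),
        Peer => $socket_path,
    ) or die "cannot connect to $socket_path: $!";
    print {$socket} "$line\n";
    my $reply = <$socket>;
    close $socket;
    chomp $reply if defined $reply;
    return $reply;
}
```

        The server would be started once before the batch; each shell-level invocation then pays only for a socket connect instead of a fresh open2 plus initialization.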

        --shmem
