talexb has asked for the wisdom of the Perl Monks concerning the following question:

I am trying to build a mod_perl handler that will spawn and keep open a Ghostscript session that produces page images from a specified document. The first step is to run Ghostscript and get it to produce a single page image from that document, and it's been a challenge.

I am re-using some proprietary Perl code that currently works fine when running from within another script by using IPC::Open3 and using pipes to communicate with Ghostscript. Running the same code under mod_perl causes problems: mod_perl complains that it can't find method FILENO in package Apache::RequestRec, and sure enough with a little googling we find that you can't use IPC::Open3 under mod_perl.

My first choice was to use IPC::Run, which seemed to work fine in my test programs from the command line but failed when run under mod_perl. Right now it seems I'm stuffing data into the input scalar ref, but it's not getting read, and I'm not sure why. (Yes, I do pump every once in a while.) There is a debug switch for this module, but it sends the output to STDOUT, which isn't available under mod_perl.

Off to IRC, good old #perl on freenode, where on Friday Caelum suggested using Expect instead. I made up a test program, it worked a treat, the code's much cleaner, so today I plugged it in, and wound up with a similar error to the one with IPC::Open3: mod_perl hates it when someone tries to close STDIN, which is what Expect is doing.

So today, after a bit of googling I came up with this link which seemed to be what I was looking for .. unfortunately there's no code for me to look at :(

I initialize as follows:

    my ( $in, $out, $timer );

    $logger->debug("Calling start ..");
    $$self{_h} = start ( \@cmd, \$in, \$out, $timer = timer ( 5 ) )
        or $logger->logdie("start returned $?:$!");
    $logger->debug("Back from start ..");

    $$self{_in}    = \$in;
    $$self{_out}   = \$out;
    $$self{_timer} = \$timer;

I am sending the data like this:

    $$self{_in} .= $thisLine . "\n";
    $$self{_h}->pump_nb;

Along the way I check out length($$self{_in}), but to my dismay it does not decrease -- there is no action.
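One thing worth checking in the two snippets above (an observation from reading the posted code, not a confirmed diagnosis): `$$self{_in}` is the hash slot `$self->{_in}`, which was set to a reference to `$in`. Appending to that slot with `.=` stringifies the reference and never touches `$in`, the scalar that was handed to `start`. The sketch below is plain Perl with no IPC::Run involved; `$broken` and the sample line `showpage` are stand-ins for illustration.

```perl
use strict;
use warnings;

my $in   = '';
my $self = {};
$$self{_in} = \$in;                  # the slot now holds a reference to $in

# Appending to the slot itself, as in the posted snippet. A shallow copy is
# used here so that $self keeps its reference for the second half.
my $broken = { %$self };
$$broken{_in} .= "showpage\n";
printf "append to slot:      length(\$in) = %d, slot is %s\n",
    length($in), ( ref $$broken{_in} ? 'still a reference' : 'now a plain string' );

# Appending through the reference changes the scalar that was passed by
# reference when the slot was set up.
${ $$self{_in} } .= "showpage\n";
printf "append through ref:  length(\$in) = %d\n", length($in);
```

If this is what the real handler does, `length($$self{_in})` would be measuring the stringified reference plus the appended lines, which would explain why it never shrinks. Dereferencing with `${ $$self{_in} }` for both the append and the length check may be worth a try.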

So at this point, it seems that IPC::Run will do the trick, if I can only figure out how to unstick the input pipe going to Ghostscript, and thus produce some output, which would be useful. Any suggestions welcome.

Alex / talexb / Toronto

"Groklaw is the open-source mentality applied to legal research" ~ Linus Torvalds

Re: Trying to use IPC::Run under mod_perl
by perrin (Chancellor) on Dec 11, 2006 at 22:18 UTC
    I know I said this before, but I really think you should ask this question on the mod_perl list. I guarantee that other people there have used IPC::Run with mod_perl, as well as other IPC approaches with pipes.
      I responded on the mod_perl list too. My suggestion is to abandon trying to make this work from mod_perl. The best thing to do would be to create a daemon process that listens for connections from your mod_perl handler and either returns image data or drops files somewhere that your img src links can point to.

      This daemon would also be in charge of closing files left open for more than, say, 10 minutes (the user has moved on by that point).

      None of this is going to scale, by the way. Converting PDFs to images isn't really something you want to do on the fly. Sure, it'll work, for the most part, but unless you're creating the document dynamically and then also creating the thumbnails dynamically from that, it's going to be more efficient to convert all of your PDFs to images ahead of time.
Re: Trying to use IPC::Run under mod_perl
by jbert (Priest) on Dec 11, 2006 at 21:23 UTC
    If pipes to and from your sub-process are giving you so much pain under mod_perl, is there a reason why you don't just put the input in a file and run ghostscript so it reads from that file and puts the output in another?

    Are you trying to keep a ghostscript process running over the long-term, handling multiple requests, or something like that?
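A minimal sketch of the file-in, file-out approach, assuming PDF input and a `gs` binary on the PATH. The helper names (`gs_page_command`, `render_page`), the `png16m` device, the 72 dpi default, and the paths are illustrative choices, not anything from the original code, and this has not been tried under mod_perl.

```perl
use strict;
use warnings;

# Build a Ghostscript command line that renders one page of a document
# to an image file. No pipes: input and output are both named files.
sub gs_page_command {
    my (%arg) = @_;
    return (
        $arg{gs} // 'gs',
        '-q', '-dSAFER', '-dBATCH', '-dNOPAUSE',
        '-sDEVICE=png16m',
        '-r' . ( $arg{dpi} // 72 ),
        "-dFirstPage=$arg{page}",
        "-dLastPage=$arg{page}",
        "-sOutputFile=$arg{output}",
        $arg{input},
    );
}

# Run the command without going through a shell; true on success.
sub render_page {
    my @cmd = gs_page_command(@_);
    return system(@cmd) == 0;
}

my @cmd = gs_page_command(
    input  => '/tmp/doc.pdf',       # hypothetical input document
    output => '/tmp/doc-p5.png',    # hypothetical output image
    page   => 5,
);
print join( ' ', @cmd ), "\n";
```

This starts a fresh Ghostscript for every page, so it trades the long-lived session for simplicity.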

        If pipes to and from your sub-process are giving you so much pain under mod_perl, is there a reason why you don't just put the input in a file and run ghostscript so it reads from that file and puts the output in another?

      To get the thing running, I'm just doing one command (generate this page image), but eventually, yes, I'll be initializing a Ghostscript for a particular document, then coming back to it later to say, OK, now give me a page image for page 5. OK, I'm back, give me page 6. Me again, page 7 please.

      Eventually I'll have a hash holding pointers to waiting Ghostscript processes (one for each document) and some intelligence that goes around and says, "OK, you've waited long enough without anything to do, away you go." I can't do that until I can get IPC::Run figured out.
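The bookkeeping for that hash can be sketched independently of how the processes are actually driven. Everything here is an assumption for illustration: the names `touch_session` and `reap_idle`, the string stand-ins for process handles, and the 600-second limit (borrowed from the 10-minute figure mentioned elsewhere in this thread).

```perl
use strict;
use warnings;

my %sessions;            # document id => { handle => ..., last_used => epoch seconds }
my $MAX_IDLE = 600;      # seconds a session may sit unused

# Record a session for a document, or refresh its last-used time.
sub touch_session {
    my ( $doc, $handle, $now ) = @_;
    $now //= time;
    $sessions{$doc} ||= { handle => $handle };
    $sessions{$doc}{last_used} = $now;
    return $sessions{$doc};
}

# Remove sessions idle for longer than $MAX_IDLE and return their ids.
# $on_expire, if given, is called with the id and handle so the caller
# can shut the process down.
sub reap_idle {
    my ( $now, $on_expire ) = @_;
    $now //= time;
    my @expired = grep { $now - $sessions{$_}{last_used} > $MAX_IDLE } keys %sessions;
    @expired = sort @expired;
    for my $doc (@expired) {
        my $session = delete $sessions{$doc};
        $on_expire->( $doc, $session->{handle} ) if $on_expire;
    }
    return @expired;
}

touch_session( 'report.pdf', 'handle-A', 1_000 );
touch_session( 'manual.pdf', 'handle-B', 1_500 );
my @gone = reap_idle( 1_700, sub { print "closing $_[0]\n" } );
print "still open: ", join( ', ', sort keys %sessions ), "\n";
```

Passing the clock in as an argument keeps the expiry logic easy to exercise without waiting. Note that a table like this only helps if every request for a document reaches the same process that holds it.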

        Are you trying to keep a ghostscript process running over the long-term, handling multiple requests, or something like that?

      Yes.

      Alex / talexb / Toronto

      Uh, actually yes, he said that in the first sentence. :)

      --Brig

        Ouch. Sorry about that. Buffer overflow? No, this is perl. Curses.

        Well, it seems likely that the problems are related to mod_perl's fun and games with stdin/stdout. <googles> Is this any good to you? "Apache2::SubProcess provides the Perl API for running and communicating with processes spawned from mod_perl handlers."

        And if it does all end up too painful, you can still use file-based queueing with a long-lived process. Have a dropoff dir for a work queue to the long-running gs and a pickup queue for finished pages. You'll need a little wrapper around ghostscript, but it would be started as an independent daemon.

        File-based queueing does involve some tedious details though:

        1. Be sure to write to a temp file and rename(), to avoid half-written files (on both queues)
        2. You have a new failure mode where the ghostscript daemon isn't running and stuff just builds up in the out queue.
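The write-then-rename step from point 1 might look like the sketch below. The function names `enqueue` and `pending`, the dot-prefixed temp-file naming, and the job file contents are made up for illustration. The temp file is created in the queue directory itself because rename() is only atomic within a single filesystem.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile tempdir);
use File::Spec;

# Put $data into $queue_dir as $name so that a reader never sees a
# half-written file: write a temp file in the same directory, then rename().
sub enqueue {
    my ( $queue_dir, $name, $data ) = @_;
    my ( $fh, $tmp ) = tempfile( '.tmp-XXXXXX', DIR => $queue_dir );
    binmode $fh;
    print {$fh} $data or die "write $tmp: $!";
    close $fh         or die "close $tmp: $!";
    my $final = File::Spec->catfile( $queue_dir, $name );
    rename $tmp, $final or die "rename $tmp -> $final: $!";
    return $final;
}

# List completed entries, skipping dot-prefixed names (temp files in
# progress, plus the . and .. directory entries).
sub pending {
    my ($queue_dir) = @_;
    opendir my $dh, $queue_dir or die "opendir $queue_dir: $!";
    my @names = grep { !/^\./ } readdir $dh;
    closedir $dh;
    @names = sort @names;
    return @names;
}

my $dir = tempdir( CLEANUP => 1 );      # throwaway queue directory for the demo
enqueue( $dir, 'job-0001.txt', "doc=report.pdf page=5\n" );
print join( ', ', pending($dir) ), "\n";
```

For the failure mode in point 2, the same `pending` listing could be used by a monitor to notice when the work queue keeps growing.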
Re: Trying to use IPC::Run under mod_perl
by shmem (Chancellor) on Dec 12, 2006 at 12:10 UTC
    mod_perl hates it when someone tries to close STDIN, which is what Expect is doing.

    You could just fool Expect... (untested w/mod_perl)

        #!/usr/bin/perl
        BEGIN {
            use IO::Handle;
            *Expect::STDIN = IO::Handle->new;
            require Expect;
        }
        # Expect closes STDIN in sub spawn
        my $exp = Expect->spawn('/bin/sh') or die "fubar: $!\n";
        my $line = <STDIN>;
        print "got via STDIN: $line";

    This works fine standalone. Don't know whether that passes mod_perl's scrutiny, though. For multiple spawns a local *I_FOOL_EXPECT seems more suitable than an IO::Handle object (multiple closes).

    --shmem

    _($_=" "x(1<<5)."?\n".q·/)Oo.  G°\        /
                                  /\_¯/(q    /
    ----------------------------  \__(m.====·.(_("always off the crowd"))."·
    ");sub _{s./.($e="'Itrs `mnsgdq Gdbj O`qkdq")=~y/"-y/#-z/;$e.e && print}
Re: Trying to use IPC::Run under mod_perl
by jbert (Priest) on Dec 12, 2006 at 11:11 UTC
    Looking at IPC::Run in more detail, it seems that it sends its debug output to STDERR, which should go to your apache logs.

    If you've got stdout and stderr mixed up somehow, that might explain why IPC::Run is failing for you. Is there a chance of that?