in reply to [SOLVED] Capturing errors from 3-arg pipe open in ActivePerl 5.020

The problem is that perl starts cmd.exe, which is then asked to run the non-existent command. So the pid is for the shell instance. (Watch Task Manager to see this happen.)

To avoid that problem you can try passing the fully qualified pathname of the command you want to run.

Without the command being fully qualified, 'caeser' might be caeser.exe or caeser.bat or caeser.cmd or caeser.pl or caeser.vb etc. And it might be in the current directory, or somewhere in the path or.... Rather than Perl having to emulate all of the possibilities, unqualified commands are passed to the shell to do what it does.


With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority". I knew I was on the right track :)
In the absence of evidence, opinion is indistinguishable from prejudice.
Re^2: Capturing errors from 3-arg pipe open in ActivePerl 5.020
by ateague (Monk) on Nov 16, 2015 at 18:30 UTC

    Thank you. That certainly makes sense. I feel I should probably back up and clarify the root problem, though.

    I am working with a collection of PDF files and am using the pdftohtml.exe program to convert the PDF into an XML stream in order to extract text of interest with XML::Twig:

        open (my $XML, "-|", "e:\\path\\to\\pdftohtml.exe -xml -zoom 1.4 -stdout $PDF_FILE")
            or die "pdftohtml failed:\n$!\n$^E";

        my $t = XML::Twig->new(
            twig_handlers => {
                '/pdf2html/pagetext[(@top >= 180 and @top <= 190) and (@left >= 100 and @left <= 111)]' => \&RouteTo,
                '/pdf2html/pagetext[(@top >= 215 and @top <= 225) and (@left >= 260 and @left <= 270)]' => \&InvoiceSort,
                '/pdf2html/page' => sub { $_[0]->purge; 1; },  # free memory after every page
            },
            comments   => 'drop',    # remove any comments
            empty_tags => 'normal',  # empty tags = <tag/>
        );

        $t->parse($XML);
        close $XML;

    The problem is that if I fat-finger the open command (e.g. type "-zom" instead of "-zoom" in the command arguments), or if "$PDF_FILE" could not be found, the program merrily continues on its way, unaware that $XML is undefined. I've been working around this by wrapping the "$t->parse" in an eval block to catch this, but I was wondering if there was a better way.

      1. The problem is that if I fat-finger the open command (e.g. type "-zom" instead of "-zoom" in the command arguments)

        You ought to detect that kind of error the first time you test your script; so correct the typo.

      2. or if "$PDF_FILE" could not be found, the program merrily continues on its way, unaware that $XML is undefined.

        This depends on what the executable does in that situation. I'll assume it does the sensible thing: outputs an error message and then exits with a non-zero exit code.

        Normally, if you were reading the pipe yourself, your first attempt to read it would get an end of file (with a pipe abandoned status) and you could then call waitpid on the pid returned by the open, and check $? to obtain the exit code and status.

        As you are passing the filehandle into a module, the simplest check would be to call eof on the filehandle before you give it to XML::Twig; and if there's nothing to read, don't pass it on; just waitpid and check $?

      It can get more complicated if the executable is one of those that tries to be 'helpful' and hangs around rather than just exiting on error; but let's assume it's not :)
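
      A minimal sketch of the eof check described above, under some stated assumptions. The helper name open_checked_pipe is hypothetical and does not come from this thread. The sketch also relies on close rather than an explicit waitpid: closing a piped handle waits for the child and sets $?. It assumes the child behaves sensibly, i.e. writes nothing to STDOUT and exits non-zero when it fails.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): open a read pipe from $command
# and return the handle only if the child actually produced some output.
sub open_checked_pipe {
    my ($command) = @_;

    open(my $fh, '-|', $command)
        or die "Cannot start '$command': $!\n";

    # eof blocks until the child writes something or closes its end of the pipe.
    if (eof $fh) {
        close $fh;              # on a piped handle, close waits for the child and sets $?
        my $exit = $? >> 8;     # the high byte of $? holds the child's exit code
        die "'$command' produced no output (exit code $exit)\n";
    }

    return $fh;
}
```

      The returned handle can then be passed to whatever consumes the stream (for example $t->parse($fh)), with a final close and a look at $? once the consumer is done. When the command goes through the shell, the exit code reported is the shell's, which is normally that of the command it ran.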


        As you are passing the filehandle into a module, the simplest check would be to call eof on the filehandle before you give it to XML::Twig;

        That certainly did the trick, thank you very much!

        pipe.pl
            #!/usr/bin/perl
            use 5.020;
            use strict;
            use warnings;

            open (my $ARTICLE, "-|", "caesar");
            eof $ARTICLE and die "Can't start caesar:\n$!\n$^E";
            my $read = <$ARTICLE>;
            say "[$read]";

        Results:

            perl pipe.pl
            'caesar' is not recognized as an internal or external command,
            operable program or batch file.
            Can't start caesar:
            Inappropriate I/O control operation
            The handle is invalid at pipe.pl line 7.
      ...eval block to catch this, but I was wondering if there was a better way.
      From the XML::Twig docs:

      safe_parse ( SOURCE [, OPT => OPT_VALUE ...])

      This method is similar to parse except that it wraps the parsing in an eval block. It returns the twig on success and 0 on failure (the twig object also contains the parsed twig). $@ contains the error message on failure.

      Note that the parsing still stops as soon as an error is detected, there is no way to keep going after an error.

              “The sources of quotes found on the internet are not always reliable.” — Abraham Lincoln

        Thank you for that tip.

      If I'm understanding correctly, you're basically wanting to call another program from your Perl code and capture the STDOUT and STDERR of that program so that your code can determine if the program ran successfully or encountered errors. Is that correct?

      If the description above is correct, then my approach would be to leverage Capture::Tiny instead of using the piped open construct.

        If I'm understanding correctly, you're basically wanting to call another program from your Perl code and capture the STDOUT and STDERR of that program so that your code can determine if the program ran successfully or encountered errors. Is that correct?

        No, not quite. I am wondering why the 3-arg pipe open example provided by Perldoc does not work as expected.

        (N.B. I have updated the OP to clarify this)