Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi all and a nice Sunday :-)

There is a non-Perl program, pdftohtml, that converts PDF to HTML (and I want to convert that HTML to text afterwards).
I can do that like this:

	system "pdftohtml x.pdf x.html";
	system "myPerlPrg.pl x.html x.txt";
Not very elegant :-(
I would like to do that without the temporary HTML file x.html.

- Either by reading the pipe in the invoking program (is that possible, and how?):
	..
	system "pdftohtml x.pdf |";
	# now read this pipe | (???)
- Or by piping the output of pdftohtml directly into another Perl program:
	system "pdftohtml x.pdf | myPrg.pl"
but how do I find and open this pipe within myPrg.pl? I can't find it in @ARGV, correct?
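
On the second variant: a program on the right-hand side of a shell pipe receives the upstream output on its standard input (STDIN), not in @ARGV. A minimal sketch of what myPrg.pl might look like, assuming it only needs to strip tags line by line. The html_to_text name is hypothetical, and the regex is far too naive for real HTML:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: copy lines from $in to $out with anything
# that looks like an HTML tag removed (illustration only).
sub html_to_text {
    my ($in, $out) = @_;
    while (my $line = <$in>) {
        $line =~ s/<[^>]*>//g;
        print {$out} $line;
    }
    return;
}

# Piped input arrives on STDIN; @ARGV stays empty.
# Skipped when STDIN is a terminal so the sketch does not
# sit waiting for keyboard input.
html_to_text(\*STDIN, \*STDOUT) unless -t STDIN;
```

It would then be invoked as something like `pdftohtml ... | myPrg.pl > x.txt`, provided pdftohtml can be made to write to stdout.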

Thanks in advance,
carl

Replies are listed 'Best First'.
Re: open a pipe from system..
by Tanktalus (Canon) on May 01, 2005 at 15:26 UTC

    You have the terminology right - you want to open the filehandle:

    open my $fh, 'pdftohtml x.pdf |' or die "Can't run pdftohtml: $!";
    However, that assumes pdftohtml writes its output to stdout rather than to a file; I don't know whether it does.

    The other alternative is for myPerlPrg.pl to fake this by having it take the x.pdf file, create the x.html file, and read it in a single execution. The command line would look like:

    myPerlPrg.pl x.pdf x.txt
    You could create a temporary filename, pass that to pdftohtml, and then use that to read the HTML to convert to text.
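
The temporary-file alternative described above could be sketched with the core File::Temp module. Since pdftohtml may not be installed where this runs, the sketch uses the Perl interpreter itself ($^X) as a stand-in converter that writes a small HTML file. The read_via_tempfile name is illustrative, and it assumes the converter accepts the output filename as its last argument:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Run a converter that writes to a temporary file, then read that
# file back. @cmd is the command plus its arguments; the temporary
# filename is appended as the last argument.
sub read_via_tempfile {
    my (@cmd) = @_;

    # UNLINK => 1 removes the file when the program exits.
    my ($tmp_fh, $tmpfile) = tempfile(SUFFIX => '.html', UNLINK => 1);
    close $tmp_fh;    # the converter reopens the file by name

    system(@cmd, $tmpfile) == 0
        or die "Converter failed: exit status $?";

    open(my $in, '<', $tmpfile) or die "Can't read $tmpfile: $!";
    my @lines = <$in>;
    close $in;
    return @lines;
}

# Stand-in for something like: pdftohtml x.pdf <tmpfile>
my @html = read_via_tempfile(
    $^X, '-e',
    'open my $o, ">", $ARGV[0] or die $!; print $o "<p>hi</p>\n"; close $o',
);
print @html;
```

The caller never sees the intermediate file, which gives the single-command interface described above while still using a file on disk.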

    Hope that helps,

      ehem?

      I can run pdftohtml without using either system or exec?

      Does that mean that

      open my $fh, 'pdftohtml x.pdf |' or die "Can't run pdftohtml: $!";
      1st: runs pdftohtml?
      2nd: creates a filehandle (in $fh)?
      3rd: lets me read the output of pdftohtml from this $fh?
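
The three steps asked about above can be sketched as follows. To keep the example runnable without pdftohtml, `echo` stands in for the real command; the read_command_output name is made up for illustration:

```perl
use strict;
use warnings;

# Run $cmd and return the lines it prints to standard output.
sub read_command_output {
    my ($cmd) = @_;

    # Steps 1 and 2: the trailing '|' makes open start the command
    # and attach its standard output to the new filehandle $fh.
    open(my $fh, "$cmd |") or die "Can't run '$cmd': $!";

    # Step 3: read the command's output like any other file.
    my @lines = <$fh>;

    # close waits for the command to finish and sets $? to its status.
    close($fh) or warn "'$cmd' exited with status $?\n";
    return @lines;
}

# 'echo ...' is a stand-in for 'pdftohtml ...'.
print "read: $_" for read_command_output('echo first line');
```

So neither system nor exec is needed: open itself launches the program.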

      Thanks a lot, going to try it :-)

      Carl

        thanks a lot!

        This works perfectly:

        open my $fh, "pdftohtml -noframes -stdout $pdf |" or die "Can't run pdftohtml: $!";
        while (<$fh>) {
            &func($_);
        }
        close($fh);
        The -stdout option is needed to capture the output (mentioned in case someone else comes searching).
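
One caveat with the snippet above: $pdf is interpolated into a shell command line, so a filename containing spaces or shell metacharacters could break it. A possible variant, assuming Perl 5.8 or later on a Unix-like system, is the list form of a piped open, which passes the arguments to the program without involving the shell. The Perl interpreter ($^X) stands in for pdftohtml so the sketch is runnable, and each_output_line is a hypothetical name:

```perl
use strict;
use warnings;

# Call $callback on each line that the command prints to stdout.
# With the list form ('-|', @cmd) each argument reaches the program
# as-is, so no shell quoting is needed.
sub each_output_line {
    my ($callback, @cmd) = @_;
    open(my $fh, '-|', @cmd) or die "Can't run $cmd[0]: $!";
    while (my $line = <$fh>) {
        $callback->($line);
    }
    close($fh) or die "$cmd[0] failed, exit status $?";
    return;
}

# With pdftohtml this would be (options taken from the post above):
#   each_output_line(\&func, 'pdftohtml', '-noframes', '-stdout', $pdf);
my $odd_name = 'file with spaces.pdf';    # hypothetical filename
each_output_line(
    sub { print "line: $_[0]" },
    $^X, '-e', 'print "got [$ARGV[0]]\n"', $odd_name,
);
```

Checking the return value of close also reports a non-zero exit status from the command, which the original snippet ignores.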

        Have a nice Sunday
        Carl