Thank you for your precious time, Morgon.
I will look into it,
but my people say it's possible,
and they have done it.

Re^3: How to Extract PDF tables using Perl
by hippo (Archbishop) on May 11, 2016 at 10:20 UTC
    But my people say it's possible. And they have done it.

    Ask them how they did it and then do it that way. Problem solved.

      And then wrap that "how" up into a CPAN module. :-)

      --MidLifeXis

        Possibly, if there were a Perl module to replace pdftohtml -xml, i.e. to get character clusters by position.

        Then one could try to combine histograms of word positions with further user hints, such as the font or the area of the page.
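
A minimal sketch of that histogram idea, assuming the page has already been converted with pdftohtml -xml, which writes one <text> element per text run with top and left attributes. The sample XML, the bucket width $bin and the threshold $min_hits are made-up illustration values, not anything taken from a real document, and a regex stands in for a proper XML parser to keep the example self-contained.

```perl
use strict;
use warnings;

# Hypothetical sample in the shape pdftohtml -xml emits (values invented).
my $xml = <<'XML';
<page number="1" position="absolute" top="0" left="0" height="1188" width="918">
<text top="100" left="50" width="40" height="12" font="0"><b>Name</b></text>
<text top="100" left="200" width="30" height="12" font="0"><b>Qty</b></text>
<text top="100" left="300" width="40" height="12" font="0"><b>Price</b></text>
<text top="120" left="50" width="40" height="12" font="0">Apple</text>
<text top="120" left="200" width="10" height="12" font="0">3</text>
<text top="120" left="300" width="30" height="12" font="0">1.20</text>
<text top="140" left="50" width="40" height="12" font="0">Pear</text>
<text top="140" left="200" width="10" height="12" font="0">5</text>
<text top="140" left="300" width="30" height="12" font="0">0.80</text>
<text top="400" left="123" width="90" height="12" font="0">Page footer</text>
</page>
XML

# Collect every text run together with its position on the page.
my @runs;
while ($xml =~ m{<text\s+top="(\d+)"\s+left="(\d+)"[^>]*>(.*?)</text>}g) {
    my ($top, $left, $text) = ($1, $2, $3);
    $text =~ s/<[^>]+>//g;    # drop inline markup such as <b> or <i>
    push @runs, { top => $top, left => $left, text => $text };
}

# Histogram of left edges, rounded down to buckets of $bin units so that
# small jitter still lands in the same bucket (assumed width; tune it).
my $bin = 5;
my %hist;
for my $run (@runs) {
    $run->{bucket} = $bin * int($run->{left} / $bin);
    $hist{ $run->{bucket} }++;
}

# Buckets hit by several runs are treated as column starts.
my $min_hits = 2;             # assumed threshold; tune it per document
my @columns = sort { $a <=> $b } grep { $hist{$_} >= $min_hits } keys %hist;

my %col_index;
@col_index{@columns} = (0 .. $#columns);

# Group runs into rows by their top coordinate, skipping runs that do not
# sit on a detected column (here: the footer).
my %rows;
for my $run (@runs) {
    my $col = $col_index{ $run->{bucket} };
    next unless defined $col;
    $rows{ $run->{top} }[$col] = $run->{text};
}

# Emit rows top to bottom, filling missing cells with empty strings.
my @table;
for my $top (sort { $a <=> $b } keys %rows) {
    push @table, [ map { defined $rows{$top}[$_] ? $rows{$top}[$_] : '' } 0 .. $#columns ];
}

print join("\t", @$_), "\n" for @table;
```

This only looks at left edges, so right-aligned numeric columns would probably need the same histogram over left + width, and the font attribute could serve as one of the extra user hints to filter runs before counting.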

        I'm not sure whether CAM::PDF can be used to get word positions; the docs only mention "objects", whatever that means...

        Cheers Rolf
        (addicted to the Perl Programming Language and ☆☆☆☆ :)
        Je suis Charlie!

Re^3: How to Extract PDF tables using Perl
by perlPsycho (Initiate) on May 11, 2016 at 10:12 UTC
    Does anyone know whether
    there is a Perl module
    that can be used for extracting tables from a PDF,
    and
    how to do it?