Re^3: How to Extract PDF tables using Perl

But my people say it's possible. And they have done it.

Ask them how they did it and then do it that way. Problem solved.


Re^4: How to Extract PDF tables using Perl by MidLifeXis (Monsignor) on May 11, 2016 at 11:07 UTC
And then wrap that "how" up into a CPAN module. :-) --MidLifeXis
Re^5: How to Extract PDF tables using Perl by LanX (Saint) on May 11, 2016 at 11:33 UTC
Possibly, if there were a Perl module to replace `pdftohtml -xml`, i.e. one that returns character clusters together with their positions. One could then try to combine histograms of word positions with further user hints, such as the font or the area of the page. I'm not sure whether CAM::PDF can be used to get word positions; the docs only mention "objects", whatever that means...
Cheers Rolf
(addicted to the Perl Programming Language and ☆☆☆☆ :) Je suis Charlie!)
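
For illustration only, here is a rough sketch of the histogram idea, working on the `<text top=".." left=".." ...>` elements that `pdftohtml -xml` writes. Everything in it is an assumption rather than anyone's tested method: the sample XML is made up, the sub names (`parse_text_elements`, `column_starts`, `build_table`) are hypothetical, the regex stands in for a real XML parser, and rows are matched on exactly equal `top` values, which real PDFs will often violate.

```perl
use strict;
use warnings;

# Made-up sample in the shape of `pdftohtml -xml` output (not from a real PDF).
my $xml = <<'XML';
<page number="1" position="absolute" top="0" left="0" height="1188" width="918">
<text top="100" left="50" width="40" height="12" font="0"><b>Item</b></text>
<text top="100" left="200" width="40" height="12" font="0"><b>Qty</b></text>
<text top="120" left="50" width="40" height="12" font="1">Apple</text>
<text top="120" left="200" width="10" height="12" font="1">3</text>
<text top="140" left="50" width="40" height="12" font="1">Pear</text>
<text top="140" left="203" width="10" height="12" font="1">12</text>
</page>
XML

# Collect each text fragment with its position (regex shortcut, not a real XML parser).
sub parse_text_elements {
    my ($doc) = @_;
    my @items;
    while ($doc =~ m{<text\b([^>]*)>(.*?)</text>}gs) {
        my ($attrs, $content) = ($1, $2);
        my %attr = $attrs =~ /(\w+)="([^"]*)"/g;
        next unless defined $attr{top} && defined $attr{left};
        $content =~ s/<[^>]+>//g;    # drop inline markup such as <b>
        push @items, { top => $attr{top}, left => $attr{left}, text => $content };
    }
    return @items;
}

# Histogram of left edges: a position shared by several fragments is a likely column start.
sub column_starts {
    my ($items, $min_count) = @_;
    my %hist;
    $hist{ $_->{left} }++ for @$items;
    return sort { $a <=> $b } grep { $hist{$_} >= $min_count } keys %hist;
}

# Group fragments into rows by "top"; put each into the last column starting at or left of it.
sub build_table {
    my ($items, $cols) = @_;
    my %rows;
    for my $it (@$items) {
        my $idx = 0;
        for my $i (0 .. $#$cols) {
            $idx = $i if $cols->[$i] <= $it->{left};
        }
        $rows{ $it->{top} }[$idx] = $it->{text};
    }
    return map { [ map { $_ // '' } @{ $rows{$_} }[ 0 .. $#$cols ] ] }
           sort { $a <=> $b } keys %rows;
}

my @items = parse_text_elements($xml);
my @cols  = column_starts(\@items, 2);       # 2 = arbitrary threshold for this sample
my @table = build_table(\@items, \@cols);
print join("\t", @$_), "\n" for @table;
```

The "user hints" would come in as extra filters before the histogram step, e.g. keeping only fragments with a given `font` value or inside a given `top`/`left` rectangle. A more robust version would also bin nearby `top` and `left` values instead of requiring exact matches.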