in reply to Re: Extracting tables from PDF
in thread Extracting tables from PDF
Unfortunately, this is still pretty ugly... Tables do end up displaying properly in "complex document" mode, but that's just because it puts every element in a <div> and positions it with style=position:absolute. Whether it's in normal mode or complex document mode, there's nary a <table> tag in sight.
I also found a message in one of the project forums where the author tells someone else,

"There is no concept of tables in PDF. When you see a table in a PDF file, it's just a bunch of text positioned in particular places and a bunch of lines. There is no simple way to translate tables from PDF to HTML or anything else."

Granted, the post was from mid-2004, but, unless that's changed, this looks very not-promising.
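To make the "text positioned in particular places" point concrete, here is a minimal sketch of the kind of guesswork a converter would have to do to rebuild a table. It assumes you have already extracted text fragments as hypothetical (x, y, text) tuples from some PDF library; the function names, the `y_tolerance` parameter, and the sample coordinates are all made up for illustration and do not come from any particular tool.

```python
from html import escape


def group_into_rows(fragments, y_tolerance=2.0):
    """Cluster (x, y, text) fragments into rows by similar y, then order cells by x.

    Assumption: fragments whose y differs from the first fragment of the
    current row by at most y_tolerance belong to that row.
    """
    rows = []  # each entry: (y of the row's first fragment, [(x, text), ...])
    for x, y, text in sorted(fragments, key=lambda f: (f[1], f[0])):
        if rows and abs(y - rows[-1][0]) <= y_tolerance:
            rows[-1][1].append((x, text))
        else:
            rows.append((y, [(x, text)]))
    # Within each row, sort left to right and keep only the text.
    return [[text for _, text in sorted(cells)] for _, cells in rows]


def rows_to_html(rows):
    """Render the guessed rows as a bare HTML table."""
    body = "".join(
        "<tr>" + "".join(f"<td>{escape(cell)}</td>" for cell in row) + "</tr>"
        for row in rows
    )
    return f"<table>{body}</table>"


# Hypothetical fragments: two rows whose baselines are slightly misaligned.
fragments = [
    (10, 100.0, "Name"),
    (80, 100.5, "Qty"),
    (10, 120.0, "Widget"),
    (80, 119.8, "3"),
]
rows = group_into_rows(fragments)
print(rows_to_html(rows))
```

Even this toy version has to guess at a tolerance, and it ignores ruling lines, merged cells, wrapped cell text, and empty cells that shift the columns, which is presumably why nobody calls the problem simple.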