in reply to PDF to Text

What counts as "sufficiently inaccurate"?
I used pdftotext and it worked just fine for getting ASCII text out of many PDFs.
the hardest line to type correctly is: stty erase ^H

Re^2: PDF to Text
by chrism01 (Friar) on Jan 27, 2005 at 01:47 UTC
    It works on most lines, but occasionally gets confused and outputs data in a different layout from the original.
    I really need it to be accurate because even the original is fiddly to deal with.
    It's basically two sets of columns holding variable-sized blocks of data. The data wraps from the bottom of the left-hand column to the top of the right-hand column on each page, then to the top of the left-hand column on the next page, and so on.
    e.g. (short example):
    name1 1,2,3,4 8,9,10 name3 1,2,3,4,5 name 1,2,34, name4 1,2,3,4 5,6,7,
    But what I sometimes get is:
    name1 1,2,3,4 8,9,10 name3 1,2,3,4,5 1,2,3,4 name4 name 1,2,34, 5,6,7
    There's a lot more of this ... also, the separations between names, numbers, and the left and right columns are variable...
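
For what it's worth, the reading order described above (down the left column, then down the right column, then on to the next page) could be sketched roughly as below. This is only an illustration under assumptions not stated in the thread: that the text was extracted with `pdftotext -layout` so the two columns sit side by side on each line, that pages are separated by form feeds, and that a single character offset (the hypothetical `split_col` parameter) separates the columns. Since the separations are variable in the real data, a fixed offset may well not be enough.

```python
def reorder_two_column_text(text, split_col):
    """Return lines in reading order: left column, then right column, page by page.

    `split_col` is a hypothetical fixed character offset where the right-hand
    column starts; real data with variable spacing may need it detected per page.
    """
    ordered = []
    # Assumption: pages are separated by form-feed characters.
    for page in text.split("\f"):
        left, right = [], []
        for line in page.splitlines():
            left_part = line[:split_col].rstrip()
            right_part = line[split_col:].strip()
            if left_part:
                left.append(left_part)
            if right_part:
                right.append(right_part)
        # Left column is read top to bottom first, then the right column.
        ordered.extend(left + right)
    return ordered


if __name__ == "__main__":
    sample = (
        "name1 1,2,3".ljust(14) + "name3 1,2\n"
        + "name2 4,5".ljust(14) + "name4 6\n"
        + "\f"
        + "name5 7".ljust(14) + "name6 8\n"
    )
    for entry in reorder_two_column_text(sample, 14):
        print(entry)
```

If the column boundary moves from page to page, one option would be to look for a run of spaces shared by every line on the page and use that as the split point instead of a fixed number.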