PDF decoding in Perl
by Arik123 (Beadle)
on Mar 06, 2017 at 07:17 UTC
Arik123 has asked for the wisdom of the Perl Monks concerning the following question:
I have a PDF file which contains a filled form. Unfortunately the information (text-only) isn't plain ASCII. I need a Perl script to extract the information and process it, but I can't get anything except gibberish. I figured it's compressed somehow, so I used QPDF to make the file more human-readable.
Now there are multiple objects whose content is something like feff05e405e805d905d8002e002e002e
which seem to be the content of the fields, in some encoding. There are also some objects that look like:
while the /ToUnicode information refers to objects that look like:
I need some Perl script (or a module) that can make sense of all that (to me it looks like Turkish. Hint: I don't speak Turkish) and convert it to UTF-8 or some other encoding that makes sense.
Any help would be appreciated.
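For the hex value quoted above, here is a minimal sketch of one possible approach. It assumes the string is a PDF text string in UTF-16BE, which the leading feff (a byte order mark) suggests. The helper name decode_pdf_hex_string is hypothetical, and the Latin-1 fallback only stands in roughly for PDFDocEncoding. It uses nothing beyond the core Encode module, and it does not touch the /ToUnicode streams.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: turn the hex dump of a PDF text string into Perl characters.
sub decode_pdf_hex_string {
    my ($hex) = @_;
    $hex =~ s/\s+//g;                # tolerate whitespace inside the hex digits
    my $bytes = pack 'H*', $hex;     # hex digits -> raw bytes

    # A leading FE FF byte order mark marks the string as UTF-16BE.
    # Encode's 'UTF-16' decoder reads the BOM and strips it from the result.
    return decode('UTF-16', $bytes) if $bytes =~ /\A\xFE\xFF/;

    # Assumption: without a BOM, treat the bytes as Latin-1. Real PDFDocEncoding
    # differs from Latin-1 in a few code points.
    return decode('ISO-8859-1', $bytes);
}

my $text = decode_pdf_hex_string('feff05e405e805d905d8002e002e002e');

binmode STDOUT, ':encoding(UTF-8)';  # emit UTF-8 on output
print "$text\n";
print join(' ', map { sprintf 'U+%04X', ord } split //, $text), "\n";
```

On the sample value this yields the code points U+05E4 U+05E8 U+05D9 U+05D8 followed by three full stops (U+002E). Those four letters are in the Hebrew block, so the field content appears to be Hebrew rather than Turkish.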