Re: Re: Re: Reading PDF files

by waswas-fng (Curate)
on Jul 21, 2003 at 15:42 UTC ( [id://276323]=note )


in reply to Re: Re: Reading PDF files
in thread Reading PDF files

Depending on the application that created the PDF and the settings and fonts used, you may end up with a PDF that is just a sequence or block of font character bitmaps with no underlying text information. Most Adobe applications will embed a textual version in the PDF so that the text selection tool can be used to grab plain text from segments of the document. I think the point is that unless you can be certain how the PDFs are generated, and you are comfortable working with them, parsing data from them is going to be a large pain in the butt.

-Waswas
