PerlMonks  

differentiate pdf and ppt

by sarvan (Sexton)
on Aug 31, 2011 at 10:35 UTC ( [id://923394]=perlquestion )

sarvan has asked for the wisdom of the Perl Monks concerning the following question:

Hi all,

Is there any way to differentiate a PDF from a PPT by analyzing the content of the documents? I ask because it is possible for a PDF document to have PPT content, and for a PPT to have PDF content.

So I need to know which clues in the document can be used to tell them apart. In my work I will get a URL, and from that URL I need to determine whether the document and its content are PDF or PPT.

I need help with this, please. Thanks.

Replies are listed 'Best First'.
Re: differentiate pdf and ppt
by Anonymous Monk on Aug 31, 2011 at 10:45 UTC
Re: differentiate pdf and ppt
by bart (Canon) on Aug 31, 2011 at 11:37 UTC
    A PDF file always starts with "%PDF" followed by the version number, so you only have to read the first few bytes of the file. I think that is all you need.
      Hi bart, can you please tell me how I can do that?
        Sheesh, I would think that even a beginner in Perl should be able to tackle this. The tasks you have to do:
        1. read the first few bytes of the file (at least 4 bytes) into a string
        2. see if the string starts with '%PDF'
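The two steps above could be sketched in Perl roughly as follows. This is only a minimal illustration, not code posted in the thread: the helper names `read_head`, `looks_like_pdf`, and `looks_like_ole2` are made up for the example. The OLE2 check is an addition beyond what was suggested above. It relies on the fact that legacy binary .ppt files are OLE2 compound documents, and since .doc and .xls share the same signature, a match only means "possibly PPT".

```perl
use strict;
use warnings;

# Step 1: read the first few bytes of the file into a string.
# read_head() is a hypothetical helper name used only in this sketch.
sub read_head {
    my ($path, $n) = @_;
    open my $fh, '<:raw', $path or die "Cannot open $path: $!";
    my $buf = '';
    my $got = read($fh, $buf, $n);
    die "Cannot read $path: $!" unless defined $got;
    close $fh;
    return $buf;
}

# Step 2: see if the string starts with '%PDF'.
sub looks_like_pdf {
    my ($head) = @_;
    return $head =~ /\A%PDF/ ? 1 : 0;
}

# Assumption beyond the thread: legacy .ppt files are OLE2 compound
# documents, which begin with this 8-byte signature. Other legacy Office
# formats share it, so a match does not prove the file is a PPT.
sub looks_like_ole2 {
    my ($head) = @_;
    my $sig = "\xD0\xCF\x11\xE0\xA1\xB1\x1A\xE1";
    return index($head, $sig) == 0 ? 1 : 0;
}

# Optional command-line use: perl sniff.pl some_file
my $path = shift @ARGV;
if (defined $path) {
    my $head = read_head($path, 8);
    if    (looks_like_pdf($head))  { print "PDF\n" }
    elsif (looks_like_ole2($head)) { print "OLE2 container (possibly PPT)\n" }
    else                           { print "unknown\n" }
}
```

Because the check looks only at the leading bytes, it does not depend on the file extension or on whatever the URL claims the document is. The document still has to be downloaded (or at least its first few bytes) before the check can run.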
