
differentiate pdf and ppt

by sarvan (Sexton)
on Aug 31, 2011 at 10:35 UTC (#923394)

sarvan has asked for the wisdom of the Perl Monks concerning the following question:

Hi all,

Is there any way to differentiate a PDF from a PPT by analyzing the content of the document? I ask because it is possible for a PDF document to have PPT content, and for a PPT to have PDF content.

So I need to know what clues in the document can be used to tell them apart. In my work I will get a URL, and from that URL I need to make sure whether the document and its content are PDF or PPT.

I need help with this, please. Thanks.

Re: differentiate pdf and ppt
by Anonymous Monk on Aug 31, 2011 at 10:45 UTC
Re: differentiate pdf and ppt
by bart (Canon) on Aug 31, 2011 at 11:37 UTC
    If you read the first few bytes of a PDF file, it always starts with "%PDF" and the version number. I think that is all you need.
      Hi bart, can you please tell me how I can do that?
        Sheesh, I would think that even a beginner in Perl should be able to tackle this. The tasks you have to do:
        1. read the first few bytes of the file (at least 4 bytes) into a string
        2. see if the string starts with '%PDF'
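
The two steps above can be sketched roughly as follows. This is a minimal illustration, not code from the thread: `read_magic` and `guess_type` are hypothetical helper names. The second check is an addition the replies do not mention: legacy binary .ppt files are OLE2 compound files, which begin with the bytes D0 CF 11 E0 A1 B1 1A E1. Old .doc and .xls files share that signature, so a match only means "an OLE2 container, possibly a PPT", not "definitely a PPT".

```perl
use strict;
use warnings;

# Hypothetical helper: return the first $n bytes of a file
# (fewer if the file is shorter than $n bytes).
sub read_magic {
    my ($path, $n) = @_;
    open my $fh, '<:raw', $path or die "Cannot open $path: $!";
    my $got = read($fh, my $buf, $n);
    die "Cannot read $path: $!" unless defined $got;
    close $fh;
    return $buf;
}

# Hypothetical helper: classify a byte string by its leading signature.
sub guess_type {
    my ($head) = @_;

    # Step 2 from the reply: does the string start with '%PDF'?
    return 'pdf' if $head =~ /\A%PDF/;

    # Assumption beyond the thread: OLE2 compound file signature,
    # used by legacy .ppt but also by old .doc and .xls files.
    return 'ole2' if $head =~ /\A\xD0\xCF\x11\xE0\xA1\xB1\x1A\xE1/;

    return 'unknown';
}

# Usage: perl sniff.pl some_file
my $path = shift @ARGV;
print guess_type(read_magic($path, 8)), "\n" if defined $path;
```

If the document comes from a URL, the same `guess_type` check can be applied to the first bytes of the downloaded content instead of a local file.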
