in reply to pdf and ppt to text

I would try to have PowerPoint export the PPT files as PDF and then process only the PDFs.

You haven't specified what your "expected results" are, so I presume you need not only the text but also positional information.

So please see the thread "Parsing PDFs by text position?" and the older threads referenced there for various approaches.
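
To illustrate what "positional information" buys you, here is a minimal Python sketch, not taken from any of those threads. It assumes that some PDF extraction tool has already given you text fragments as (x, y, text) tuples in PDF coordinates (origin at the bottom left, so larger y means higher on the page). The function name group_into_lines and the y_tolerance parameter are hypothetical, chosen only for this example.

```python
def group_into_lines(fragments, y_tolerance=2.0):
    """Group (x, y, text) fragments into lines of text.

    Assumption: coordinates follow the PDF convention (y grows upward),
    so fragments are visited from the top of the page to the bottom.
    Fragments whose y differs from the first fragment of the current
    line by at most y_tolerance are treated as the same line.
    """
    lines = []  # each entry: (y of first fragment, [(x, text), ...])
    for x, y, text in sorted(fragments, key=lambda f: (-f[1], f[0])):
        if lines and abs(lines[-1][0] - y) <= y_tolerance:
            lines[-1][1].append((x, text))
        else:
            lines.append((y, [(x, text)]))
    # Within each line, order fragments left to right and join them.
    return [" ".join(t for _, t in sorted(items)) for _, items in lines]


# Made-up sample fragments, deliberately out of reading order.
sample = [
    (130, 680, "line"),
    (120, 699.5, "world"),
    (72, 680, "Second"),
    (72, 700, "Hello"),
]
print(group_into_lines(sample))  # ['Hello world', 'Second line']
```

Whether a fixed tolerance is good enough depends on the documents. Slides with multiple columns or text boxes usually need an additional grouping step on x.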

Cheers Rolf