in reply to Re^2: Win32 and OCR via OLE
in thread Win32 and OCR via OLE

OCR will NOT help with .doc or .xls. Neither is an image, and an image is what Optical Character Recognition takes as input.

Please read the replies Corion and Marto have already posted, and see Marto's for .pdf.

And if you're thinking about "pull(ing)...from (unknown) binary formats or image text", you'd better start thinking about how to deal with malware.
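
To make the point about .doc and .xls concrete, here is a rough sketch (in Python, not Perl, and not anything posted earlier in this thread) of checking a file's leading bytes before deciding whether OCR even applies. The function names `sniff`, `is_ocr_candidate` and `sniff_file` are made up for illustration, and the signature table is deliberately short. Legacy .doc and .xls files are OLE2 compound documents, so they land in the "not an image" bucket. Note that this only identifies the container; it does nothing to tell you whether the content is safe to open.

```python
# Leading-byte signatures for a few common formats (illustrative, not exhaustive).
SIGNATURES = [
    (b"\xD0\xCF\x11\xE0\xA1\xB1\x1A\xE1", "ole2"),  # legacy .doc / .xls container
    (b"%PDF-", "pdf"),
    (b"\x89PNG\r\n\x1a\n", "png"),
    (b"\xFF\xD8\xFF", "jpeg"),
    (b"II*\x00", "tiff"),   # little-endian TIFF
    (b"MM\x00*", "tiff"),   # big-endian TIFF
]

# Only raster images are sensible input for OCR in this sketch.
IMAGE_KINDS = {"png", "jpeg", "tiff"}


def sniff(header: bytes) -> str:
    """Return a format label based on the first bytes, or 'unknown'."""
    for magic, kind in SIGNATURES:
        if header.startswith(magic):
            return kind
    return "unknown"


def is_ocr_candidate(header: bytes) -> bool:
    """True only if the leading bytes look like one of the image formats above."""
    return sniff(header) in IMAGE_KINDS


def sniff_file(path: str) -> str:
    """Read the first 8 bytes of a file (hypothetical helper) and classify them."""
    with open(path, "rb") as fh:
        return sniff(fh.read(8))
```

Anything that comes back as "ole2" or "pdf" needs a format-specific reader rather than OCR, and anything that comes back as "unknown" should be treated with suspicion rather than fed to a parser.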