in reply to Distinguishing text from binary data
You could look at the Perl implementation of the Unix file utility. It is (or was) part of the PPT (Perl Power Tools) project.
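
If you only need a quick guess rather than full file-type detection, a rough sketch of a common heuristic follows. It is not taken from the PPT source; it is loosely modeled on the idea behind Perl's -T/-B file tests (a NUL byte, or too many odd bytes in the first block, means binary). The names looks_like_text and is_text_file, the 512-byte block size, and the 30% threshold are all my own assumptions for illustration.

```python
# Hypothetical helpers for illustration; not the PPT "file" implementation.

# Bytes treated as ordinary text: common control characters
# (bell, backspace, tab, newline, form feed, carriage return, escape)
# plus printable ASCII.
TEXT_BYTES = bytes([7, 8, 9, 10, 12, 13, 27]) + bytes(range(0x20, 0x7F))


def looks_like_text(data: bytes, max_odd_ratio: float = 0.30) -> bool:
    """Guess whether a block of bytes is text rather than binary."""
    if not data:
        return True  # assumption: an empty block counts as text
    if b"\x00" in data:
        return False  # a NUL byte is a strong hint of binary data
    # Delete every "text" byte; whatever remains is an odd byte.
    odd = data.translate(None, TEXT_BYTES)
    return len(odd) / len(data) <= max_odd_ratio


def is_text_file(path: str, block_size: int = 512) -> bool:
    """Apply the heuristic to the first block of a file."""
    with open(path, "rb") as handle:
        return looks_like_text(handle.read(block_size))
```

Because it counts every byte with the high bit set as odd, this will misjudge UTF-16 and text that is mostly non-ASCII, so treat it as a starting point only.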