in reply to How to recognize Word and XLS files

Office Binary File Formats describes the file formats of Word, Excel and PowerPoint. You can always open the file to be tested in binary mode and check whether its structure matches one of these (or other MS) formats.
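As a rough first test, here is a minimal Python sketch of the "open it in binary mode" idea. It relies on one fact that is not in the post above: the legacy .doc, .xls and .ppt formats are all stored as OLE compound files, which begin with the eight bytes D0 CF 11 E0 A1 B1 1A E1. The function name looks_like_compound_file is made up for illustration. A match only says "some compound file", not which Office application wrote it.

```python
# Signature at the start of every OLE compound file (legacy .doc, .xls, .ppt).
# Assumption beyond the original post: the file under test is in one of the
# legacy binary formats, not the newer zip-based .docx/.xlsx.
CFB_SIGNATURE = bytes.fromhex("D0CF11E0A1B11AE1")


def looks_like_compound_file(path):
    """Return True if the file starts with the compound file signature."""
    with open(path, "rb") as handle:  # binary mode, no newline translation
        return handle.read(len(CFB_SIGNATURE)) == CFB_SIGNATURE
```

Telling Word from Excel still requires looking further inside the file, so treat this as a cheap pre-filter only.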

The key is in the "File Information Block".

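For Word files, there is an identification structure (FibBase) in the first 32 bytes of the File Information Block. If its first two-byte field, wIdent, has the value 0xA5EC, then the file is a Word file. The value is stored little-endian, so the bytes as they appear on disk are EC A5.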
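The identifier test could be sketched in Python like this. It is only an illustration: has_word_ident and its offset parameter are hypothetical names, and the caller has to supply the position where the File Information Block is believed to start (the default of 0 is a guess, not a documented location).

```python
import struct

WORD_IDENT = 0xA5EC   # identifier value at the start of FibBase
FIB_BASE_SIZE = 32    # size of the FibBase structure in bytes


def has_word_ident(data, offset=0):
    """Return True if a 32-byte FibBase starting at `offset` begins with 0xA5EC.

    `offset` is an assumption supplied by the caller: it must point at the
    start of the File Information Block.
    """
    chunk = data[offset:offset + FIB_BASE_SIZE]
    if len(chunk) < FIB_BASE_SIZE:
        return False  # not enough bytes left to hold a FibBase
    # "<H" reads one unsigned 16-bit integer in little-endian byte order.
    (ident,) = struct.unpack_from("<H", chunk, 0)
    return ident == WORD_IDENT
```

Note the byte order: a buffer starting with EC A5 passes the test, while one starting with A5 EC does not.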

Now the only problem is knowing where this File Information Block starts. It may be at the very beginning of the file, but I am not sure ...

CountZero

"A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little or too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James

My blog: Imperial Deltronics