vivapl has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks...

Is there a way of detecting whether a file is of binary type before opening it?
My script receives a text file with a list of files that need to be processed, however sometimes they're binary.

Any hints would be greatly appreciated.

Thanks...

OK, to be more specific, a separate process collects logs from computerized machines.
The logs are in text format, however every so often the logs contain 'garbage' (this is a bug in the machine's OS).
When I use the conventional 'grep' to search such a log for a string, grep reports it as a binary file.

Since my script is part of an automated process, I need a way to discard logs that contain such garbage.

Replies are listed 'Best First'.
Re: Some binary wisdom needed
by Roy Johnson (Monsignor) on May 28, 2004 at 15:05 UTC
    You might want to look at the file test operators -T and -B. They have to open the files to look at them, of course.
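
A minimal sketch of how the -T test might be applied to a list of file names. The helper name text_files_only and the file name list.txt are made up for illustration; they are not from the original post.

```perl
use strict;
use warnings;

# Return the names that exist as plain files and pass Perl's -T (text)
# heuristic. Perl opens each file and reads its beginning to decide.
# Note: an empty file passes both -T and -B.
sub text_files_only {
    my @files = @_;
    return grep { -f $_ && -T $_ } @files;
}

# Hypothetical usage: list.txt holds one file name per line.
# open my $list, '<', 'list.txt' or die "Cannot open list.txt: $!";
# chomp(my @names = <$list>);
# print "$_\n" for text_files_only(@names);
```

Names that do not exist, or that -T does not consider text, are silently dropped here; a real script would probably want to log them.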

    The PerlMonk tr/// Advocate
Re: Some binary wisdom needed
by calin (Deacon) on May 28, 2004 at 15:15 UTC

    Quick answer: see the -T and -B file tests. Read the documentation (quick quote in one of my previous posts: Re: How reliable is -T as a test for ASCII files?) to see how they work.

    "before opening" is really "before explicit opening by you". Any tests need to open the file to examine its contents.

    Update: Roy Johnson above is a quicker typist than me :)
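
To illustrate why any such test has to open the file: the documentation describes -T and -B as examining the first block or so of the file, treating a zero byte as binary and otherwise judging by the share of odd characters. The sketch below is only a rough approximation of that idea, not Perl's actual implementation (it ignores the UTF-8 handling, for one). The name looks_binary, the 512-byte sample size, and the exact set of "ordinary" bytes are assumptions.

```perl
use strict;
use warnings;

# Rough approximation of the -T/-B idea, NOT Perl's real implementation:
# read only the first block, treat any NUL byte as binary, otherwise call
# the file binary when more than a third of the sampled bytes look odd.
sub looks_binary {
    my ($path, $block_size) = @_;
    $block_size ||= 512;    # assumed sample size for this sketch

    open my $fh, '<:raw', $path or die "Cannot open $path: $!";
    my $got = read($fh, my $buf, $block_size);
    die "Cannot read $path: $!" unless defined $got;
    close $fh;

    return 0 if $got == 0;           # empty file: nothing suspicious
    return 1 if $buf =~ tr/\0//;     # a NUL byte settles it

    # Count bytes that are neither printable ASCII nor tab/LF/CR/FF.
    my $odd = ($buf =~ tr/\x20-\x7E\t\n\r\f//c);
    return $odd / $got > 1/3 ? 1 : 0;
}
```

Because only the start of the file is sampled, garbage that first appears deeper into a log would go unnoticed by a check like this.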

Re: Some binary wisdom needed
by BrowserUk (Patriarch) on May 28, 2004 at 14:38 UTC
    ... however sometimes they're binary.

    How do you know? That is to say, what is the difference between two files that causes you to conclude one is binary and the other not?

    It probably sounds like a trick question, but it isn't. Giving you a good answer depends on your explaining this.


    Examine what is said, not who speaks.
    "Efficiency is intelligent laziness." -David Dunham
    "Think for yourself!" - Abigail
Re: Some binary wisdom needed
by Anonymous Monk on May 28, 2004 at 14:41 UTC
    Is there a way of detecting whether a file is of binary type before opening it?

    Before opening, eh? Sure!

    1. Remove your shoes
    2. Dip a toe into the file
    3. Look at your toe. If you see a bunch of Zeroes and Ones, then it's binary

    HTH

Re: Some binary wisdom needed
by EdwardG (Vicar) on May 28, 2004 at 14:56 UTC

    All files are binary. What makes the difference is your assumptions about them, and how you treat them.

    I'm guessing you have some code that fails when you encounter unexpected bytes in your file, perhaps outside the range [0-9A-Za-z] plus the usual ensemble of CR, LF, TAB, and punctuation symbols.

    If you can answer BrowserUk's question - and I encourage you to give it some serious thought - then I also think you will be very close to answering your own question.

    I would also note that while you might be able to guess that a file contains undesirable characters (your 'binary' files), you will not know for sure until you open it and look inside.
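
One way to "open it and look inside" is to scan the whole file for bytes outside an expected set. This is only a sketch: the name first_garbage_line is made up, and the allowed set (printable ASCII plus tab, CR and LF) is an assumption about what a clean log contains.

```perl
use strict;
use warnings;

# Return the number of the first line containing a byte outside the allowed
# set (printable ASCII, tab, CR, LF), or 0 if every line is clean.
# The allowed set is an assumption; widen it if the logs use other bytes.
sub first_garbage_line {
    my ($path) = @_;
    open my $fh, '<:raw', $path or die "Cannot open $path: $!";
    while (my $line = <$fh>) {
        if ($line =~ /[^\x20-\x7E\t\r\n]/) {
            my $n = $.;    # current line number of $fh
            close $fh;
            return $n;
        }
    }
    close $fh;
    return 0;
}
```

Unlike a test that samples only the start of the file, this reads every line, so it costs more on large logs but also catches garbage near the end.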

     

Re: Some binary wisdom needed
by vivapl (Acolyte) on May 28, 2004 at 16:27 UTC
    Thanks, the -B test works great. My problem is gone.
      Really? -B opens the file