in reply to find junk file

If by "junk file" you mean "a file consisting of random characters", then you could use a spelling checker and run the file through that. Then, if it finds, say, more than 50 errors, you classify it as a "junk file".
Is that what you mean?
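
A rough sketch of that idea, in case it helps. The tiny %dictionary is a hypothetical stand-in for a real spelling checker's word list, and count_misspellings and looks_like_junk are names made up for this illustration:

```perl
use strict;
use warnings;

# Hypothetical stand-in for a real spelling checker's word list.
my %dictionary = map { $_ => 1 } qw(the quick brown fox jumps over lazy dog);

# Count the whitespace-separated tokens that are not known words.
# Tokens with no letters at all (e.g. "123") also count as errors.
sub count_misspellings {
    my ($text) = @_;
    my $errors = 0;
    for my $token ( split ' ', $text ) {
        ( my $word = lc $token ) =~ s/[^a-z]//g;    # drop punctuation and digits
        $errors++ unless $dictionary{$word};
    }
    return $errors;
}

# A text is "junk" if it has more than $threshold unknown words.
sub looks_like_junk {
    my ( $text, $threshold ) = @_;
    return count_misspellings($text) > $threshold ? 1 : 0;
}
```

With the threshold from above you would call looks_like_junk($contents, 50). The right threshold presumably depends on the file size, so treat 50 as a starting point only.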

~Thomas~
confess( "I offer no guarantees on my code." );

Re^2: find junk file
by shan_emails (Beadle) on Jun 07, 2012 at 07:12 UTC

    If a file contains anything other than keyboard characters, we call it a junk file.

    For example, if we rename a zip file or a Microsoft Excel/PowerPoint file to .txt and then open that .txt file, we see a lot of garbage. We consider such a file a junk file.

      Maybe you want to determine whether a file is a "text file" as opposed to a "binary file"? See -X for the -B and -T operators. Also note that UTF-8 encoded "text" files may look like "binary" files, depending on what kind of letters are on your keyboard. Also see http://www.daskeyboard.com/
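
      A minimal sketch of the -T / -B check, assuming all you need is a rough text-versus-binary guess. classify_file is a name made up for this example. Both operators are heuristics that sample the start of the file, so the answer is a guess, not a guarantee:

      ```perl
      use strict;
      use warnings;

      # Classify a path with Perl's built-in file tests.
      sub classify_file {
          my ($path) = @_;
          return 'missing'      unless -e $path;
          return 'not a file'   unless -f $path;
          return 'empty'        if -z $path;    # an empty file passes both -T and -B
          return 'text'         if -T $path;
          return 'binary';
      }
      ```

      Usage would be something like: print "$_: ", classify_file($_), "\n" for @ARGV;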

      Oh, well, in that case, it's quite simple:

      use feature 'say';
      use constant {
          HIGHEST_CHAR_ON_KBD => 126,  # These values may differ for you, depending on where you bought
          LOWEST_CHAR_ON_KBD  => 9,    # your keyboard. There are some extra, non-keyboard chars in this range, as well.
      };

      # FILE is assumed to be a filehandle that is already open for reading.
      LINE: while ( my $line = <FILE> ) {
          foreach my $char ( split //, $line ) {
              if ( ord($char) > HIGHEST_CHAR_ON_KBD || ord($char) < LOWEST_CHAR_ON_KBD ) {
                  say "It's a binary file";
                  last LINE;    # stop at the first out-of-range character
              }
          }
      }

      It isn't the best way of doing things, but it's a start.
      Update: I completely forgot about spaces, tabs, carriage returns, and line feeds.
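
      The same range check can also be written as a single regex over the raw bytes. This is only a sketch of one alternative, and has_non_keyboard_bytes and is_binary_file are names invented for it. It keeps the 9..126 range from above, which covers tab, line feed, carriage return, and space:

      ```perl
      use strict;
      use warnings;

      # True if $bytes contains anything outside tab (9) .. tilde (126).
      # This range still admits a few control characters (11, 12, 14-31).
      sub has_non_keyboard_bytes {
          my ($bytes) = @_;
          return $bytes =~ /[^\x09-\x7E]/ ? 1 : 0;
      }

      # Read a file in raw mode and apply the check to its whole contents.
      sub is_binary_file {
          my ($path) = @_;
          open my $fh, '<:raw', $path or die "Cannot open $path: $!";
          local $/;    # slurp mode: read everything in one go
          my $bytes = <$fh>;
          close $fh;
          return has_non_keyboard_bytes( $bytes // '' );
      }
      ```

      If you want to be stricter, a class such as [^\x09\x0A\x0D\x20-\x7E] would allow only tab, LF, CR, and printable ASCII. Either way, UTF-8 text with non-ASCII letters will be reported as binary.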

      ~Thomas~
      confess( "I offer no guarantees on my code." );

        thomas895:

        So a file is a text file unless someone uses a space?

            ...or a tab?

            ...or a carriage return?

            ...or a newline, escape sequence, ....?

        ...roboticus

        When your only tool is a hammer, all problems look like your thumb.