harangzsolt33, further to ++Fletch's spot-on reply, just a bit more detail, in case you find it useful.
From perldoc -X file test, note that the -T/-B file test operators (which tell whether or not a file is an ASCII or UTF-8 text file) work via a heuristic guess, as follows:
The first block or so of the file is examined to see if it is valid UTF-8 that includes non-ASCII characters. If so, it's a -T file. Otherwise, that same portion of the file is examined for odd characters such as strange control codes or characters with the high bit set. If more than a third of the characters are strange, it's a -B file; otherwise it's a -T file.
Also, any file containing a zero byte in the examined portion is considered a binary file. (If executed within the scope of a use locale which includes LC_CTYPE, odd characters are anything that isn't a printable nor space in the current locale.) If -T or -B is used on a filehandle, the current IO buffer is examined rather than the first block. Both -T and -B return true on an empty file, or a file at EOF when testing a filehandle. Because you have to read a file to do the -T test, on most occasions you want to use a -f against the file first, as in: next unless -f $file && -T $file.
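For what it's worth, here is a minimal sketch that exercises the documented behaviour on three throwaway files. The file names and the classify helper are made up for illustration. Only -f, -T, -B and File::Temp come from Perl itself.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Scratch directory, removed automatically at program exit.
my $dir = tempdir( CLEANUP => 1 );

# Hypothetical sample files: plain ASCII, text with an embedded zero byte, empty.
my %samples = (
    'plain.txt' => "Just some ordinary ASCII text.\n" x 10,
    'nul.bin'   => "looks like text" . "\0" . "but has a zero byte\n",
    'empty.dat' => '',
);
for my $name ( keys %samples ) {
    open my $fh, '>:raw', "$dir/$name" or die "open $dir/$name: $!";
    print {$fh} $samples{$name};
    close $fh or die "close $dir/$name: $!";
}

# Hypothetical helper: test -f first, since -T/-B have to read the file.
sub classify {
    my ($path) = @_;
    return 'not a plain file' unless -f $path;
    return 'empty' if -T $path && -B $path;    # both are true on an empty file
    return -T $path ? 'text' : 'binary';
}

# $dir itself is included to show the -f guard rejecting a directory.
for my $path ( $dir, map { "$dir/$_" } sort keys %samples ) {
    printf "%-12s %s\n", classify($path), $path;
}
```

Going by the documentation quoted above, plain.txt should be reported as text, nul.bin as binary (because of the zero byte), empty.dat as empty, and the directory as not a plain file. Remember that this is only a guess made from the first block or so of each file.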
👁️🍾👍🦟