... the _utf8_on() described in 'perldoc Encode' doesn't seem to exist.
It exists, but it is not exported by default; either name it in the import list on the "use Encode" line, or else qualify the call with the package name:
Encode::_utf8_on( $string ); # sets utf8 flag on $string
Likewise for the "is_utf8( $string )" function.
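For what it's worth, here is a minimal sketch of the import-list approach. The sample string is my own assumption (the two utf8 bytes for U+00E9), picked only to make the effect of the flag visible:

```perl
use strict;
use warnings;
use Encode qw(is_utf8 _utf8_on);    # neither is exported by default

# Assumed sample data: the raw utf8 bytes for U+00E9, as read from a file.
my $string = "\xC3\xA9";

my $flag_before = is_utf8($string) ? 1 : 0;    # flag off: perl sees two bytes
my $len_before  = length $string;

_utf8_on($string);    # only flips the flag; the bytes are not validated

my $flag_after = is_utf8($string) ? 1 : 0;     # flag on: perl sees one character
my $len_after  = length $string;

print "flag: $flag_before -> $flag_after, length: $len_before -> $len_after\n";
```

Since _utf8_on() does no checking at all, Encode::decode( 'UTF-8', $bytes ) is usually the safer choice when you can't be sure the data really is valid utf8.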
As mentioned previously, ASCII is a proper subset of utf8; a key feature of utf8's design is that every plain ascii text file is, by definition, a working utf8 file.
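A quick way to convince yourself of the subset property is a strict decode/encode round trip. This is only a sketch; the sample text is made up:

```perl
use strict;
use warnings;
use Encode qw(decode encode);

my $ascii_bytes = "plain ascii text\n";    # assumed sample data

# decode() with a CHECK argument may modify its input, so hand it a copy.
# FB_CROAK makes it die on malformed utf8 instead of quietly substituting.
my $copy  = $ascii_bytes;
my $chars = decode('UTF-8', $copy, Encode::FB_CROAK);

# Encoding back to utf8 should give exactly the bytes we started with.
my $round_trip = encode('UTF-8', $chars);

my $unchanged = ($chars eq $ascii_bytes && $round_trip eq $ascii_bytes) ? 1 : 0;
print "ascii unchanged by utf8 decode/encode: $unchanged\n";
```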
Putting a BOM at the start of ASCII data is silly, but if a text file really does contain wide (non-ascii) unicode characters, each of which takes 2 to 4 bytes in utf8, an initial BOM can be a handy signature that gives users or apps a "heads up" about what the file contains. (It'll show up as the three-byte sequence "0xEF 0xBB 0xBF", which is the utf8 rendering of the unicode code point U+FEFF.) Still, it is technically unnecessary for utf8 in any case, since utf8 is a byte stream with no byte-order ambiguity to resolve.
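If you need to cope with files that may or may not start with a BOM, something along these lines should do. Note that strip_utf8_bom() is a name I made up for this sketch, not an Encode function, and it assumes you are working on raw bytes that have not been decoded yet:

```perl
use strict;
use warnings;
use Encode qw(encode);

# The utf8 rendering of U+FEFF; this comes out as the bytes EF BB BF.
my $bom = encode('UTF-8', "\x{FEFF}");

# Hypothetical helper: returns the data minus any leading utf8 BOM,
# plus a flag saying whether a BOM was found.
sub strip_utf8_bom {
    my ($bytes) = @_;
    my $had_bom = ($bytes =~ s/\A\Q$bom\E//) ? 1 : 0;
    return ($bytes, $had_bom);
}

my ($text, $had_bom)  = strip_utf8_bom("\xEF\xBB\xBFhello\n");
my ($plain, $no_bom)  = strip_utf8_bom("hello\n");

print "with BOM: found=$had_bom, bytes left=", length($text), "\n";
print "without BOM: found=$no_bom, bytes left=", length($plain), "\n";
```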