For some reason, M$ apps seem to have adopted a file-initial BOM character as the signal that a "plain-text" data file contains utf8-encoded unicode characters. If a file contains utf8 wide characters without the initial BOM, apps like wordpad will misinterpret them as something else. And maybe "visual studio" is insisting that a file be "marked as containing utf8" even when it doesn't need to include any wide characters. In that case, try writing the BOM yourself as the first character of the file:

    open( OUT, ">:utf8", "a.txt" ) or die "a.txt: $!";
    print OUT "\x{feff}aaaa\n";
    close OUT;
(Of course, the BOM was originally intended to be of use only in UTF16-encoded unicode data files, to indicate the "endian-ness" (byte-order) of the 16-bit data, and it shouldn't really be needed at all in a utf8-encoded file, because utf8 is not affected by big-endian vs. little-endian byte-order. But a number of applications -- particularly M$ apps that are able to handle plain-text files along with their rogues-gallery of "application-specific file formats" -- have inexplicably come to depend on a utf8-encoded BOM at the start of the file, acting like a sort of "magic number" to let them know that they are looking at a utf8-encoded file.)
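To see the "magic number" for yourself, here is a minimal sketch that writes a file with a leading U+FEFF, dumps the first three raw bytes, and reads the file back with the BOM stripped. The helper names (write_with_bom, leading_bytes_hex, read_without_bom) and the file name bom_demo.txt are made up for illustration; only core Perl I/O layers are assumed.

```perl
use strict;
use warnings;

# Write $text to $path as UTF-8, preceded by U+FEFF (the BOM).
# Helper name is hypothetical, not part of any module.
sub write_with_bom {
    my ($path, $text) = @_;
    open(my $out, '>:encoding(UTF-8)', $path) or die "$path: $!";
    print {$out} "\x{FEFF}", $text;
    close($out) or die "$path: $!";
}

# Return the first three raw bytes of $path as a hex string.
sub leading_bytes_hex {
    my ($path) = @_;
    open(my $in, '<:raw', $path) or die "$path: $!";
    read($in, my $buf, 3);
    close($in);
    return join ' ', map { sprintf('%02X', ord($_)) } split //, $buf;
}

# Read $path as UTF-8 and drop a leading BOM if one is present.
sub read_without_bom {
    my ($path) = @_;
    open(my $in, '<:encoding(UTF-8)', $path) or die "$path: $!";
    local $/;                     # slurp the whole file
    my $text = <$in>;
    close($in);
    $text =~ s/\A\x{FEFF}//;      # strip the BOM character, if any
    return $text;
}

my $file = 'bom_demo.txt';
write_with_bom($file, "aaaa\n");
print leading_bytes_hex($file), "\n";   # EF BB BF
print read_without_bom($file);          # aaaa
unlink $file;
```

Those three bytes (EF BB BF) are what the apps are sniffing for. Stripping the BOM on input is a design choice: Perl's UTF-8 layer passes it through as an ordinary character, so leaving it in can surprise code that compares or matches the first line.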
In reply to Re: How can I create an UTF8 encoded txt file contains strings like "aaaa"?
by graff
in thread How can I create an UTF8 encoded txt file contains strings like "aaaa"?
by sinbao