in reply to Re: Guess between UTF8 and Latin1/ISO-8859-1
in thread Guess between UTF8 and Latin1/ISO-8859-1

<off_topic>If it is that easy, how come my MS Internet Explorer miserably fails to automatically recognize the fact that some files are Unicode and I get all kinds of weird characters on my screen?</off_topic>

CountZero

"If you have four groups working on a compiler, you'll get a 4-pass compiler." - Conway's Law

Re: Re: Re: Guess between UTF8 and Latin1/ISO-8859-1
by iburrell (Chaplain) on Jan 21, 2004 at 23:26 UTC
    Probably because Microsoft drew the line at the insanity of examining the whole file to guess the character set, and only examines it for the content type. Not to mention the difficulty of trying to figure out the encoding automatically. There is a big difference between "this is invalid UTF-8, so it must be Latin1" and "this weird stuff must be EUC-KR".
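
    To make that difference concrete, here is a minimal sketch of the easy half of the problem, in Python rather than Perl. The function name guess_utf8_or_latin1 is made up for illustration, and the sketch assumes the input can only ever be UTF-8 or Latin1. Anything else, such as EUC-KR, is silently misreported, which is exactly why the general case is so much harder.

```python
def guess_utf8_or_latin1(data: bytes) -> str:
    """Guess between two candidates only (hypothetical helper).

    Strict UTF-8 decoding rejects malformed byte sequences, so a
    successful decode is decent evidence for UTF-8. Latin1 assigns a
    character to every byte value, so it can never fail and serves
    as the fallback.
    """
    try:
        data.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return "latin-1"
    return "utf-8"


utf8_bytes = "caf\u00e9".encode("utf-8")      # e-acute stored as two bytes
latin1_bytes = "caf\u00e9".encode("latin-1")  # e-acute stored as one byte
korean_bytes = "\ud55c\uad6d".encode("euc_kr")  # neither candidate encoding

print(guess_utf8_or_latin1(utf8_bytes))
print(guess_utf8_or_latin1(latin1_bytes))
# The EUC-KR input is not valid UTF-8, so the two-way guess falls
# through to Latin1 and is simply wrong.
print(guess_utf8_or_latin1(korean_bytes))
```

    Pure ASCII input passes the UTF-8 check as well, which is harmless because ASCII bytes mean the same thing in both encodings.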

    Not to mention, saying a file is Unicode does not specify the encoding. There are multiple encodings for Unicode (UTF-8, UTF-16, and UTF-32, for example), and most non-Unicode encodings can be mapped to Unicode, as long as they are declared.
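
    A short sketch of that point, again in Python and only as an illustration: the same four-character string becomes different bytes under each Unicode encoding, and a legacy encoding maps cleanly to Unicode once you are told which one it is. The variable names are arbitrary.

```python
text = "caf\u00e9"  # "cafe" with an acute accent on the final e

# One string, several Unicode encodings, different bytes on disk.
encoded = {
    name: text.encode(name)
    for name in ("utf-8", "utf-16-le", "utf-16-be", "utf-32-le")
}
for name, raw in encoded.items():
    print(f"{name:10} {len(raw):2} bytes  {raw.hex(' ')}")

# A non-Unicode encoding maps to Unicode without trouble once it is
# declared: here the bytes are known to be Latin1.
latin1_bytes = b"caf\xe9"
decoded = latin1_bytes.decode("latin-1")
print(decoded == text)
```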

Re: Re: Re: Guess between UTF8 and Latin1/ISO-8859-1
by allolex (Curate) on Jan 21, 2004 at 21:29 UTC

    I don't think they're using Perl on IE. That pretty much explains everything. ;) Actually, those pages would work right if people bothered declaring which encoding they're using. So many standards... so little compliance.

    --
    Allolex