in reply to Language recognition of web pages

It is easy to determine the default charset when none is declared, because the standards specify it. For MIME text content types, the default charset is US-ASCII (RFC 2046). For HTTP, the default is ISO-8859-1 (RFC 2616).

However, most browsers pick the charset for unmarked content based on local settings rather than the standard default. I can set Firefox to use any encoding as the default and read unmarked Shift_JIS files if I want.
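To make the difference concrete, here is a rough Python sketch of the two behaviours described above. The function name effective_charset and its protocol and local_default parameters are made up for illustration; this is not how any particular browser implements it, only the order of precedence as I understand it: a declared charset wins, then a local setting if one is configured, then the default from the standard.

```python
from email.message import Message

# Standard defaults cited in this post:
# RFC 2046 for MIME text types, RFC 2616 for HTTP.
STANDARD_DEFAULTS = {"mime": "us-ascii", "http": "iso-8859-1"}


def effective_charset(content_type, protocol, local_default=None):
    """Return the charset to decode with (hypothetical helper).

    content_type  -- raw Content-Type header value
    protocol      -- "mime" or "http"
    local_default -- user-configured fallback, as in a browser setting
    """
    msg = Message()
    msg["Content-Type"] = content_type
    declared = msg.get_content_charset()  # None when no charset parameter
    if declared:
        return declared
    if local_default:
        # Browser-like behaviour: the local setting overrides the standard.
        return local_default.lower()
    return STANDARD_DEFAULTS[protocol]


# Strictly by the standards:
print(effective_charset("text/plain", "mime"))
print(effective_charset("text/html", "http"))

# Browser-like, with Shift_JIS configured as the local default:
raw = "日本".encode("shift_jis")
print(raw.decode(effective_charset("text/html", "http", local_default="Shift_JIS")))
```

With no local default the unmarked HTTP page is treated as ISO-8859-1, which would turn the Shift_JIS bytes into mojibake; with the local default set, the same bytes decode correctly.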