in reply to Counting chars in a HTML-page
Is the HTML valid? If not, is it at least well-formed?
wget -O - 'http://www.somewhere.tld/' | wc -m
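If the goal is to count only the visible text rather than the markup, a rough sketch might look like the following. The function names count_chars and text_chars are made up for illustration, and the sed expression is a crude tag stripper that assumes every tag opens and closes on the same line. It ignores comments, scripts, and entities, which is why it matters whether the page is well-formed.

```shell
# count_chars: print the number of characters read from stdin.
# tr removes the padding some wc implementations put around the number.
count_chars() {
    wc -m | tr -d '[:space:]'
}

# text_chars: crude count of characters outside of tags.
# Assumption: no tag spans more than one line; not a real HTML parser.
text_chars() {
    sed -e 's/<[^>]*>//g' | count_chars
}

# Hypothetical usage with the placeholder URL from above:
# wget -q -O - 'http://www.somewhere.tld/' | count_chars
# wget -q -O - 'http://www.somewhere.tld/' | text_chars
```

Note that wc -m counts characters according to the current locale, while wc -c counts bytes; the two differ for multi-byte encodings such as UTF-8. For anything beyond a quick estimate, a proper HTML parser would be the safer choice.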