in reply to Check UTF8
I can only guess at your problem, as you gave very little detail. Do you want to keep only the lines that are pure ASCII? Or do you also want to keep lines that are not UTF-8 but are valid Latin-1, ISO-2022-JP, or some other encoding?
If it really is just a matter of telling ASCII from non-ASCII UTF-8, reject any line that includes a character above chr(127), i.e. any byte with a value over 127. Other encodings will be more of a challenge.
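For what it's worth, here is a minimal sketch of that check in Python. It assumes the lines are read as raw bytes, which is my guess since you did not say how you read the file. The function names (`is_ascii_line`, `keep_ascii_lines`, `is_valid_utf8`) are made up for illustration and are not from any library.

```python
def is_ascii_line(line: bytes) -> bool:
    """Return True if every byte in the line is in the ASCII range 0..127."""
    return all(b <= 127 for b in line)


def keep_ascii_lines(lines):
    """Keep only the lines made up entirely of ASCII bytes."""
    return [line for line in lines if is_ascii_line(line)]


def is_valid_utf8(line: bytes) -> bool:
    """Return True if the line decodes cleanly as UTF-8 (pure ASCII included)."""
    try:
        line.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True
```

The last helper only tells you whether the bytes are well-formed UTF-8. A line that fails it might be Latin-1 or something else entirely, and working out which is the harder part.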
--
[ e d @ h a l l e y . c c ]