in reply to Re^2: How to determine if string contains non-ASCII characters ?
in thread How to determine if string contains non-ASCII characters ?
However, neither regex appears to be hitting the spot. I've created a small test program which pulls an Arabic title from a webpage to demonstrate:

    use LWP::UserAgent;

    $ua = LWP::UserAgent->new;
    my $resp = $ua->get("http://www.englishlink.com/index_ARE_HTML.asp");

    if ($resp->is_success) {
        $mystring = $resp->content;
        $mystring =~ s/.*\<title\>//sgi;
        $mystring =~ s/\<.*//sgi;
    }

    print "$mystring\n";

    if ($mystring =~ m/[^\x00-\x7f]/) {
        print "Contains ASCII only\n";
    } else {
        print "Contains non-ASCII\n";
    }

When run, I'd expect to see the result as "Contains non-ASCII", but instead I get "Contains ASCII only".

Any thoughts as to why?
Re^4: How to determine if string contains non-ASCII characters ?
by ikegami (Patriarch) on Aug 19, 2009 at 17:54 UTC

Re^4: How to determine if string contains non-ASCII characters ?
by moritz (Cardinal) on Aug 19, 2009 at 17:30 UTC

Re^4: How to determine if string contains non-ASCII characters ?
by roadrunner (Acolyte) on Aug 19, 2009 at 17:38 UTC