in reply to Unicode Woes
Run this (at least on my machine) and $translation has no visible contents, yet it has a length of 5!

    #!/usr/bin/perl
    use URI::Escape;
    use Encode;
    require LWP::UserAgent;

    my $escape = uri_escape(join('. ', @ARGV));
    my $ua = LWP::UserAgent->new;
    my $response = $ua->get("http://babelfish.altavista.com/tr?trtext=$escape&lp=en_ja");
    if ($response->is_success) {
        $result = $response->content;  # or whatever
    }
    else {
        die $response->status_line;
    }
    Encode::_utf8_on( $result );
    my ($translation) = $result =~ /\Q<td bgcolor=white class=s><div style=padding:10px;>\E(.+?)\Q<\/div>\E/;
    $original = $translation;
    $translation=~s/([^[:ascii:]])/sprintf("\\x{%.4x}",ord $1)/ge;
    print $translation ."\n". length($original) ."\n". ord(substr($original,0,1));
If your machine gives you something sensible, please let me know.
(You can probably remove the Encode calls there; they were only making sure that the resulting string *was* in UTF-8 as far as Perl is concerned.)
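For anyone who wants to poke at this without hitting the network, here is a minimal sketch of the same inspection on a hard-coded string. The `$bytes` value is a made-up stand-in for `$response->content` (the UTF-8 bytes of U+3053 U+3093), `escape_non_ascii` is just a hypothetical wrapper around the substitution from the script above, and it uses `Encode::decode` rather than `_utf8_on`, which is my assumption about what the script is after, not something the original does.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical stand-in for $response->content:
# the raw UTF-8 bytes of the two characters U+3053 and U+3093.
my $bytes = "\xe3\x81\x93\xe3\x82\x93";

# decode() turns the byte string into a character string.
my $chars = decode('UTF-8', $bytes);

# Same escaping as in the script above: show every non-ASCII
# character as \x{....} so "invisible" contents become visible.
sub escape_non_ascii {
    my ($s) = @_;
    $s =~ s/([^[:ascii:]])/sprintf("\\x{%.4x}", ord $1)/ge;
    return $s;
}

printf "bytes: length %d, %s\n", length($bytes), escape_non_ascii($bytes);
printf "chars: length %d, %s\n", length($chars), escape_non_ascii($chars);
```

If the byte string reports a length of 6 and the decoded string a length of 2, Perl is treating the data as characters after decoding. Comparing the two lines of output for the real `$translation` may help show whether the "invisible" characters are undecoded bytes or genuine code points that the terminal cannot display.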
Re^2: Unicode Woes
by Anonymous Monk on Oct 01, 2004 at 11:37 UTC
Re^2: Unicode Woes
by graff (Chancellor) on Oct 01, 2004 at 22:57 UTC