in reply to Locale Woes...

Something like the following may help (encode_monk and decode_monk are only there to save you potential character set problems with your text editor -- don't let them distract you). Cheers.
    use strict;
    use warnings;
    use Encode 'decode_utf8';

    my $suspect = decode_utf8( decode_monk('v~C3~A4hicule') );
    warn '$suspect='.$suspect;
    if ($suspect =~ /^([\p{IsAlnum}]+)$/) {
        warn "MATCHES $1\n";
    }
    exit( 0 );

    ##------------------------------------------------------------------+
    ## WARNING: Using sprintf here is too expensive. A
    ## lookup table may be the best solution.
    sub encode_monk {
        my ($txt) = @_;
        join q(),
            map {
                ( $_ >= 0x20 && $_ <= 0x7D )
                    ? ( sprintf "%c", $_ )
                    : ( sprintf "~%02X", $_ )
            } unpack "C*", $txt;
    }

    ##------------------------------------------------------------------+
    ## WARNING: Consider err policy here.
    sub decode_monk {
        my ($code) = @_;
        if (! defined $code or length( $code ) == 0) {
            return;
        }
        $code =~ s/\~([\da-fA-F]{2})/chr( hex( $1 ) )/eg;
        return $code;
    }
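The WARNING on encode_monk mentions a lookup table. One way that might look is sketched below. The names @MONK_TABLE and encode_monk_table are made up for illustration, and the sketch assumes the input is a byte string (that is, not yet decoded), as in the original sub.

```perl
use strict;
use warnings;

# Hypothetical lookup table: one precomputed output string per byte value.
# Bytes 0x20..0x7D pass through unchanged; everything else, including
# '~' (0x7E), becomes ~XX so the escape character stays unambiguous.
my @MONK_TABLE = map {
    ( $_ >= 0x20 && $_ <= 0x7D ) ? chr($_) : sprintf( '~%02X', $_ )
} 0 .. 255;

# Same mapping as encode_monk, but sprintf only runs while the table
# is built, not once per byte of input.
# Assumes $txt holds bytes, so every unpacked value is in 0..255.
sub encode_monk_table {
    my ($txt) = @_;
    return join q(), @MONK_TABLE[ unpack 'C*', $txt ];
}

print encode_monk_table("v\xC3\xA4hicule"), "\n";    # prints v~C3~A4hicule
```

Whether this is actually faster for your data is worth checking with Benchmark before switching over.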