in reply to Removing multibyte UTF-8 chars from strings
You don't show us where the string is initialized.
If you have the string verbatim in your source file, save the file with the UTF-8 encoding and put use utf8; at the top. Personally, I prefer use charnames ':full'; and writing the characters as \N{...} named escapes.
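A minimal sketch of the named-escape approach, assuming the characters in question are the directional isolates U+2066 and U+2069. The username alice42 is made up for illustration:

```perl
use strict;
use warnings;
use charnames ':full';

# Named escapes keep the source file pure ASCII, so the encoding
# your editor saves the file in no longer matters.
my $string = "\N{LEFT-TO-RIGHT ISOLATE}alice42\N{POP DIRECTIONAL ISOLATE}";

# The same characters written as numeric code points.
my $same = "\x{2066}alice42\x{2069}";

print $string eq $same ? "identical\n" : "different\n";
printf "%d characters\n", length($string);
```

Both spellings produce the same nine-character string. The named form is just easier to read when you come back to the code later.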
As for the replacement target, you also need to show us where you get it from, and you need to tell Perl what encoding the string is in. Most likely the string already is UTF-8 but Perl doesn't know it. In that case, tell Perl by decoding it:
    use Encode 'decode';
    ...
    my $string = decode('UTF-8', $input_string);

    # Keep only what we want:
    $string =~ m!([a-zA-Z0-9]+)! or warn "Invalid/empty username in '$string'";
    my $real_user = $1;

    # Remove stuff we don't want, especially the writing direction isolates:
    $string =~ s!\x{2066}|\x{2069}!!g;
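To see why the decode step matters, here is a self-contained sketch. It assumes the input arrives as raw UTF-8 bytes. The byte string and the username alice42 are hypothetical stand-ins for whatever your real input is:

```perl
use strict;
use warnings;
use Encode 'decode';

# Hypothetical input: raw UTF-8 bytes, as they might come from a file or socket.
# \xE2\x81\xA6 encodes U+2066 and \xE2\x81\xA9 encodes U+2069.
my $input_string = "\xE2\x81\xA6alice42\xE2\x81\xA9";

# Before decoding, Perl sees 13 separate bytes, so \x{2066} cannot match.
my $matched_raw = $input_string =~ /\x{2066}/ ? 1 : 0;

# After decoding, each isolate is a single character.
my $string = decode('UTF-8', $input_string);

# Strip the isolates from a copy, keeping the decoded original intact.
(my $clean = $string) =~ s/[\x{2066}\x{2069}]//g;

# Only read $1 if the match actually succeeded.
my $real_user;
if ($clean =~ /([a-zA-Z0-9]+)/) {
    $real_user = $1;
}

printf "bytes: %d, characters: %d, cleaned: %d\n",
    length($input_string), length($string), length($clean);
```

On the undecoded bytes the substitution silently does nothing, which is probably the symptom you are seeing.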
Re^2: Removing multibyte UTF-8 chars from strings
by cormanaz (Deacon) on Jan 10, 2022 at 19:27 UTC