cormanaz has asked for the wisdom of the Perl Monks concerning the following question:
    use feature ':5.10';
    my $foo = '⁦JenAFifield⁩';
    foreach my $i (0..length($foo)) {
        $char = substr($foo,$i,1);
        $charnum = ord($char);
        say "$char\t$charnum";
    }

but I see this forum is replacing the characters with some other encoding I don't recognize. In my editor (Komodo) they are at the beginning and end of the $foo string. The beginning one is a dotted box with LRI inside, and the end one has the same box with PDI inside. The script returns:

        226
        129
        166
    J   74
    e   101
    n   110
    A   65
    F   70
    i   105
    f   102
    i   105
    e   101
    l   108
    d   100
        226
        129
        169

where in my editor the undisplayable char is a black rectangle with HOP inside. I'm not sure why it displays like that because ASCII 129 should be u-umlaut.
Am wondering how to do a regexp that will get rid of these chars. Based on numbers shown here I tried $foo =~ s/\x8294|\x8297//g; but that didn't do it. Can anyone help?
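For reference, the byte triples in the output (226 129 166 and 226 129 169) are the UTF-8 encodings of U+2066 (decimal 8294) and U+2069 (decimal 8297), which matches the LRI and PDI labels shown in the editor. The attempted pattern likely fails for two reasons: \x without braces reads at most two hex digits, so \x8294 means the character 0x82 followed by a literal "94", and 8294 is a decimal value where \x{...} expects hex. The following is a minimal sketch, not a reply from this thread. It assumes the string has first been decoded into characters, and the helper name strip_bidi_isolates is made up for illustration.

```perl
use strict;
use warnings;
use feature 'say';
use Encode qw(decode);

# Hypothetical helper (not from the thread): remove U+2066 (LRI) and
# U+2069 (PDI) from a string that already holds decoded characters.
sub strip_bidi_isolates {
    my ($text) = @_;
    $text =~ s/[\x{2066}\x{2069}]//g;   # braces allow more than two hex digits
    return $text;
}

# The same 17 bytes the script above reported, written out explicitly.
my $bytes = "\xE2\x81\xA6JenAFifield\xE2\x81\xA9";

# Decode the UTF-8 bytes into 13 characters so each control is one character.
my $chars = decode('UTF-8', $bytes);

my $clean = strip_bidi_isolates($chars);
say $clean;           # JenAFifield
say length $clean;    # 11
```

When the string is a literal in the source file, as in the script above, putting "use utf8;" at the top of the file should have the same effect as the decode step, assuming the file is saved as UTF-8.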
Re: Removing multibyte UTF-8 chars from strings
by Corion (Patriarch) on Jan 10, 2022 at 18:18 UTC
by cormanaz (Deacon) on Jan 10, 2022 at 19:27 UTC