in reply to replace multiplication symbol (×)
G'day michael.kitchen,
Welcome to the Monastery.
You may need additional code to handle encoding. Without any context for the one line of code you posted, it's hard to make recommendations. When dealing with "UTF-8" (as part of input, output or source code), I generally find these two lines, near the top of my code, handle most cases.
use utf8;
use open IO => qw{:encoding(utf8) :std};
See the open and utf8 pragmata; the binmode function; and any of the perldoc pages with names starting with perluni (choose as appropriate for your Perl/Unicode knowledge: there's lots from introductory to advanced levels).
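As a rough sketch of where those two lines sit in a complete script (the sub name times_to_x and the sample string are made up for illustration; adapt them to whatever your real code does):

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use utf8;                                  # this source file contains UTF-8 characters
use open IO => qw{:encoding(utf8) :std};   # UTF-8 layers for file handles and STDIN/STDOUT/STDERR

# Hypothetical helper: swap every MULTIPLICATION SIGN for a plain ASCII "x".
# The /r modifier returns the changed copy and leaves the argument alone.
sub times_to_x {
    my ($text) = @_;
    return $text =~ s/\N{MULTIPLICATION SIGN}/x/gr;
}

print times_to_x("3 × 4 = 12"), "\n";
```

Without the "use utf8" line, the literal × in the string would be seen as two bytes rather than one character, and the substitution would not match.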
In the example (one-liner) code I provide below, I've used this alias:
$ alias perlu
alias perlu='perl -Mstrict -Mwarnings -Mautodie=:all -Mutf8 -C -E'
See perlrun for any options you're unfamiliar with.
If you thought to check for '&times;', you should probably also check for the other forms it might appear in: '&#215;', '&#xD7;' and the literal '×'. In fact, you should probably check that the character really is "\N{MULTIPLICATION SIGN}", because a lot of characters, like this selection, look very similar to it:
$ perlu 'my @x = qw{ x × ⨉ 🗙 }; say sprintf "U+%06X", ord for @x'
U+000078
U+0000D7
U+002A09
U+01F5D9
Those are (using the builtin module Unicode::UCD):
$ perlu 'use Unicode::UCD "charinfo"; my @x = qw{ x × ⨉ 🗙 }; say sprintf "U+%06X : %s", ord($_), charinfo(ord $_)->{name} for @x'
U+000078 : LATIN SMALL LETTER X
U+0000D7 : MULTIPLICATION SIGN
U+002A09 : N-ARY TIMES OPERATOR
U+01F5D9 : CANCELLATION X
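If you'd rather run that check against your real data than against a hand-picked list, something along these lines might help. It's only a sketch: describe_non_ascii is a name I've invented, and the sample string is a placeholder for your own input.

```perl
use strict;
use warnings;
use Unicode::UCD 'charinfo';

# Hypothetical helper: report every non-ASCII character in a string
# with its code point and Unicode name.
sub describe_non_ascii {
    my ($string) = @_;
    my @report;
    for my $char ($string =~ /([^\x00-\x7F])/g) {
        my $info = charinfo(ord($char));    # undef if this Perl doesn't know the code point
        my $name = $info ? $info->{name} : '(unknown to this Perl)';
        push @report, sprintf 'U+%04X : %s', ord($char), $name;
    }
    return @report;
}

binmode STDOUT, ':encoding(UTF-8)';
print "$_\n" for describe_non_ascii("2 \x{D7} 3 and 4 \x{2A09} 5");
```

Whatever name is reported is the one to put inside \N{...} in your substitution or transliteration.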
Anyway, once you have determined the character, both substitution (s///) and transliteration (y///) should work just fine. Transliteration is faster, if that matters to you (see "perlperf: BENCHMARKS: Search and replace or tr"). Here are some examples:
$ perlu 'my $x = "|\N{MULTIPLICATION SIGN}|"; say $x; say $x =~ s/\N{MULTIPLICATION SIGN}//r'
|×|
||
$ perlu 'my $x = "|\N{MULTIPLICATION SIGN}|"; say $x; say $x =~ y/\N{MULTIPLICATION SIGN}//dr'
|×|
||
$ perlu 'my $x = "|\N{MULTIPLICATION SIGN}|"; say $x; say $x =~ s/\x{d7}//r'
|×|
||
$ perlu 'my $x = "|\N{MULTIPLICATION SIGN}|"; say $x; say $x =~ y/\x{d7}//dr'
|×|
||
$ perlu 'my $x = "|\N{MULTIPLICATION SIGN}|"; say $x; say $x =~ s/×//r'
|×|
||
$ perlu 'my $x = "|\N{MULTIPLICATION SIGN}|"; say $x; say $x =~ y/×//dr'
|×|
||
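If your data comes from HTML, the sign may also turn up as an entity rather than as the character itself. Here's a hedged sketch that removes both kinds in one pass. The sub name strip_times is invented, and the regex only covers the three entity spellings I know map to U+00D7 (named, decimal and hex); it is not a general entity decoder.

```perl
use strict;
use warnings;

# Hypothetical helper: remove MULTIPLICATION SIGN whether it arrives as
# an HTML entity or as the literal character.
sub strip_times {
    my ($text) = @_;

    # Entity spellings: &times;  &#215;  &#xD7; (hex digits in either case)
    $text =~ s/&(?:times|#0*215|#[xX]0*[dD]7);//g;

    # The literal character, U+00D7
    $text =~ tr/\x{D7}//d;

    return $text;
}

binmode STDOUT, ':encoding(UTF-8)';
print strip_times("|&times;|&#215;|&#xd7;|\x{D7}|"), "\n";
```

For anything beyond this one character, a proper decoder such as HTML::Entities (from CPAN) would be a better choice than growing the regex.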
[Note: Although we generally prefer code and data within <code>...</code> tags, when posting Unicode, <pre>...</pre> tags will show the actual characters (instead of entity references like &#128473;). The downside of using <pre> is that you have to manually format special characters ('<' to '&lt;', '&' to '&amp;', and so on) and you don't get a "Download" link.]
Finally, I'm using Perl 5.26, which supports Unicode 9.0. If you have an earlier Perl version, it will support an earlier Unicode version, which may give you different results to the ones I've shown. I posted a discussion about this a couple of months ago: "Re: printing Unicode works for some characters but not all".
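To find out which Unicode version your own Perl supports, this short sketch should do; it uses UnicodeVersion() from the same Unicode::UCD module as above:

```perl
use strict;
use warnings;
use Unicode::UCD;

# UnicodeVersion() returns a dotted string such as "9.0.0".
print "Perl ", $^V, " supports Unicode ", Unicode::UCD::UnicodeVersion(), "\n";
```

If the version reported is older than the one that introduced a character, charinfo() will return undef for it and \N{...} won't recognise its name.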
— Ken