I read somewhere that I can use UTF-8 by specifying use utf8; at the beginning of the document.
use utf8; only specifies that the source is UTF-8. If you're reading data from a file, for example, you'll still need to decode that.
open(my $fh, '<:encoding(UTF-8)', $qfn) or die("Can't open file \"$qfn\": $!\n");
Don't forget to encode your output.
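As a minimal sketch of what encoding output can look like (the sample string and variable names are made up for illustration), you can either add an encoding layer to the handle or encode explicitly with the core Encode module:

```perl
use strict;
use warnings;
use Encode qw( encode );

# Option 1: add a layer so everything printed to STDOUT is encoded as UTF-8.
binmode(STDOUT, ':encoding(UTF-8)');

# Option 2: encode explicitly, e.g. before writing to a handle with no layer.
my $text  = "123\x{2014}456";           # Decoded text: 7 characters.
my $bytes = encode('UTF-8', $text);     # Encoded text: 9 bytes (U+2014 takes 3).

print "$text\n";                        # Encoded by the layer on STDOUT.
```

Use one or the other for a given handle. Printing already-encoded bytes through an :encoding layer would encode them twice.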
s/-/\x{2014}/g; This should turn a hyphen into an em dash, correct?
Yes.
\x{2014} works even without use utf8;. It refers to character U+2014, no matter which encoding was used for the source.
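A quick way to convince yourself, as a sketch with a made-up sample string: the script below is pure ASCII and has no use utf8;, yet the substitution still produces U+2014.

```perl
use strict;
use warnings;
# Deliberately no "use utf8;" here. The source contains only ASCII.

my $s = "a-b";
$s =~ s/-/\x{2014}/g;

# ord() returns the code point of the first character of its argument.
printf "U+%04X\n", ord(substr($s, 1, 1));   # U+2014
```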
The problem is, I only want to do the substitutions on the hyphens which are surrounded by 3 digits on both sides.
The approach you are taking requires captures:
s/([0-9]{3})-([0-9]{3})/$1\x{2014}$2/g
But captures aren't needed here.
s/(?<=[0-9]{3})-(?=[0-9]{3})/\x{2014}/g
(\d matches some pretty funky stuff in addition to 0-9)
The latter snippet has the advantage of properly handling 123-456-789.
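Here is a small comparison of the two substitutions on that input, plus a check of the \d remark. It is only a sketch; the variable names and the choice of U+0663 as the sample non-ASCII digit are mine.

```perl
use strict;
use warnings;

binmode(STDOUT, ':encoding(UTF-8)');

my $with_captures   = "123-456-789";
my $with_lookaround = "123-456-789";

# The first match consumes "123-456", so the second hyphen no longer
# has three unconsumed digits before it and is left alone.
$with_captures =~ s/([0-9]{3})-([0-9]{3})/$1\x{2014}$2/g;

# Lookarounds consume only the hyphen, so both hyphens are replaced.
$with_lookaround =~ s/(?<=[0-9]{3})-(?=[0-9]{3})/\x{2014}/g;

print "$with_captures\n";     # 123\x{2014}456-789
print "$with_lookaround\n";   # 123\x{2014}456\x{2014}789

# \d also matches digits from other scripts; [0-9] does not.
my $arabic_three  = "\x{0663}";                         # ARABIC-INDIC DIGIT THREE
my $d_matches     = $arabic_three =~ /\d/    ? 1 : 0;   # 1
my $range_matches = $arabic_three =~ /[0-9]/ ? 1 : 0;   # 0
```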
In reply to Re: match substitution by ikegami, in thread match substitution by ShayShay