in reply to constants in an RE

G'day BernieC,

My code looks something like:

    use constant APOSTROPHE => "\xE2\x80\x99" ;
    my $string =~ s/APOSTROPHE/'/ ;

Unlikely. Probably closer to:

    use constant APOSTROPHE => "\xE2\x80\x99" ;
    my $string = "wjzw’xl cowuxze.";
    $string =~ s/APOSTROPHE/'/ ;

"it doesn't seem to be working"

That's a useless error report which, after 16 years and 850 posts, you really should know.

You've been shown an impressive number of hoops that you could jump through to force the constant into a regex. Perl's string handling functions do not have the problem you've encountered with a regex; for instance,

    my $pos = index $string, APOSTROPHE;
    substr($string, $pos, length APOSTROPHE) = q{'};

[If it's important to you, they're probably also faster (Benchmark to check).]
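As a rough sketch of such a check: the script below first shows that a bare constant name in a regex is treated as literal text, then times one interpolation workaround against index() and substr() using the core Benchmark module. The sub names via_regex and via_index and the sample string are made up for illustration, ${\ APOSTROPHE} is only one of the possible hoops, and I've added a guard for the case where the apostrophe is absent. Timings will vary by machine and Perl version, so treat any result as indicative only.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Benchmark qw{cmpthese};

use constant APOSTROPHE => "\xE2\x80\x99";    # UTF-8 bytes of U+2019

# Made-up sample: a byte string containing those same three bytes.
my $sample = "it\xE2\x80\x99s here";

# A bare constant name in a regex is just literal text, so nothing matches.
(my $unchanged = $sample) =~ s/APOSTROPHE/'/;

# One interpolation workaround: dereference a reference to the constant's value.
sub via_regex {
    my ($string) = @_;
    $string =~ s/${\ APOSTROPHE}/'/;
    return $string;
}

# index() and substr(), guarded in case the apostrophe is not present.
sub via_index {
    my ($string) = @_;
    my $pos = index $string, APOSTROPHE;
    substr($string, $pos, length APOSTROPHE) = q{'} if $pos >= 0;
    return $string;
}

print "bare constant: ", ($unchanged eq $sample ? "no change" : "changed"), "\n";
print "regex hoop:    ", via_regex($sample), "\n";
print "index/substr:  ", via_index($sample), "\n";

# Run each candidate for at least one CPU second and print a comparison table.
cmpthese(-1, {
    regex => sub { via_regex($sample) },
    index => sub { via_index($sample) },
});
```

The guard matters because index() returns -1 when nothing is found, and substr() with a negative offset counts from the end of the string.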

You should probably handle the potential case of a string containing more than one "’". If you have to deal with more than one string, you should abstract the solution into a subroutine so that you only need to code it once. If you have this requirement in more than one script, you should put that subroutine into a module so that all scripts can share the one solution.

Here's an example of what that subroutine might look like. Note that I've done away with the constant altogether; however, if you felt that its inclusion was necessary, you could use index() and substr() as I've shown above.

    #!/usr/bin/env perl

    use 5.010;
    use strict;
    use warnings;
    use utf8;
    use open OUT => qw{:encoding(UTF-8) :std};

    my $string1 = 'abc’def’ghi’jkl';
    my $string2 = 'mno’pqr’stu’vwx';

    say '*** BEFORE ***';
    say "\$string1[$string1]";
    say "\$string2[$string2]";

    ($string1, $string2)
        = @{fancy_apos_to_ascii_apos(\($string1, $string2))};

    say '*** AFTER ***';
    say "\$string1[$string1]";
    say "\$string2[$string2]";

    sub fancy_apos_to_ascii_apos {
        my (@fancies) = @_;

        state $fancy_apos = '’';
        state $fancy_apos_len = length $fancy_apos;

        my $asciis = [];

        for my $string (map $$_, @fancies) {
            while ((my $pos = index $string, $fancy_apos) >= 0) {
                substr($string, $pos, $fancy_apos_len) = q{'};
            }
            push @$asciis, $string;
        }

        return $asciis;
    }

Output:

    *** BEFORE ***
    $string1[abc’def’ghi’jkl]
    $string2[mno’pqr’stu’vwx]
    *** AFTER ***
    $string1[abc'def'ghi'jkl]
    $string2[mno'pqr'stu'vwx]
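
If you go the module route, a minimal sketch might look like the following. The package name My::FancyApos is hypothetical, and this variation takes plain strings and returns converted copies rather than working with references. Everything is kept in one file here so it runs as-is; normally the package block would sit in its own .pm file.

```perl
#!/usr/bin/env perl
use 5.014;    # for the "package NAME BLOCK" syntax and "say"
use strict;
use warnings;
use utf8;
use open OUT => qw{:encoding(UTF-8) :std};

# Hypothetical module name. In practice this block would be its own file
# (say, lib/My/FancyApos.pm, ending in "1;") and be loaded with:
#     use My::FancyApos qw{fancy_apos_to_ascii_apos};
package My::FancyApos {
    use Exporter qw{import};
    our @EXPORT_OK = qw{fancy_apos_to_ascii_apos};

    # Takes plain strings and returns converted copies;
    # the caller's arguments are left untouched.
    sub fancy_apos_to_ascii_apos {
        my @strings = @_;

        my $fancy_apos     = '’';
        my $fancy_apos_len = length $fancy_apos;

        for my $string (@strings) {
            # Replace every occurrence, not just the first.
            while ((my $pos = index $string, $fancy_apos) >= 0) {
                substr($string, $pos, $fancy_apos_len) = q{'};
            }
        }

        return @strings;
    }
}

package main;

# Same file, so call by fully qualified name rather than importing.
my @cleaned = My::FancyApos::fancy_apos_to_ascii_apos('it’s', 'they’re');
say for @cleaned;
```

Each script then needs only the one "use" line, and any later fix to the conversion happens in a single place.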

— Ken