in reply to Re^2: One bird, two Unicode names
in thread One bird, two Unicode names

So, one way to make the two strings equal would be to replace the Unicode apostrophe U+2019 (RIGHT SINGLE QUOTATION MARK) found in the first string with the ASCII single quote (U+0027) used in the second string:

$s1 =~ s/\x{2019}/'/g;

(just in case it's not obvious...)
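For anyone who wants to try it, here is a minimal self-contained sketch of that substitution. The sample strings are hypothetical stand-ins (the real names from the thread are not reproduced here); the only thing that matters is that they differ in the apostrophe alone.

```perl
use strict;
use warnings;

# Hypothetical sample strings; only the apostrophe differs.
my $s1 = "Wilson\x{2019}s Storm Petrel";   # U+2019 RIGHT SINGLE QUOTATION MARK
my $s2 = "Wilson's Storm Petrel";          # U+0027 APOSTROPHE

print "before: ", ( $s1 eq $s2 ? "equal" : "different" ), "\n";

# Replace every U+2019 with the ASCII single quote.
$s1 =~ s/\x{2019}/'/g;

print "after:  ", ( $s1 eq $s2 ? "equal" : "different" ), "\n";
```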

Re^4: One bird, two Unicode names
by RCH (Sexton) on Mar 11, 2011 at 18:04 UTC
    Yes, that's how I've been doing it:
    $editted_copy = $string;
    # Look for code points outside Basic Latin
    while ( $string =~ s/(\P{InBasic_Latin})// ) {
        my $U_char      = $1;
        my $U_codepoint = ord($U_char);
        # and try to replace them
        if ( defined( $subs{$U_codepoint} ) ) {
            $editted_copy =~ s/$U_char/$subs{$U_codepoint}/;
        }
        else {
            # add the missing U_codepoint by hand to the %subs hash
            # and iterate till no more U_codepoints cause problems
        }
    }

    (I was just hoping for something prettier)
    RichardH
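
One possibly prettier variant of the same table-driven idea is to do the lookup inside a single s///ge, so there is no explicit while loop and no second copy to keep in sync. This is only a sketch: the %subs entries, the asciify name, and the sample string are all assumptions made for illustration, not something taken from the posts above. Characters with no entry are left untouched and their code points are returned, so they can be added to %subs by hand as before.

```perl
use strict;
use warnings;

# Hypothetical substitution table: code point => ASCII replacement.
my %subs = (
    0x2018 => "'",    # LEFT SINGLE QUOTATION MARK
    0x2019 => "'",    # RIGHT SINGLE QUOTATION MARK
    0x2013 => "-",    # EN DASH
);

# Replace every non-ASCII character that has an entry in %subs.
# Returns the edited copy followed by the sorted code points that
# had no mapping (those characters are left as they were).
sub asciify {
    my ($string) = @_;
    my %missing;
    ( my $copy = $string ) =~ s{([^\x00-\x7F])}{
        my $cp = ord $1;
        exists $subs{$cp} ? $subs{$cp} : do { $missing{$cp} = 1; $1 };
    }ge;
    return ( $copy, sort { $a <=> $b } keys %missing );
}

# Hypothetical input containing two mapped characters.
my ( $clean, @unmapped ) = asciify("Wilson\x{2019}s Storm\x{2013}Petrel");
print "$clean\n";
printf "U+%04X has no mapping yet\n", $_ for @unmapped;
```

The character class [^\x00-\x7F] covers the same range as \P{InBasic_Latin}; either should work here. If every replacement is a single character, tr/// with matching search and replacement lists would be shorter still, but the hash also allows multi-character replacements.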