use utf8;

Jaap has asked for the wisdom of the Perl Monks concerning the following question:

When I 'use utf8', this piece of code breaks:

my $hexString = "#00FF00";
my $darkHexString = '#';
while ($hexString =~ m/([0-9a-f]{2})/ig)
{
  $darkHexString .= sprintf "%02lx", int(hex($1)/2);
}
print "$darkHexString\n";

result without utf8: #007f00
result with utf8: #7f80

Would this be a bug or a feature?
Does anybody have an idea how to work around it?


Replies are listed 'Best First'.
Re: use utf8; by pg (Canon) on Nov 28, 2002 at 16:05 UTC
What you want is byte semantics for your regexp, instead of the character semantics that utf8 forces. What you can do is put 'use utf8' in a smaller scope. For example, in your case, put 'use utf8' inside your while loop and it should then be fine. (This is for Unix; if you are working with ActiveState Perl 5.8.0, this caution is not necessary.)

my $hexString = "#00FF00";
my $darkHexString = '#';
while ($hexString =~ m/([0-9a-f]{2})/ig)
{
  use utf8;
  $darkHexString .= sprintf "%02lx", int(hex($1)/2);
}
print "$darkHexString\n";
Re: Re: use utf8; by Jaap (Curate) on Nov 28, 2002 at 19:32 UTC
Is it also possible to match two hex chars in char-semantics?
Re: use utf8; by Jaap (Curate) on Nov 28, 2002 at 15:33 UTC
On further investigation, it is the regular expression that no longer works as before. How do I match two characters under utf8?
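
One possible workaround for the question above, offered as a sketch rather than something tested on the Perl builds discussed in this thread: skip the regex altogether and take fixed two-character slices with substr. The helper name darken_hex is hypothetical (it does not appear in the thread), and the sketch assumes a well-formed "#RRGGBB" input.

```perl
use strict;
use warnings;
use utf8;    # kept on deliberately, to show the sketch does not depend on turning it off

# Hypothetical helper, not from the thread: halve each colour channel of a
# "#RRGGBB" string. Assumes the input is '#' followed by pairs of hex digits.
sub darken_hex {
    my ($hex) = @_;
    my $dark = '#';

    # Start at offset 1 to skip the leading '#', then step two characters at a time.
    for (my $i = 1; $i + 1 < length($hex); $i += 2) {
        my $pair = substr($hex, $i, 2);                  # one channel, e.g. "FF"
        $dark .= sprintf '%02x', int(hex($pair) / 2);    # halve it and zero-pad to two digits
    }
    return $dark;
}

print darken_hex('#00FF00'), "\n";    # prints "#007f00"
```

Because substr counts characters and hex digits are plain ASCII, the slices should come out the same under byte or character semantics. It does not answer how to make the original regex behave, only how to avoid needing it.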