PerlMonks  

Ranges in case insensitive regexps in unicode mode

by IlyaM (Parson)
on Jun 06, 2003 at 14:04 UTC ( [id://263674]=perlquestion )

IlyaM has asked for the wisdom of the Perl Monks concerning the following question:

Can anybody please explain to me why these two one-liners produce different results?
ilya@juil:~$ perl -e 'print "b" =~ /[A-C]/i ? "true\n" : "false\n"'
true
ilya@juil:~$ perl -Mutf8 -e 'print "b" =~ /[A-C]/i ? "true\n" : "false\n"'
false
I checked the docs (perlunicode, perlre and utf8) but didn't notice anything that would explain this behavior.

Knowing that unicode support in Perl is very new and changes with each new release, I guess it is worth mentioning that I still use 5.6.1.

--
Ilya Martynov, ilya@iponweb.net
CTO IPonWEB (UK) Ltd
Quality Perl Programming and Unix Support UK managed @ offshore prices - http://www.iponweb.net
Personal website - http://martynov.org

Replies are listed 'Best First'.
Re: Ranges in case insensitive regexps in unicode mode
by broquaint (Abbot) on Jun 06, 2003 at 14:08 UTC
    'tis a bug in 5.6.1 and its less than sturdy unicode support, which has been corrected in 5.8
    shell> perl5.8.0 -Mutf8 -le 'print "b" =~ /[A-C]/i ? "true" : "false"'
    true
    shell> perl5.6.1 -Mutf8 -le 'print "b" =~ /[A-C]/i ? "true" : "false"'
    false

    HTH

    _________
    broquaint

Re: Ranges in case insensitive regexps in unicode mode
by jmcnamara (Monsignor) on Jun 06, 2003 at 14:10 UTC

    It looks like the behaviour changed (i.e. was fixed) between 5.6 and 5.8:
    $ perl5.6.0 -Mutf8 -e 'print "b" =~ /[A-C]/i ? "true\n" : "false\n"'
    false
    $ perl5.8.0 -Mutf8 -e 'print "b" =~ /[A-C]/i ? "true\n" : "false\n"'
    true

    --
    John.

Re: Ranges in case insensitive regexps in unicode mode
by december (Pilgrim) on Jun 06, 2003 at 22:55 UTC

    # perl -Mutf8 -e 'print "b" =~ /[A-C]/i ? "true\n" : "false\n"'
    false
    # perl -v
    This is perl, v5.6.1 built for i386-linux
    --
    # perl -Mutf8 -e 'print "b" =~ /[A-C]/i ? "true\n" : "false\n"'
    true
    # perl -v
    This is perl, v5.8.0 built for i386-openbsd

    I advise you to upgrade to perl 5.8.0, because I have noticed similar errors in utf and other charset conversion/comparison functions in perl 5.6.1. Some things just don't seem to work as expected. I'm not a utf or perl expert, but 5.8.0 seems to be more consistent, so if you have to do a lot of utf/charset-related things...


       december
