
When you use Unicode, \w is not the same as [^\W].

Really? Can you give an example?

Re^3: words wich ends with "f"
by moritz (Cardinal) on Jun 19, 2008 at 20:45 UTC
    Really? Can you give an example?

    No, I tried to find one, and failed.

    I was referring to this section of perlunicode.

    (However, and as a limitation of the current implementation, using "\w" or "\W" inside a "[...]" character class will still match with byte semantics.)

    (taken from the perl 5.8.8 perlunicode man page). I tried to find an example for that with this script:

    use strict;
    for (1 .. 1e8) {
        eval {
            if ((chr($_) =~ m/\W/) xor (chr($_) =~ m/[^\w]/)) {
                print "Counter-example with chr($_)\n";
            }
        }
    }

    But it didn't find anything.

    Did I misunderstand the docs? Or are the docs wrong or out of date?

      I read it the same way you do.

      use strict;
      use warnings;

      use Test::More qw( no_plan );

      my $ch = chr(0xE9);  # lowercase e acute

      # Byte semantics
      utf8::downgrade($ch);
      ok( $ch !~ /\w/ );
      ok( $ch =~ /\W/ );
      ok( $ch !~ /[\w]/ );
      ok( $ch =~ /[\W]/ );

      # Unicode semantics
      utf8::upgrade($ch);
      ok( $ch =~ /\w/ );
      ok( $ch !~ /\W/ );
      ok( $ch =~ /[\w]/ );  # Should fail according to the docs.
      ok( $ch !~ /[\W]/ );  # Should fail according to the docs.

      [\w] seems to work correctly (that is, contrary to the docs), but [\W] does not.

      >c:\progs\perl588\bin\perl test.pl
      ok 1
      ok 2
      ok 3
      ok 4
      ok 5
      ok 6
      ok 7
      not ok 8
      # Failed test in test.pl at line 20.
      1..8
      # Looks like you failed 1 test of 8.

      Same in 5.10.0.

      >c:\progs\perl5100\bin\perl test.pl
      ok 1
      ok 2
      ok 3
      ok 4
      ok 5
      ok 6
      ok 7
      not ok 8
      # Failed test at test.pl line 20.
      1..8
      # Looks like you failed 1 test of 8.

      Although the bit you quoted was removed from the docs.

      Update: It might help if I actually used a character class in my tests. Fixed.
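
The checks above exercise only chr(0xE9). To see whether other characters behave the same way, the \W versus [\W] comparison can be repeated over a range of code points. What follows is a minimal sketch that assumes the same utf8::upgrade approach as the test script. The helper name class_mismatches is hypothetical, not something from the thread. The result depends on the perl version in use, so no output is shown.

```perl
use strict;
use warnings;

# Return the code points in [$from, $to] for which \W and [\W] disagree
# when the string is stored in Perl's internal UTF-8 format.
# (class_mismatches is an illustrative name, not an existing function.)
sub class_mismatches {
    my ($from, $to) = @_;
    my @mismatches;
    for my $cp ($from .. $to) {
        my $ch = chr($cp);
        utf8::upgrade($ch);    # force the upgraded (Unicode) representation
        my $bare    = $ch =~ /\W/   ? 1 : 0;
        my $bracket = $ch =~ /[\W]/ ? 1 : 0;
        push @mismatches, $cp if $bare != $bracket;
    }
    return @mismatches;
}

# Report every disagreement in the upper half of Latin-1.
printf "U+%04X: \\W and [\\W] disagree\n", $_
    for class_mismatches(0x80, 0xFF);
```

Restricting the scan to 0x80 .. 0xFF targets the characters whose classification differs between byte and Unicode semantics; ASCII characters are classified identically under both.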