in reply to words wich ends with "f"

but if i want to know the words which ends with "f"
Match exactly what you want, not more. A . matches non-word characters, which is bad in your case. Try \b\w*f\b instead.
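A minimal sketch of that suggestion, assuming the input is a plain string and that a "word" is a run of \w characters. The sub name words_ending_in_f and the sample sentence are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: return every word in $text that ends with "f".
# Call it in list context; the /g match then returns all captures.
sub words_ending_in_f {
    my ($text) = @_;
    return $text =~ /\b(\w*f)\b/g;
}

my $text = 'The chef put half a loaf on the shelf, far from the fire.';
print join(', ', words_ending_in_f($text)), "\n";
# prints: chef, half, loaf, shelf
```

Because \w* cannot cross a space or a comma, each match stays inside a single word, which is the point of avoiding the dot here.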
also i want to know the words wich does not begin with "f" but have at least one "f" in the remainder of the word:
\b[^f]\w*f\w*\b might be worth a try, but I'm not convinced it's strict enough. Maybe \b(?=\w)[^f]\w*f\w*\b is right for you

(All examples here untested).
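A small sketch comparing the patterns for "does not begin with f but contains an f" on ASCII text. The sub matches_for, the %pattern hash and the sample string are assumptions for illustration only; the third entry is the [^f\W] spelling of the lookahead version.

```perl
use strict;
use warnings;

# The patterns under discussion, keyed by a made-up label.
my %pattern = (
    lax       => qr/\b[^f]\w*f\w*\b/,
    lookahead => qr/\b(?=\w)[^f]\w*f\w*\b/,
    class     => qr/\b[^f\W]\w*f\w*\b/,
);

# Hypothetical helper: all matches of the named pattern in $text.
# There are no capture groups, so /g in list context returns whole matches.
sub matches_for {
    my ($name, $text) = @_;
    my $re = $pattern{$name};
    return $text =~ /$re/g;
}

my $text = 'a fish, a raft, offer, staff, far';
for my $name (qw( lax lookahead class )) {
    # Brackets make leading spaces in a match visible.
    printf "%-9s [%s]\n", $name, join('][', matches_for($name, $text));
}
```

On this sample the lookahead and class versions both return raft, offer and staff. The lax version lets [^f] match the space in front of a word, so it also picks up " fish", which is why it is not strict enough.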

Update: ysth++ pointed out that (?=\w)[^f] is better written as [^f\W] - and indeed it is, if you consider only ASCII semantics. When you use Unicode, \w is not the same as [^\W].

Re^2: words wich ends with "f"
by ikegami (Patriarch) on Jun 19, 2008 at 17:38 UTC

    When you use Unicode, \w is not the same as [^\W].

    Really? Can you give an example?

      Really? Can you give an example?

      No, I tried to find one, and failed.

      I was referring to this section of perlunicode.

      (However, and as a limitation of the current implementation, using "\w" or "\W" inside a "[...]" character class will still match with byte semantics.)

      (taken from the perl 5.8.8 perlunicode man page). I tried to find an example for that with this script:

      use strict;
      for (1 .. 1e8) {
          eval {
              if ((chr($_) =~ m/\W/) xor (chr($_) =~ m/[^\w]/)) {
                  print "Counter-example with chr($_)\n";
              }
          }
      }

      But it didn't find anything.

      Did I misunderstand the docs? Or are the docs wrong/out of date?

        I read it the same way you do.

        use strict;
        use warnings;

        use Test::More qw( no_plan );

        my $ch = chr(0xE9);  # lowercase e acute

        # Byte semantics
        utf8::downgrade($ch);
        ok( $ch !~ /\w/ );
        ok( $ch =~ /\W/ );
        ok( $ch !~ /[\w]/ );
        ok( $ch =~ /[\W]/ );

        # Unicode semantics
        utf8::upgrade($ch);
        ok( $ch =~ /\w/ );
        ok( $ch !~ /\W/ );
        ok( $ch =~ /[\w]/ );  # Should fail according to the docs.
        ok( $ch !~ /[\W]/ );  # Should fail according to the docs.

        [\w] seems to work correctly (that is, contrary to the docs), but [\W] does not.

        >c:\progs\perl588\bin\perl test.pl
        ok 1
        ok 2
        ok 3
        ok 4
        ok 5
        ok 6
        ok 7
        not ok 8
        #   Failed test in test.pl at line 20.
        1..8
        # Looks like you failed 1 test of 8.

        Same in 5.10.0.

        >c:\progs\perl5100\bin\perl test.pl
        ok 1
        ok 2
        ok 3
        ok 4
        ok 5
        ok 6
        ok 7
        not ok 8
        #   Failed test at test.pl line 20.
        1..8
        # Looks like you failed 1 test of 8.

        Although the bit you quoted was removed from the docs.

        Update: It might help if I actually used a character class in my tests. Fixed.