in reply to Regex (lookahead) Confusion

while (<DATA>) {
    chomp;
    if (! /([smtwhfa])(?=.*?\1)/) {
        print "$_ : OK\n";
    } else {
        print "$_ : Not OK\n";
    }
}
__DATA__
smsa
smta
stmwhas
And the output:

smsa : Not OK
smta : OK
stmwhas : Not OK

Update: Ah! Thanks people for pointing out that my regex didn't check for non-allowed characters. bart's solution is perfect.

Re: Re: Regex (lookahead) Confusion
by allolex (Curate) on Feb 05, 2004 at 21:04 UTC

    This is a really good effort, but it fails if characters not matching the given character set are included in the string.

    Unfortunately, I don't have a solution either, because in my mind there would have to be a lookbehind on a string of variable length, which (of course) will not work.

    #!/usr/bin/perl
    use strict;
    use warnings;

    while (<DATA>) {
        chomp;
        if (! /([smtwhfa])(?=.*?\1)/) {
            print "$_ : OK\n";
        } else {
            print "$_ : Not OK\n";
        }
    }
    __DATA__
    swma
    smqa
    smsa
    fhtm
    ttma
    t2ms
    __END__
    swma : OK
    smqa : OK
    smsa : Not OK
    fhtm : OK
    ttma : Not OK
    t2ms : OK
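One way around the variable-length lookbehind problem is to skip lookbehind entirely and run two separate tests: one anchored match that rejects any character outside the allowed set, and the original lookahead match that detects a repeated letter. The sketch below is only an illustration under that assumption; the helper name `check` is hypothetical and this is not necessarily the solution bart posted.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): a word is OK only if it is
# non-empty, uses allowed letters only, and repeats none of them.
sub check {
    my ($word) = @_;
    # Anchored test: every character must come from the allowed set.
    return "Not OK" unless $word =~ /\A[smtwhfa]+\z/;
    # Original test: some allowed letter occurs again later in the word.
    return "Not OK" if $word =~ /([smtwhfa])(?=.*?\1)/;
    return "OK";
}

print "$_ : ", check($_), "\n" for qw(swma smqa smsa fhtm ttma t2ms);
```

With the same test data as above, `smqa` and `t2ms` should now come out as Not OK, while `swma` and `fhtm` stay OK.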

    --
    Allolex

Re: Re: Regex (lookahead) Confusion
by Anonymous Monk on Feb 05, 2004 at 21:01 UTC

    Not quite what the OP wanted. The requirement was that the word be made up entirely of allowed characters. Try this:

    while (<DATA>) {
        chomp;
        if (! /([smtwhfa])(?=.*?\1)/) {
            print "$_ : OK\n";
        } else {
            print "$_ : Not OK\n";
        }
    }
    __DATA__
    smsa
    smta
    stmwhas

    BADsmtaEXAMPLE

    And the output

    smsa : Not OK
    smta : OK
    stmwhas : Not OK
     : OK
    BADsmtaEXAMPLE : OK

      Except that this appears to be exactly the same code as Roger used.

      --
      Allolex

        True. The point is that he didn't test it with the right data.

        I was saying that his code will give wrong results for the two lines I added, just to prove my assertion.
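The empty line and `BADsmtaEXAMPLE` pass the original test because the pattern is unanchored and only asks whether an allowed letter repeats; it never looks at the other characters. A possible single-pattern alternative, shown here as a sketch rather than as the answer given elsewhere in the thread, anchors the match and moves the duplicate check into a negative lookahead. The variable name `$valid` is an assumption made for illustration.

```perl
use strict;
use warnings;

# Assumed pattern, not quoted from any reply in this thread:
#   \A ... \z        anchor the test to the whole string
#   (?!.*(.).*\1)    fail if any character occurs twice
#   [smtwhfa]+       one or more allowed letters, nothing else
my $valid = qr/\A(?!.*(.).*\1)[smtwhfa]+\z/;

for my $word ('smsa', 'smta', 'stmwhas', '', 'BADsmtaEXAMPLE') {
    print "$word : ", ($word =~ $valid ? "OK" : "Not OK"), "\n";
}
```

Of these five inputs only `smta` should be reported as OK: the `+` rejects the empty string, and the anchored character class rejects the uppercase letters.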

Re: Re: Regex (lookahead) Confusion
by flyingmoose (Priest) on Feb 05, 2004 at 21:12 UTC
    Roger, can you please explain this a bit to us mere regex mortals? Thanks!
      if (! /([smtwhfa])(?=.*?\1)/) {
          |    |         |
          |    |         +----- followed by this character
          |    +--------------- find any of these characters
          +-------------------- but take the negative of the match
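To see the capture group and the backreference at work, it can help to print what `$1` holds when the pattern matches. This is a small sketch for illustration only; `first_repeat` is a hypothetical helper name, not something from the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: return the first allowed letter that the pattern
# finds again later in the word, or undef if there is no such letter.
sub first_repeat {
    my ($word) = @_;
    # ([smtwhfa]) captures one allowed letter into $1;
    # (?=.*?\1) then looks ahead for that same letter without consuming it.
    return $word =~ /([smtwhfa])(?=.*?\1)/ ? $1 : undef;
}

for my $word (qw(smsa smta stmwhas)) {
    my $dup = first_repeat($word);
    print "$word : ", (defined $dup ? "'$dup' occurs again later" : "no repeat"), "\n";
}
```

For `smsa` and `stmwhas` the captured letter is `s`; for `smta` nothing is captured, which is why the negated test prints OK.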

      bart's regex below is much better; see his reply and the follow-ups.