in reply to Re: Re: A regex that does this, but not that?
in thread A regex that does this, but not that?

any words that start with "t", end with "t", but do not contain any other "t"s within

OK so that's \bt[^t]+t\b -- word-boundary, then a t, then one or more other characters not a t, then a t, then a word boundary.

Apart from the abbreviation "tt", which it skips because [^t]+ needs at least one character between the two t's, this should be fine.

So "tent", "tesseract", "tot", "tort" and "test" itself will match this pattern.

However, "testament" will fail it because of the "t" in the middle.
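A quick way to check those claims one word at a time is sketched below. The helper name matches_pattern is made up for illustration, and each word is tested on its own, so this says nothing yet about how the pattern behaves on a whole sentence.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# matches_pattern is a hypothetical helper, not part of the original post.
# It returns 1 if the string contains a t...t word with no "t" in between.
sub matches_pattern {
    my ($word) = @_;
    return $word =~ /\bt[^t]+t\b/ ? 1 : 0;
}

# Report the result for each sample word, including the two expected failures.
for my $word (qw(tent tesseract tot tort test testament tt)) {
    printf "%-10s %s\n", $word, matches_pattern($word) ? 'matches' : 'no match';
}
```

Only "testament" and "tt" should come back as 'no match'.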

Then you need a special case for "test" itself, which you can do with the /e modifier and the ternary operator, as in pg's example above.

So something like this:

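#!/usr/bin/perl -w
use strict;

my $words = 'test Buffy testament Anya tot Willow tesseract Faith tent';

# /g replaces every match; /e evaluates the replacement as Perl code
$words =~ s/\b(t[^t]+t)\b/$1 eq "test" ? $1 : ''/ge;

print $words;
# prints 'test Buffy testament Anya  Willow  Faith '
# (the spaces on either side of each removed word are left behind)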

Where the regex means "Find words matching t, something-not-t, then t at the end. Replace them with nothing, unless they're the word test, in which case, replace them with themselves".

You could replace the ternary thing with this more longwinded version if you liked:

$words =~ s/\b(t[^t]+t)\b/
    my $temp = $1;
    if ($temp eq 'test') { $temp } else { '' }
/xge;


($_='kkvvttuubbooppuuiiffssqqffssmmiibbddllffss') =~y~b-v~a-z~s; print

Re: Re: Re: Re: A regex that does this, but not that?
by Anonymous Monk on Nov 15, 2003 at 06:38 UTC

    Your character class of [^t] can itself cross word boundaries so that strings like: "this will be a problem right?" will be a problem, right?
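A small sketch that shows the problem on that exact string. The helper name first_match is invented for this example; the pattern is the one from the parent post.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# first_match is a hypothetical helper: it returns the first substring the
# pattern captures, or undef if nothing matches.
sub first_match {
    my ($text) = @_;
    return $text =~ /\b(t[^t]+t)\b/ ? $1 : undef;
}

my $text = 'this will be a problem right?';
my $hit  = first_match($text);

# [^t] happily matches spaces, so the capture runs from "this" to "right".
print defined $hit ? "matched: '$hit'\n" : "no match\n";
# prints: matched: 'this will be a problem right'
```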

      Quite right, very true, didn't think of that.

      What if we add a not-whitespace-either to the character class?

      #!/usr/bin/perl -w
      use strict;

      my $words = 'test Buffy testament Anya tot Willow tesseract Faith tent this is a problem right?';

      # [^t\s] stops the match from running across whitespace
      $words =~ s/\b(t[^t\s]+t)\b/$1 eq "test" ? $1 : ''/ge;

      print $words;
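One thing [^t\s] still lets through is punctuation, so something like "t-shirt" would be removed as well. Whether that counts as a word is a judgment call. The sketch below compares the class above with a stricter one, [^\Wt], which allows only word characters other than "t" between the two t's. The helper name strip_t_words and the stricter pattern are assumptions added for illustration, not something from the thread.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# strip_t_words is a hypothetical helper: it deletes every captured t...t
# word except "test", using whichever compiled pattern it is handed.
sub strip_t_words {
    my ($text, $pattern) = @_;
    $text =~ s/$pattern/$1 eq 'test' ? $1 : ''/ge;
    return $text;
}

my $no_space  = qr/\b(t[^t\s]+t)\b/;   # the class suggested above
my $word_only = qr/\b(t[^\Wt]+t)\b/;   # assumption: word characters only

# The whitespace fix leaves the problem sentence untouched.
print strip_t_words('this is a problem right?', $no_space), "\n";

# The two classes disagree about a hyphenated "word".
print strip_t_words('a t-shirt and a tent', $no_space), "\n";
print strip_t_words('a t-shirt and a tent', $word_only), "\n";
```

With $no_space both "t-shirt" and "tent" disappear; with $word_only only "tent" does. Pick whichever matches your idea of a word.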

