Re: Re: Re: A regex that does this, but not that?

any words that start with "t", end with "t", but do not contain any other "t"s within

OK so that's \bt[^t]+t\b -- word-boundary, then a t, then one or more other characters not a t, then a t, then a word boundary.

Apart from the abbreviation "tt", which can't match because [^t]+ needs at least one character between the two t's, this should be fine.

So "tent", "tesseract", "tot", "tort" and "test" itself will match this pattern.

However, "testament" will fail it because of the "t" in the middle.
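A quick way to check those claims is to run the pattern over the words mentioned so far. This is only a sketch, and the helper name `matches_t_word` is made up for illustration:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: true if the string contains a word made of
# t, one or more non-t characters, then t.
sub matches_t_word {
    my ($word) = @_;
    return $word =~ /\bt[^t]+t\b/ ? 1 : 0;
}

# Report each sample word against the pattern.
for my $word (qw(tent tesseract tot tort test testament tt)) {
    printf "%-10s %s\n", $word, matches_t_word($word) ? 'matches' : 'no match';
}
```

Run as written, the first five words should report a match, while "testament" and "tt" should not.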

Then you need a special case for "test" itself, which you can do with the /e modifier and the ternary operator, as in pg's example above.

So something like this:


#!/usr/bin/perl -w
use strict;

my $words='test Buffy testament Anya tot Willow tesseract Faith tent';
$words =~ s/\b(t[^t]+t)\b/$1 eq "test" ? $1 : ''/ge;
print $words;
# prints 'test Buffy testament Anya  Willow  Faith ' (note the trailing space)

Where the regex means "Find words matching t, something-not-t, then t at the end. Replace them with nothing, unless they're the word test, in which case, replace them with themselves".

You could replace the ternary thing with this more longwinded version if you liked:

$words =~ s/\b(t[^t]+t)\b/
           my $temp = $1;
           if($temp eq 'test'){
             $temp
           }else{
             ''
           }/xge;

($_='kkvvttuubbooppuuiiffssqqffssmmiibbddllffss')
 =~y~b-v~a-z~s;                             print

Re: Re: Re: Re: A regex that does this, but not that? by Anonymous Monk on Nov 15, 2003 at 06:38 UTC
Your character class `[^t]` can itself cross word boundaries, so a string like "this will be a problem right?" will be a problem, right?
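To see the problem concretely, here is a small sketch that runs the substitution from the parent post on that sentence. The sub name `strip_t_words` is hypothetical, used only for illustration:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical wrapper around the substitution from the parent post.
sub strip_t_words {
    my ($text) = @_;
    # [^t]+ also matches spaces, so a single match can span several words.
    $text =~ s/\b(t[^t]+t)\b/$1 eq "test" ? $1 : ''/ge;
    return $text;
}

print strip_t_words('this will be a problem right?'), "\n";
```

Everything from "this" through "right" is consumed as one match, so only the question mark is left.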
Re: Re: Re: Re: Re: A regex that does this, but not that? by Cody Pendant (Prior) on Nov 15, 2003 at 07:32 UTC
Quite right, very true, didn't think of that. What if we add a not-whitespace-either to the character class?

#!/usr/bin/perl -w
use strict;

my $words='test Buffy testament Anya tot Willow tesseract Faith tent this is a problem right?';
$words =~ s/\b(t[^t\s]+t)\b/$1 eq "test" ? $1 : ''/ge;
print $words;
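A sketch of how the whitespace-excluding class behaves, plus one possible tightening that is an assumption and not something proposed in this thread: `[^t\s]` still lets punctuation through, so a hyphenated string such as "the-cat" is treated as a single word. Restricting the middle to word characters other than t, with `[^\Wt]`, is one way to close that gap. Both sub names are hypothetical:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical wrapper: the substitution with whitespace excluded from the class.
sub strip_no_space {
    my ($text) = @_;
    $text =~ s/\b(t[^t\s]+t)\b/$1 eq "test" ? $1 : ''/ge;
    return $text;
}

# Assumption, not from the thread: allow only word characters other than t
# between the two t's, so punctuation cannot be swallowed either.
sub strip_word_chars {
    my ($text) = @_;
    $text =~ s/\b(t[^\Wt]+t)\b/$1 eq "test" ? $1 : ''/ge;
    return $text;
}

my $sentence = 'tot this is a problem right? the-cat';
print strip_no_space($sentence), "\n";    # also removes "the-cat"
print strip_word_chars($sentence), "\n";  # leaves "the-cat" alone
```

Whether "the-cat" counts as one word or two depends on what the original poster meant by "word", so treat the second version as an option and not a correction.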