element22 has asked for the wisdom of the Perl Monks concerning the following question:

I'm using Perl, but the issue is about regex. I want to match a word stem, on word boundaries, whether the word is just the stem (no suffix) or the stem plus an optional suffix from this group: s|'s|n . The second element is an apostrophe and an "s"; I tried the apostrophe both alone and with a backslash. This is what I have, but it doesn't match:

$string=~/\bwordstem(s|\'s|n)?\b/i;

So, given a word stem like "Russia", I want to match Russia, Russia's, Russias, Russian. The "wordstem" in the regex is the placeholder for the word stem ("Russia").

Re: Regex: match a word stem plus an optional suffix from a group
by LanX (Saint) on Jul 21, 2022 at 01:55 UTC
    Maybe I'm misunderstanding the problem ...

    ... but this works for me, if your "placeholder" is a variable starting with $ .

    > perl -de0
    ...
    DB<1> @a = qw(Russia Russia's Russias Russian)
    DB<2> $wordstem = "Russia"
    DB<3> x grep { /\bwordstem(s|\'s|n)?\b/i } @a
      empty array
    DB<4> x grep { /\b$wordstem(s|\'s|n)?\b/i } @a
    0  'Russia'
    1  'Russia\'s'
    2  'Russias'
    3  'Russian'

    Cheers Rolf
    (addicted to the Perl Programming Language :)

      Thanks, I found the typo: in my regex, the last suffix was "an" not just "n", so it didn't match "Russian". Now it does. Yes, my variable starts with a $. Now I also put it in squiggly brackets for clarity ${wordstem}.
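To illustrate why the "an" typo prevented the match: with the stem "Russia", a suffix of "an" requires the text "Russiaan", so "Russian" fails. This is only a minimal sketch; the variable names `$typo` and `$fixed` are made up for the illustration and are not from the original code.

```perl
use strict;
use warnings;

my $wordstem = 'Russia';

# With the typo, the stem must be followed by "an", i.e. "Russiaan".
my $typo  = qr/\b${wordstem}(s|'s|an)?\b/i;
# Suffix group as corrected in the thread.
my $fixed = qr/\b${wordstem}(s|'s|n)?\b/i;

for my $word (qw(Russian Russiaan)) {
    printf "%-9s typo: %-8s fixed: %s\n", $word,
        ($word =~ $typo  ? 'match' : 'no match'),
        ($word =~ $fixed ? 'match' : 'no match');
}
```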

        You should also escape interpolated stuff in regexen: \Q${wordstem}\E in this case. If you don’t, any regex metacharacters in the variable are interpreted as pattern syntax, so you can end up with really confusing bugs and, depending on Perl version, a malicious pattern that amounts to a DoS attack. I would also encourage you to use /x to improve readability. Something like:

        / \b \Q${stem}\E (?: s | 's | n )? \b /xi
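As a rough sketch of how that pattern might be used in a script, the example below wraps it in a helper and runs it against a few words. The helper name `stem_regex` and the test words are assumptions made for illustration, not part of the thread. The second half shows what \Q...\E buys you when the stem contains a metacharacter such as a dot.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): build the stem-plus-suffix regex.
# \Q...\E quotes metacharacters in $stem; /x allows the spacing; (?: ) avoids a capture.
sub stem_regex {
    my ($stem) = @_;
    return qr/ \b \Q$stem\E (?: s | 's | n )? \b /xi;
}

my $re   = stem_regex('Russia');
my @hits = grep { /$re/ } ("Russia", "Russia's", "Russias", "Russian", "Prussia", "Russians");
print "$_\n" for @hits;

# A stem with a metacharacter: without \Q...\E the dot would match any character.
my $dotted = stem_regex('U.S');
print "U.S matches\n" if 'U.S' =~ $dotted;
print "UXS matches\n" if 'UXS' =~ $dotted;
```

Here "Prussia" and "Russians" are rejected by the \b anchors, and "UXS" is rejected only because the dot in the stem was quoted.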