Re: Please help with Regexp::Common

You might try to trim the boundary assertions off of the stringized Regexp object (sorry for all the wrap-around):

c:\@Work\Perl\monks>perl -wMstrict -le
"use Regexp::Common;
 ;;
 print qq{$RE{profanity}};
 print qq{A: match '$1'} if 'xxxpissxxx' =~ m{ ($RE{profanity}) }xms;
 ;;
 print '--------';
 (my $erp = $RE{profanity}) =~ s{ \A \Q(?:\b\E (.*) \Q\b)\E \z }{$1}xms;
 print qq{'$erp'};
 ;;
 print qq{B: match '$1'} if 'xxxpissxxx' =~ m{ ($erp) }xms;
"
(?:\b(?:(?:piss(?:\ take|\-take|take|e(?:rs|[srd])|ing|y)?|quims?|shit
+(?:t(?:e(?:rs|[dr])|ing|y)|e(?
:rs|[sdry])|ing|[se])?|t(?:urds?|wats?)|wank(?:e(?:rs|[rd])|ing|s)?|a(
+?:rs(?:e(?:\ hole|\-hole|hole|
[sd])|ing|e)|ss(?:\ holes?|\-holes?|ed|holes?|ing))|b(?:ull(?:\ shit(?
+:t(?:e(?:rs|[dr])|ing)|s)?|\-s
hit(?:t(?:e(?:rs|[dr])|ing)|s)?|shit(?:t(?:e(?:rs|[dr])|ing)|s)?)|low(
+?:\ jobs?|\-jobs?|jobs?))|c(?:
ock(?:\ suck(?:ers?|ing)|\-suck(?:ers?|ing)|suck(?:ers?|ing))|rap(?:p(
+?:e(?:rs|[rd])|ing|y)|s)?|u(?:
nts?|m(?:ing|ming|s)))|dick(?:\ head|\-head|ed|head|ing|less|s)|f(?:uc
+k(?:ed|ing|s)?|art(?:e[rd]|ing
|[sy])?|eltch(?:e(?:rs|[rsd])|ing)?)|ha(?:rd[\-\ ]?on|lf(?:\ a[sr]|\-a
+[sr]|a[sr])sed)|m(?:other(?:\
fuck(?:ers?|ing)|\-fuck(?:ers?|ing)|fuck(?:ers?|ing))|uth(?:a(?:\ fuck
+(?:ers?|ing|[aaa])|\-fuck(?:er
s?|ing|[aaa])|fuck(?:ers?|ing|[aaa]))|er(?:\ fuck(?:ers?|ing)|\-fuck(?
+:ers?|ing)|fuck(?:ers?|ing)))|
erde?)))\b)
--------
'(?:(?:piss(?:\ take|\-take|take|e(?:rs|[srd])|ing|y)?|quims?|shit(?:t
+(?:e(?:rs|[dr])|ing|y)|e(?:rs|
[sdry])|ing|[se])?|t(?:urds?|wats?)|wank(?:e(?:rs|[rd])|ing|s)?|a(?:rs
+(?:e(?:\ hole|\-hole|hole|[sd]
)|ing|e)|ss(?:\ holes?|\-holes?|ed|holes?|ing))|b(?:ull(?:\ shit(?:t(?
+:e(?:rs|[dr])|ing)|s)?|\-shit(
?:t(?:e(?:rs|[dr])|ing)|s)?|shit(?:t(?:e(?:rs|[dr])|ing)|s)?)|low(?:\ 
+jobs?|\-jobs?|jobs?))|c(?:ock(
?:\ suck(?:ers?|ing)|\-suck(?:ers?|ing)|suck(?:ers?|ing))|rap(?:p(?:e(
+?:rs|[rd])|ing|y)|s)?|u(?:nts?
|m(?:ing|ming|s)))|dick(?:\ head|\-head|ed|head|ing|less|s)|f(?:uck(?:
+ed|ing|s)?|art(?:e[rd]|ing|[sy
])?|eltch(?:e(?:rs|[rsd])|ing)?)|ha(?:rd[\-\ ]?on|lf(?:\ a[sr]|\-a[sr]
+|a[sr])sed)|m(?:other(?:\ fuck
(?:ers?|ing)|\-fuck(?:ers?|ing)|fuck(?:ers?|ing))|uth(?:a(?:\ fuck(?:e
+rs?|ing|[aaa])|\-fuck(?:ers?|i
ng|[aaa])|fuck(?:ers?|ing|[aaa]))|er(?:\ fuck(?:ers?|ing)|\-fuck(?:ers
+?|ing)|fuck(?:ers?|ing)))|erde
?)))'
B: match 'piss'

Update: Of course, this gets you right back to the Scunthorpe Problem noted above by Paladin!
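To make the trade-off concrete, here is a minimal sketch of what the `\b` assertions buy you. The two-word pattern is a stand-in chosen for illustration (an assumption, not the list Regexp::Common actually uses), so the example runs with core Perl only:

```perl
use strict;
use warnings;

# Stand-in pattern: an assumption for illustration only,
# not the alternation that Regexp::Common generates.
my $core     = qr/(?:shit|piss)/;

# Same pattern wrapped in word-boundary assertions, which is
# roughly the shape of the stringized $RE{profanity} shown above.
my $anchored = qr/\b$core\b/;

for my $text ('Matsushita', 'a shit day') {
    printf "%-12s anchored=%-8s unanchored=%s\n", $text,
        ($text =~ $anchored ? 'match' : 'no match'),
        ($text =~ $core     ? 'match' : 'no match');
}
```

With the boundaries in place, the innocent 'Matsushita' is left alone; without them it is flagged, which is the Scunthorpe Problem in miniature.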

Give a man a fish: <%-{-{-{-<

Replies are listed 'Best First'.
Re^2: Please help with Regexp::Common by scorpio17 (Canon) on Jan 19, 2017 at 15:50 UTC
I followed your suggestion and tried this:

    use strict;
    use Regexp::Common;

    (my $reg = $RE{profanity}) =~ s{\A \Q(?:\b\E (.*) \Q\b)\E \z}{$1}xms;

    while ( my $word = <DATA> ) {
        chomp $word;
        if ( $word =~ m/$reg/ ) {
            print "Profanity detected: \"$word\"\n";
        }
        else {
            print "$word\n";
        }
    }

    __DATA__
    aaaabbbbcccc
    aaaashitcccc
    aaaa1234cccc
    ddddeeeeffff

This way it will find embedded "bad words" without the need for spaces around them, which is what I wanted.

I realize the logic in requiring the word boundaries. But I think the fact that $RE{num}{int} finds embedded numbers made me assume that $RE{profanity} should work the same way, or else there might be a switch to toggle the behavior one way or the other.

The reason I need this is to generate temporary (one-use) passwords (like when someone requests a password reset on a website). The generated password should, ideally, be a jumble of random letters and/or numbers, but I don't want to accidentally send someone a password with an "obvious" obscenity embedded, so a simple filter like this is helpful. Thanks!
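For the password use case described above, the filter could be wrapped around a generator along these lines. This is only a sketch under stated assumptions: `temp_password` and the character set are hypothetical names made up for illustration, and the built-in `rand` is not cryptographically strong, so a real reset flow would want a better random source.

```perl
use strict;
use warnings;
use Regexp::Common;

# Trim the \b assertions so embedded words are caught; die loudly
# if the stringized pattern no longer has the expected shape.
(my $reg = $RE{profanity}) =~ s{\A \Q(?:\b\E (.*) \Q\b)\E \z}{$1}xms
    or die 'profanity anchor trim failed';

# Assumed character set: lowercase letters and digits only.
my @chars = ('a' .. 'z', 0 .. 9);

# Hypothetical helper: keep drawing random candidates of the
# requested length until one does not match the trimmed pattern.
sub temp_password {
    my ($len) = @_;
    while (1) {
        my $pw = join '', map { $chars[ rand @chars ] } 1 .. $len;
        return $pw unless $pw =~ m/$reg/;
    }
}

print temp_password(12), "\n";
```

Rejecting and redrawing keeps the generator simple; since matches are rare in random strings, the loop almost always returns on the first pass.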
Re^3: Please help with Regexp::Common by AnomalousMonk (Archbishop) on Jan 19, 2017 at 18:11 UTC
You might consider adding a test to check if the expected alteration to the original regex was successful. The `\Q(?:\b\E` and `\Q\b)\E` parts of the substitution are rather fragile IMO and may break if the maintainer(s) of Regexp::Common ever change his/her/their notion of what a proper profane regex should look like.

    c:\@Work\Perl\monks>perl -wMstrict -le
    "use Regexp::Common;
     ;;
     (my $reg = $RE{profanity}) =~ s{\A \Q(?:\b\E (.*) \Q\b)\E \z}{$1}xms
         or die 'profanity anchor trim failed';
     ;;
     print qq{bad: '$1'} if 'Matsushita' =~ m{ ($reg) }xms;
    "
    bad: 'shit'

Give a man a fish: <%-{-{-{-<
Re^3: Please help with Regexp::Common by Mr. Muskrat (Canon) on Jan 19, 2017 at 16:14 UTC
Shouldn't you be generating passwords that do not contain any words?
Re^4: Please help with Regexp::Common by afoken (Chancellor) on Jan 19, 2017 at 18:38 UTC
"Shouldn't you be generating passwords that do not contain any words?"

https://xkcd.com/936/

Alexander
--
Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)
Re^5: Please help with Regexp::Common by Mr. Muskrat (Canon) on Jan 19, 2017 at 18:41 UTC
Re^6: Please help with Regexp::Common by Your Mother (Archbishop) on Jan 19, 2017 at 19:14 UTC
Re^4: Please help with Regexp::Common by AnomalousMonk (Archbishop) on Jan 19, 2017 at 18:30 UTC
"... passwords that do not contain any words ..."

Isn't that a bit like the CRM 114 Discriminator strategic communications security system, which for absolute top security was designed not to receive any messages...?

Give a man a fish: <%-{-{-{-<