Another regex to solve ...

pat_mc has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
Re: Another regex to solve ... by Not_a_Number (Prior) on Aug 18, 2011 at 17:58 UTC
`my @tests = qw/ p up hip hop hoop heap help hops /; for ( @tests ) { say if reverse =~ /^p([aeiou])(?!\1)/; }`
Re^2: Another regex to solve ... by pat_mc (Pilgrim) on Aug 19, 2011 at 19:44 UTC
Gooooood thinking, Not_a_Number!!!! I like this one ... reversing the string before matching it ... how neat is that? Excellent! Goooood on ya! And thanks for the idea. This is coooooool stuff! And who ever said 'regex golfing was a waste of time'? Far from it ... it's like Perl philosophy ... pure aesthetics ... sheer bliss. Thanks again for this neat little twist!
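For anyone puzzling over the bare `reverse` in that one-liner: with no arguments in scalar context, `reverse` reverses `$_`, so the pattern is applied to the word written backwards. The sketch below spells the same idea out step by step. The helper name `ends_in_undoubled_vowel_p` is made up for illustration and does not appear in the thread.

```perl
use strict;
use warnings;
use feature 'say';

# Hypothetical helper (name invented for this sketch).
sub ends_in_undoubled_vowel_p {
    my ($word) = @_;

    # scalar reverse turns "hoop" into "pooh".
    my $reversed = scalar reverse $word;

    # In the reversed string the final "p" comes first and the vowel
    # second; the negative lookahead then rejects a repeat of that
    # same vowel in third position.
    return $reversed =~ /^p([aeiou])(?!\1)/ ? 1 : 0;
}

say for grep { ends_in_undoubled_vowel_p($_) }
    qw/ p up hip hop hoop heap help hops /;
```

Run against the same test words, this prints `up`, `hip`, `hop` and `heap`, one per line. Reversing turns what would be a look-behind with a backreference into an ordinary lookahead, which is the part Perl's regex engine handles without complaint.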
Re: Another regex to solve ... by toolic (Bishop) on Aug 18, 2011 at 16:35 UTC
Not the most elegant:

```perl
use warnings;
use strict;

while (<DATA>) {
    chomp;
    if (/[^aeiou][aeiou]p$/i) {
        print "$_ match\n";
    }
    else {
        print "$_ no match\n";
    }
}
__DATA__
carp
step
tip
stoop
steep
asp
food
mop
up
```

prints:

```
carp no match
step match
tip match
stoop no match
steep no match
asp no match
food no match
mop match
up no match
```

As you can see, the 2-letter `up` does not match. Is that ok?

UPDATE: I like AR's solution better because it does match for `up`.
Re^2: Another regex to solve ... by pat_mc (Pilgrim) on Aug 18, 2011 at 16:41 UTC
Yeah, sorry ... same thing: by 'double vowel' I meant the same vowel occurring twice ...
Re: Another regex to solve ... by AR (Friar) on Aug 18, 2011 at 16:34 UTC
If you define vowel as one of 'a', 'e', 'i', 'o' and 'u' (because I know some other monks will point out that I'm being anglocentric :)), a regex that fits your criteria is: `/(?<![aeiou])[aeiou]p\b/` Please adjust for case sensitivity as necessary. Edit: Did you mean two vowels in a row or the same vowel twice before the 'p'? The regex above solves the former, but not the latter.
Re^2: Another regex to solve ... by pat_mc (Pilgrim) on Aug 18, 2011 at 16:38 UTC
Hi, AR - Thanks for your proposal ... but that's not precisely what I wanted ... I do want words like 'feap' to pass ... only words in which the same vowel is duplicated before the 'p' should be filtered out. Sorry for not making this perfectly clear right from the start. Any alternative suggestions from your side then? Cheers - Pat
Re^3: Another regex to solve ... by AR (Friar) on Aug 18, 2011 at 16:46 UTC
`/((\b|[^aeiou])[aeiou]|[eiou]a|[aiou]e|[aeou]i|[aeiu]o|[aeio]u)p\b/` but it's not even remotely elegant. I'll keep working on it.
Re^4: Another regex to solve ... by pat_mc (Pilgrim) on Aug 18, 2011 at 16:50 UTC
Re^5: Another regex to solve ... by AR (Friar) on Aug 18, 2011 at 16:58 UTC
Re: Another regex to solve ... (\2) by tye (Sage) on Aug 18, 2011 at 16:59 UTC
```perl
local $_= "Stop tip stoop put up, tops steep 'creap' sleep soap!";
my @words;
push @words, $1
#   while /(\b\w*(?!(.)\2)\w[aeiou]p\b)/g;
    while /(\b(?:\w*(?!(.)\2)\w)?[aeiou]p\b)/g;
print "( @words )\n";
__END__
( Stop tip up creap soap )
```

Update: Changed . to \w so it would not match, for example, "no-op" (if you want '-' allowed in words, then replace \w with, for example, `[-\w]` in both places). Then: added (?:...)? to match two-letter words (since my original attempt that handled two-letter words fails because `(?<!\2)` is not smart enough to realize the fixed length of \2).

- tye
Re^2: Another regex to solve ... (more \2) by tye (Sage) on Aug 18, 2011 at 19:26 UTC
Here are some other ways to do it, including one that doesn't work and one that almost works...

```perl
local $_= "Up stop 'Oop' tip stoop put\nup, tops steep 'creap' sleep soap!";
my @words;
push @words, $1
# Misses "up" if first word in string:
#   while /(\b\w*(?<=(.))(?!\2)[aeiou]p\b)/gi;
# Would work if (?<=...|...) were smarter:
#   while /(\b\w*(?<=^|(.)(?!\2))[aeiou]p\b)/gsi;
# How to work around (?<=...|...) being dumb:
#   while /(\b\w*(?:(?<=^)|(?<=(.)(?!\2)))[aeiou]p\b)/gsi;
# (?<=^) can be shortened to just ^:
    while /(\b\w*(?:^|(?<=(.)(?!\2)))[aeiou]p\b)/gsi;
# Or just skip the complex check for 2-letter words:
#   while /(\b(?:\w*(?<=(.))(?!\2))?[aeiou]p\b)/gsi;
print "( @words )\n";
__END__
( Up stop tip up creap soap )
```

- tye
Re: Another regex to solve ... by johngg (Canon) on Aug 18, 2011 at 17:49 UTC
I've not done much testing but this seems to work by putting the capture in the look-behind.

```
knoppix@Microknoppix:~$ perl -E '
> for ( qw{ soap creep top groat loop } )
> {
>     say unless m{(?<=([aeiou]))\1p\z};
> }'
soap
top
groat
knoppix@Microknoppix:~$
```

I hope this is helpful.

Cheers, JohnGG
Re^2: Another regex to solve ... by Not_a_Number (Prior) on Aug 18, 2011 at 23:34 UTC
`say unless m{(?<=([aeiou]))\1p\z};` That allows eg 'help' or 'slurp'. Update: I've no idea why though, or why it also allows 'hops' and 'hoops'... Update 2: Got it! It also of course allows 'rabbit', or any other string that doesn't end in 'p' preceded by whatever. It was the `unless` that momentarily confused me. :)
Re^3: Another regex to solve ... by johngg (Canon) on Aug 19, 2011 at 17:38 UTC
Yes, that was a pretty woeful attempt on my part. I must have been thinking about a sub-set of words all ending with 'p' rather than the general case :-( Cheers, JohnGG
Re^2: Another regex to solve ... by pat_mc (Pilgrim) on Aug 19, 2011 at 19:35 UTC
Hi, johngg - I like the approach of back-referencing the match from the look-behind ... the only issue I have with the code you propose is that it over-generates, in the sense that it will also pass all other strings that do not match the regex, like 'stp' ... and that, of course, it shouldn't, since we only want words to pass that have a non-doubled vowel in front of the word-terminal 'p'.
Re: Another regex to solve ... by locked_user sundialsvc4 (Abbot) on Aug 18, 2011 at 17:03 UTC
Can you solve your problem, sufficiently well, using a combination of regexes and procedural code? One regular expression could, for example, locate all source lines in the document which contain “a vowel that is not immediately followed by another vowel,” leading to an `if` statement in which the matching lines are further examined by whatever means seem appropriate. Sure, “regex golf” is instructive. It can even be entertaining. But it can also devolve into a waste of time...
Re^2: Another regex to solve ... by pat_mc (Pilgrim) on Aug 19, 2011 at 19:05 UTC
The answer would be "Yes" on all counts :-)
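One possible reading of the "regexes plus procedural code" suggestion, as a rough sketch only: test the two conditions separately instead of packing them into a single pattern. The helper name `passes_filter` and the word list are assumptions made up for this example (the words are borrowed from tye's test string, plus 'stp').

```perl
use strict;
use warnings;
use feature 'say';

# Hypothetical helper; the name and the two-step split are illustrative.
sub passes_filter {
    my ($word) = @_;

    # Step 1: the word must end in a vowel followed by "p".
    return 0 unless $word =~ /[aeiou]p\z/i;

    # Step 2: reject it when the same vowel is doubled before the "p".
    return 0 if $word =~ /([aeiou])\1p\z/i;

    return 1;
}

my @words = qw/ Stop tip stoop put up tops steep creap sleep soap stp /;
say join ' ', grep { passes_filter($_) } @words;
```

This prints `Stop tip up creap soap`. It is less fun than the single-regex answers, but each check reads on its own, and words like 'stp' are rejected by the first step rather than slipping through an `unless`.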