Regexp for alphabetical order match within the string

prostoalex has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
Re: Regexp for alphabetical order match within the string by hardburn (Abbot) on Oct 30, 2003 at 18:56 UTC
Do you really need it to be a regex? `# $str defined elsewhere print "Is alphabetical\n" if( $str eq join '', sort split //, $str );` [download] ---- I wanted to explore how Perl's closures can be manipulated, and ended up creating an object system by accident. -- Schemer `: () { :\|:& };:` Note: All code is untested, unless otherwise stated	[reply] [d/l] [select]
Re: Re: Regexp for alphabetical order match within the string by sauoq (Abbot) on Oct 30, 2003 at 19:09 UTC
You can avoid the sort for an O(N) solution: `sub is_alphabetical { my @c = split //, shift; ord($c[$_]) >= ord($c[$_ - 1]) or return 0 for 1 .. $#c; return 1; }` [download] Update: Yes, my use of `ord()` is unnecessary. Thanks tlhf++. Edit: Added a missing `my`. -sauoq "My two cents aren't worth a dime.";	[reply] [d/l] [select]
Re: Re: Re: Regexp for alphabetical order match within the string by tlhf (Scribe) on Oct 30, 2003 at 19:45 UTC
Elegent solution, tho the `ord`s are needless. `sub is_alphabetical { my @c = split //, shift; $c[$_] ge $c[$_-1] or return 0 for 1..$#c; return 1; }` [download] tlhf (Everyone forgets about ge and le ^_^)	[reply] [d/l] [select]
Re: Re: Re: Regexp for alphabetical order match within the string by pg (Canon) on Oct 30, 2003 at 20:37 UTC
++, but the original post obviously allows non-alphabetical chars in the string, and their orders are not considered. A small improvement to meet the original requirement: `$str = "aBcD12ef7812g"; print is_alphabetical($str); sub is_alphabetical { (my $str = lc(shift)) =~ s/([^a-z])//g; my @c = split //, $str; $c[$_] ge $c[$_ - 1] or return 0 for 1 .. $#c; return 1; }` [download]	[reply] [d/l]
Re:x4 Regexp for alphabetical order match within the string by Roy Johnson (Monsignor) on Oct 30, 2003 at 21:32 UTC
Re: Re: Re: Re: Regexp for alphabetical order match within the string by sauoq (Abbot) on Oct 30, 2003 at 21:00 UTC
Re: Re: Regexp for alphabetical order match within the string by asarih (Hermit) on Oct 30, 2003 at 19:37 UTC
But this won't take case into consideration, would it? "Ba" is not really alphabetical in my book.	[reply]
Re: Re: Re: Regexp for alphabetical order match within the string by tlhf (Scribe) on Oct 30, 2003 at 19:56 UTC
For case-insensitivity just throw an `lc` before the `shift`.	[reply] [d/l] [select]
Re: Regexp for alphabetical order match within the string by sgifford (Prior) on Oct 30, 2003 at 19:53 UTC
How about `! /(.).(??{"[^$1-z]"})/ix`? #!/usr/bin/perl -w use strict; if ($ARGV[0] !~ /(.).(??{"[^$1-z]"})/ix) { print "alpha\n"; } else { print "non-alpha\n"; } __END__ [sgifford@sglaptop sgifford]$ perl /tmp/t4 abcxz alpha [sgifford@sglaptop sgifford]$ perl /tmp/t4 abcda non-alpha [sgifford@sglaptop sgifford]$ perl /tmp/t4 aaaaaaab alpha [sgifford@sglaptop sgifford]$ perl /tmp/t4 aaaaaaabcccccccz alpha [sgifford@sglaptop sgifford]$ perl /tmp/t4 aaaaaaabcccccdccz non-alpha [sgifford@sglaptop sgifford]$ perl /tmp/t4 aaaaaaaaaaaaa alpha [sgifford@sglaptop sgifford]$ perl /tmp/t4 abcdefghijklmnopqrstuvwxyz alpha [download]	[reply] [d/l] [select]
Re: Re: Regexp for alphbetical order match within the string by Roy Johnson (Monsignor) on Oct 30, 2003 at 21:09 UTC
Very clever, -10. (I got that on an assignment once.) Surprisingly, your expression is not case-insensitive, despite the /i switch. That is because you can end up with character classes like `[B-z]`. One possible remedy: `/(.)(??{"[^lc($1)-z]"})/`. (Also note that you don't need the .). The other thing about your regex is that it is about 30 times* slower than the other recommendations. Don't even try to benchmark it for more than a few thousand iterations. But it is cool. The other solutions are all comparable to each other for performance. My benchmarking code attached. Read more... (2 kB)	[reply] [d/l] [select]
Re: Re: Re: Regexp for alphbetical order match within the string by BrowserUk (Patriarch) on Oct 30, 2003 at 22:33 UTC
Just for grins, this tests about 50% quicker than the substr version in your benchmark. `sub is_sorted{ my( $str, $p, $x ) = ( lc shift, 0 ); chop$x lt $x and return 0 while $x = substr $str, $p++, 2; 1; };` [download] Examine what is said, not who speaks. "Efficiency is intelligent laziness." -David Dunham "Think for yourself!" - Abigail Hooray!	[reply] [d/l]
Re: Regexp for alphbetical order match within the string by Roy Johnson (Monsignor) on Oct 31, 2003 at 14:47 UTC
Re: Re: Regexp for alphabetical order match within the string by Nkuvu (Priest) on Oct 30, 2003 at 20:21 UTC
Urg. Could you explain that? I'm not following the `??{"[^$1-z]"})` bit.	[reply] [d/l]
Re: Re: Re: Regexp for alphabetical order match within the string by sgifford (Prior) on Oct 30, 2003 at 20:43 UTC
OK, here's what perlre(1) says about `(??{})`: ""(??{ code })"" WARNING: This extended regular expression fea- ture is considered highly experimental, and may be changed or deleted without notice. A simpli- fied version of the syntax may be introduced for commonly used idioms. This is a "postponed" regular subexpression. The "code" is evaluated at run time, at the moment this subexpression may match. The result of evaluation is considered as a regular expres- sion and matched as if it were inserted instead of this construct. So the RE first matches any character `$1`, followed by zero or more of any character. Then it evaluates the code in the `(??{})`. This code evaluates to a character class for a character outside of the range `[$1-z]`---a character earlier in the alphabet. If you change the code to print what it's trying, it's clearer what's going on: `if ($ARGV[0] !~ /(.).*(??{print "Searching for [^$1-z] starting at ",p +os($ARGV[0]),"\n"; "[^$1-z]"})/ix)` [download] produces `[sgifford@sglaptop sgifford]$ perl /tmp/t4 abcd Searching for [^a-z] starting at 4 Searching for [^a-z] starting at 3 Searching for [^a-z] starting at 2 Searching for [^a-z] starting at 1 Searching for [^b-z] starting at 4 Searching for [^b-z] starting at 3 Searching for [^b-z] starting at 2 Searching for [^c-z] starting at 4 Searching for [^c-z] starting at 3 Searching for [^d-z] starting at 4 alpha` [download] Other solutions are more efficient, but the OP asked for an RE. :-)	[reply] [d/l] [select]
Re: Re: Re: Re: Regexp for alphabetical order match within the string by Nkuvu (Priest) on Oct 30, 2003 at 22:01 UTC
Re: Regexp for alphabetical order match within the string by runrig (Abbot) on Oct 30, 2003 at 19:01 UTC
You might want to at least make that easier to generate, and anchor the regex (otherwise you'll always get a match): `my $str = join '', map "$_*", "a".."z"; my $re = qr/^$str$/; print "Matches\n" if "abc" =~ $re; print "Doesn't match\n" unless "zbc" =~ $re;` [download]	[reply] [d/l]
Re: Re: Regexp for alphabetical order match within the string by CountZero (Bishop) on Oct 30, 2003 at 19:30 UTC
And perhaps make the regex case-insensitive. CountZero "If you have four groups working on a compiler, you'll get a 4-pass compiler." - Conway's Law	[reply]
Re: Regexp for alphabetical order match within the string by duff (Parson) on Oct 30, 2003 at 20:07 UTC
Just to be different, here's a solution that doesn't use REs or split(): `sub isalphabetical { for my $i (0..(length($_[0])-2)) { return 0 if lc(substr($_[0],$i,1)) gt lc(substr($_[0],$i+1,1)); } return 1; }` [download]	[reply] [d/l]
Re: Re: Regexp for alphabetical order match within the string by aquarium (Curate) on Oct 31, 2003 at 05:46 UTC
my (off the top of my head) comment is: don't use regex...convert each character into ASCI or your "true" alphabetically mapped sequence number (eg. aA->1 etc.) and loop through the string, keeping the highest number seen in a variable. As soon as you hit a lower value, exit the loop and print "not alphabetical". This should beat regexp for longer/complex strings, and you can make your own map of exactly what "alphabetical" is (in Chinese if you like). The solution ends up more portable, flexible (not tied to regex syntax), and runs faster. Sorry, but am about to disembark and go home, so i'll post some code later.	[reply]
Re: Regexp for alphabetical order match within the string by Abigail-II (Bishop) on Oct 31, 2003 at 10:21 UTC
You sound as if you think that `/abc...yz/` [download] isn't an acceptable solution. I think that once you have placed anchors, it's a very fine solution. The regex might be long, it's simple*. There are no alternatives. There's no backreferencing. There are no delayed regexes. There will be no backtracking for matching strings, and only minimal backtracking in case of failures. Remember that the less alternatives you give the regexp engine, the faster it (usually) be. Abigail	[reply] [d/l]
Re: Re: Regexp for alphabetical order match within the string by jweed (Chaplain) on Oct 31, 2003 at 21:24 UTC
I would suggest that you enclose the whole thing in the non-backtracking setup (?>pattern). Since if it doesn't match at first it won't match at all, you can speed up failure by enclosing the whole thing in (?> and ). `/^(?>abc...yz*)$/` [download]	[reply] [d/l]


"be consistent"
	PerlMonks