Re: about style: use possessive or atomic?
by LanX (Saint) on Aug 16, 2015 at 11:01 UTC
Isn't the second one a redundant way of writing
/($match)+/x
?
Update:
No, it's not; see my code further down.
Update (thread overview):
TL;DR: after some confusion we identified a bug that was introduced with Perl 5.20.
See Re^8: about style: use possessive or atomic? (BUG!!!)
| [reply] [d/l] |
+ Match 1 or more times
{n}+ Match exactly n times and give nothing back (redundant)
I take the "redundant" annotation to mean that it is the same as:
+? Match 1 or more times, not greedily
Though "not greedily" seems to conflict with "or more"?
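To make the three quantifier flavours concrete, here is a minimal sketch (the test string and variable names are made up for illustration). It suggests that "not greedily" only changes how many repetitions are tried first, while the possessive form refuses to give anything back:

```perl
use strict;
use warnings;

my $text = 'aaa';

# Greedy: a+ first takes all three characters, then gives one back
# so that the trailing literal "a" can still match.
my ($greedy) = $text =~ /\A(a+)a/;

# Lazy: a+? still means "1 or more", but it starts with the minimum (one).
my ($lazy) = $text =~ /\A(a+?)a/;

# Possessive: a++ takes all three and gives nothing back,
# so the trailing literal "a" has nothing left to match.
my $possessive = $text =~ /\Aa++a/ ? 'match' : 'no match';

print "greedy captured:  $greedy\n";      # aa
print "lazy captured:    $lazy\n";        # a
print "possessive:       $possessive\n";  # no match
```

So "+?" and "{n}+" are different things: the first is a lazy quantifier, the second a possessive one.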
| [reply] [d/l] [select] |
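use strict;
use warnings;

# First DATA line is the sub-pattern, second is the string to test.
chomp( my $match = <DATA> );
chomp( $_ = <DATA> );
for my $regex (
    "($match){1}+",   # possessive: exactly once, give nothing back
    "($match)+",      # greedy: one or more times
    "($match)+?",     # lazy: one or more times, as few as possible
){
    # !! yields an empty string or 1, which indexes into (FAIL, SUCCESS)
    print qw(FAIL SUCCESS)[ !! m/\A $regex \z/x ], "\n"
}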
__DATA__
(a|b){2}
bbab
__OUTPUT__
FAIL
SUCCESS
SUCCESS
Is it true that ($match){1}+ is equal to simply $match?
| [reply] [d/l] |
I just overlooked that the + is a possessive quantifier in this combination.
But it's still redundant, as the docs you cited say: backtracking wouldn't change without the plus. [1]
(as far as I understand the docs with my pre-coffee brain and without the possibility to test on my mobile)
[1] No, that's not true; see the discussion further down!
| [reply] |
Re: about style: use possessive or atomic?
by anonymized user 468275 (Curate) on Aug 16, 2015 at 17:39 UTC
| [reply] |
> something magical and unexpected,
Do you know what an atomic sub-regex is?
In the long run, understanding backtracking is crucial when handling composed expressions.
Especially when your scripts magically and unexpectedly refuse to terminate...
But I agree that the OP might have provided more context.
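For readers who haven't met atomic groups, a minimal sketch (the pattern and input are invented for illustration, not taken from the OP): once an atomic group (?>...) has matched, the engine will not backtrack into it to try another alternative.

```perl
use strict;
use warnings;

my $input = 'abc';

# Ordinary group: "a" is tried first, "c" then fails against "b",
# so the engine backtracks into the group and tries "ab" instead.
my $plain = $input =~ /\A(?:a|ab)c\z/ ? 'match' : 'no match';

# Atomic group: the first successful alternative ("a") is locked in,
# so there is no way to retry with "ab" and the overall match fails.
my $atomic = $input =~ /\A(?>a|ab)c\z/ ? 'match' : 'no match';

print "plain group:  $plain\n";   # match
print "atomic group: $atomic\n";  # no match
```

The same locking is what keeps composed expressions from exploring huge numbers of useless alternatives when a later part of the pattern fails.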
| [reply] |
OK, I see your point: there always needs to be a mechanism to control backtracking. And I have to confess some bias about that approach. At some point I started to favour designing lexers and parsers over trying to make the regex engine do too much for me. I'm not sure why, other than the idea of being able to test atoms of code rather than send multiple atoms to something I can't easily test at the atomic level.

However, 'horses for courses'. Simplified use of regexes might be slower in execution than getting the most out of the regex engine, but I think there is one distinct advantage to my own approach: simpler regexes are more maintainable, and parsers do need to be maintained.

Often it is easier to explain theory using an extreme example: imagine the task is to read in any XML that might be thrown at you and convert it into a tree in Perl (CPAN not allowed). Programmer A is committed to making the regex engine do as much work as possible in designing this parser. Programmer B is the opposite, believing in lexers and parsers in pure Perl with minimal use of regexes. Who will produce the best results in reasonable time? Be honest!
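As a rough sketch of what "minimal use of regex" can look like, here is a toy lexer for arithmetic expressions rather than XML (the tokenize helper and the token names are hypothetical). Each regex is tiny and anchored with \G, so every token rule can be tested on its own:

```perl
use strict;
use warnings;

# Hypothetical helper: split an arithmetic expression into [TYPE, text] pairs.
sub tokenize {
    my ($src) = @_;
    my @tokens;
    pos($src) = 0;
    while ( pos($src) < length $src ) {
        # /gc keeps pos() where it is when a rule fails, so the next rule
        # is tried at the same offset.
        if    ( $src =~ /\G\s+/gc )         { next }
        elsif ( $src =~ /\G(\d+)/gc )       { push @tokens, [ NUM => $1 ] }
        elsif ( $src =~ /\G([-+*\/()])/gc ) { push @tokens, [ OP  => $1 ] }
        else  { die "Unexpected character at offset " . pos($src) . "\n" }
    }
    return @tokens;
}

print join( ' ', map { "$_->[0]:$_->[1]" } tokenize('12 + 3*(4 - 1)') ), "\n";
```

None of these patterns can backtrack in any interesting way, which is the maintainability argument in a nutshell; a parser would then consume the token list in plain Perl.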
| [reply] |