Re^3: Is it safe to use external strings for regexes? (infinite loops)
by choroba (Cardinal) on Oct 11, 2021 at 13:29 UTC
$ perl -wE '$f = "foo"; say pos $f while $f =~ m{ ( o? )* }gx;'
0
3
3
map{substr$_->[0],$_->[1]||0,1}[\*||{},3],[[]],[ref qr-1,-,-1],[{}],[sub{}^*ARGV,3]
Win8 Strawberry 5.8.9.5 (32) Mon 10/11/2021 13:52:47
C:\@Work\Perl\monks
>perl -wMstrict -le "my $f = 'foo'; print pos $f while $f =~ m{ ( o? )* }gx;"
0
3
3
Give a man a fish: <%-{-{-{-<
Re^3: Is it safe to use external strings for regexes? (infinite loops)
by AnomalousMonk (Archbishop) on Oct 11, 2021 at 18:01 UTC
Update: Oops... I meant this post as a reply to this post, not to myself, but that's ok, no need to re-parent. :)
As you have, I think, suggested elsewhere and as the documentation itself cheerfully admits, a rewrite of this section would be welcome.
That the section begins with a discussion of the evils of zero-width match infinite loops, accompanied by a bunch of Perl code examples of such matches that don't actually "work" (in the sense that they don't produce infinite loops), is not helpful. The discussion finally gets around to saying that the Perl regex engine does not, in fact, allow such loops, but by then one may have been led far down the garden path and abandoned in the dark forest.
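To make the point concrete, here is a minimal sketch built on the one-liner from earlier in this thread. The sub name match_positions is only a hypothetical helper for illustration. It collects pos() after each successful /g match of a pattern whose quantified group can match the empty string, and the loop terminates on its own.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the docs): collect pos() after every
# successful /g match of a pattern whose group can match nothing.
sub match_positions {
    my ($text) = @_;
    my @positions;
    while ($text =~ m{ ( o? )* }gx) {
        push @positions, pos $text;
    }
    return @positions;
}

# Same result as the one-liner posted above: 0 3 3
print join(' ', match_positions('foo')), "\n";
```

The repeated 3 comes from the final empty match at the end of the string, which Perl permits once and then refuses to repeat at the same position.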
Give a man a fish: <%-{-{-{-<
While I like that Perl doesn't allow this to block the engine, I'm not too sure about the solution.
Ignoring regex grammar to silently continue might be worse than throwing an explicit warning.
Regex-Problem: Empty match detected, trying continuation...
Much like the "Deep recursion" warning Perl emits when it dives 100 levels deep into the same sub.
But I don't think I know the regex algebra well enough to tell.
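Just to illustrate what an explicit warning could look like from user code, here is a rough sketch. It is not the engine-level check: it only flags /g matches whose overall length is zero, by comparing $-[0] and $+[0]. The sub name matches_with_empty_warning and the warning text are assumptions made up for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: run a /g match loop and warn whenever the whole
# match is zero-length. Returns [start, end] offset pairs.
sub matches_with_empty_warning {
    my ($text, $re) = @_;
    my @spans;
    while ($text =~ /$re/g) {
        my ($start, $end) = ($-[0], $+[0]);
        warn "Empty match detected at offset $start\n" if $start == $end;
        push @spans, [$start, $end];
    }
    return @spans;
}

my @spans = matches_with_empty_warning('foo', qr{ ( o? )* }x);
print join(' ', map { "[$_->[0],$_->[1]]" } @spans), "\n";
```

An empty iteration inside the quantified group, which is what the engine actually breaks out of, is not visible from outside the match, so this can only approximate the idea.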
update
s/1000/100/
> Thus Perl allows such constructs, by forcefully breaking the infinite loop.
Thanks, that wasn't clear to me.
But I should have paid more attention to the
> WARNING: Difficult material (and prose) ahead. This section needs a rewrite.
FWIW, the forced break can be seen with re 'debug'
D:\tmp>perl -Mre=debug -e"'foo' =~ m{ ( o? )* }x;"
Compiling REx " ( o? )* "
Final program:
1: CURLYX[0]{0,INFTY} (12)
3: OPEN1 (5)
5: CURLY{0,1} (9)
7: EXACT <o> (0)
9: CLOSE1 (11)
11: WHILEM[1/1] (0)
12: NOTHING (13)
13: END (0)
minlen 0
Matching REx " ( o? )* " against "foo"
0 <> <foo> | 0| 1:CURLYX[0]{0,INFTY}(12)
0 <> <foo> | 1| 11:WHILEM[1/1](0)
| 1| WHILEM: matched 0 out of 0..65535
0 <> <foo> | 2| 3:OPEN1(5)
0 <> <foo> | 2| 5:CURLY{0,1}(9)
| 2| EXACT <o> can match 0 times out of 1...
0 <> <foo> | 3| 9:CLOSE1(11)
0 <> <foo> | 3| 11:WHILEM[1/1](0)
| 3| WHILEM: matched 1 out of 0..65535
| 3| WHILEM: empty match detected, trying continuation... <---- HERE
0 <> <foo> | 4| 12:NOTHING(13)
0 <> <foo> | 4| 13:END(0)
Match successful!
Freeing REx: " ( o? )* "
D:\tmp>