
Re^2: Is it safe to use external strings for regexes? (infinite loops)

by LanX (Saint)
on Oct 11, 2021 at 12:31 UTC


in reply to Re: Is it safe to use external strings for regexes?
in thread Is it safe to use external strings for regexes?

Hi

I just stumbled over an example of something even worse:

infinite loops

perlre#Repeated Patterns Matching a Zero-length Substring

> A common abuse of this power stems from the ability to make infinite loops using regular expressions, with something as innocuous as:

> "foo" =~ m{ ( o? )* }x;

> The o? matches at the beginning of "foo", and since the position in the string is not moved by the match, o? would match again and again because of the "*" quantifier. Another common way to create a similar cycle is with the looping modifier /g:

Cheers Rolf
(addicted to the Perl Programming Language :)
Wikisyntax for the Monastery

Re^3: Is it safe to use external strings for regexes? (infinite loops)
by choroba (Cardinal) on Oct 11, 2021 at 13:29 UTC
    Huh?
    $ perl -wE '$f = "foo"; say pos $f while $f =~ m{ ( o? )* }gx;'
    0
    3
    3

    map{substr$_->[0],$_->[1]||0,1}[\*||{},3],[[]],[ref qr-1,-,-1],[{}],[sub{}^*ARGV,3]
      dunno, I just RTFM and copied it here! :)

      maybe some delta fixed it?

      Cheers Rolf
      (addicted to the Perl Programming Language :)
      Wikisyntax for the Monastery

        Maybe. It doesn't loop even on 5.16.3 which is the oldest Perl easily available to me.

        Update: 5.10.1 neither.

        Update2: It loops in 5.6.2!

        map{substr$_->[0],$_->[1]||0,1}[\*||{},3],[[]],[ref qr-1,-,-1],[{}],[sub{}^*ARGV,3]

        Nor on 5.8.9:

        Win8 Strawberry 5.8.9.5 (32) Mon 10/11/2021 13:52:47
        C:\@Work\Perl\monks
        >perl -wMstrict -le "my $f = 'foo'; print pos $f while $f =~ m{ ( o? )* }gx;"
        0
        3
        3


        Give a man a fish:  <%-{-{-{-<

Re^3: Is it safe to use external strings for regexes? (infinite loops)
by AnomalousMonk (Archbishop) on Oct 11, 2021 at 18:01 UTC

    The section to which you linked goes on to say

    Thus Perl allows such constructs, by forcefully breaking the infinite loop.
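    A minimal sketch of that behaviour, assuming a reasonably recent perl (the thread reports looping only on 5.6.2): both the plain match and the /g loop return instead of spinning. The iteration cap `$cap` and the variable names are hypothetical additions for safety on old perls, not part of the documentation's example.

```perl
use strict;
use warnings;

# Plain match: the zero-length iteration inside ( o? )* is cut short
# by the engine, so this returns instead of looping forever.
my $plain   = 'foo';
my $matched = ($plain =~ m{ ( o? )* }x) ? 1 : 0;

# /g loop: record pos() after every successful match.
# $cap is a hypothetical safety net in case an old perl really loops.
my $f   = 'foo';
my $cap = 10;
my @positions;
while ($f =~ m{ ( o? )* }gx) {
    push @positions, pos $f;
    last if @positions >= $cap;
}

print "matched:   $matched\n";
print "positions: @positions\n";
```

    The positions printed should agree with the one-liner output reported earlier in the thread (0, 3, 3): an empty match at 0, then "oo" ending at 3, then a final empty match at 3.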


    Give a man a fish:  <%-{-{-{-<

      Update: Oops... I meant this post as a reply to this post, not to myself, but that's ok, no need to re-parent. :)


      As you have, I think, suggested elsewhere and as the documentation itself cheerfully admits, a rewrite of this section would be welcome.

      That the section begins with a discussion of the evils of zero-width match infinite loops accompanied by a bunch of Perl code examples of such matches that don't actually "work" (in the sense that they don't produce infinite loops) is not helpful. The discussion finally gets around to saying that the Perl RE does not, in fact, allow such loops, but by then one may have been led far down the garden path and abandoned in the dark forest.


      Give a man a fish:  <%-{-{-{-<

        While I like that Perl doesn't allow this to block the engine, I'm not too sure about the solution.

        Ignoring regex grammar to silently continue might be worse than throwing an explicit warning.

          Regex-Problem: Empty match detected, trying continuation...

        Similar to the "Deep recursion" warning Perl emits when it dives 100 levels deep into the same sub.

        But I don't think I know the RE algebra well enough to tell.
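        For comparison, here is a rough sketch of the "Deep recursion" warning used as the analogy above. The sub name `dive` and the `@caught` array are made up for illustration; the warning is captured through `$SIG{__WARN__}` so it can be inspected rather than just printed.

```perl
use strict;
use warnings;

# Collect warnings instead of letting them go to STDERR.
my @caught;
$SIG{__WARN__} = sub { push @caught, $_[0] };

# Hypothetical helper: recurse until $depth levels are on the call stack.
sub dive {
    my ($depth) = @_;
    return $depth <= 1 ? 1 : 1 + dive($depth - 1);
}

dive(50);                      # stays below the threshold
my $after_shallow = @caught;

dive(150);                     # crosses the 100-level threshold
my $after_deep = @caught;

print "warnings after 50 levels:  $after_shallow\n";
print "warnings after 150 levels: $after_deep\n";
print "first warning: $caught[0]" if @caught;
```

        The point of the analogy: execution is not stopped, but the programmer is told that something suspicious happened, which is what an "empty match detected" warning would do for regexes.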

        Cheers Rolf
        (addicted to the Perl Programming Language :)
        Wikisyntax for the Monastery

        update

        s/1000/100/

      > Thus Perl allows such constructs, by forcefully breaking the infinite loop.

      Thanks, that wasn't clear to me.

      BUT I should have paid more attention to the

      > WARNING: Difficult material (and prose) ahead. This section needs a rewrite.

      FWIW, the forced break can be seen with re 'debug'

      D:\tmp>perl -Mre=debug -e"'foo' =~ m{ ( o? )* }x;"
      Compiling REx " ( o? )* "
      Final program:
         1: CURLYX[0]{0,INFTY} (12)
         3:   OPEN1 (5)
         5:     CURLY{0,1} (9)
         7:       EXACT <o> (0)
         9:   CLOSE1 (11)
        11: WHILEM[1/1] (0)
        12: NOTHING (13)
        13: END (0)
      minlen 0
      Matching REx " ( o? )* " against "foo"
         0 <> <foo>   |   0| 1:CURLYX[0]{0,INFTY}(12)
         0 <> <foo>   |   1|  11:WHILEM[1/1](0)
                      |   1|  WHILEM: matched 0 out of 0..65535
         0 <> <foo>   |   2|   3:OPEN1(5)
         0 <> <foo>   |   2|   5:CURLY{0,1}(9)
                      |   2|   EXACT <o> can match 0 times out of 1...
         0 <> <foo>   |   3|    9:CLOSE1(11)
         0 <> <foo>   |   3|    11:WHILEM[1/1](0)
                      |   3|    WHILEM: matched 1 out of 0..65535
                      |   3|    WHILEM: empty match detected, trying continuation...   <---- HERE
         0 <> <foo>   |   4|     12:NOTHING(13)
         0 <> <foo>   |   4|     13:END(0)
      Match successful!
      Freeing REx: " ( o? )* "

      D:\tmp>

      Cheers Rolf
      (addicted to the Perl Programming Language :)
      Wikisyntax for the Monastery
