in reply to Re: How to match more than 32766 times in regex?
in thread How to match more than 32766 times in regex?

use strict;
use warnings;

my $X = "a1b2c3d4e5";   # or use File::Slurp
my $s = "(\\w\\d)";     # my pattern match $s
my $m = qr/$s/;         # compiled to a regular expression $m
my $counter = 0;
while ($X =~ s/$m//) {
    ++$counter;
    next unless $counter > 32766;   # wait for it...
    print "this is the $counter iteration, got $1 \n";
}
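For comparison, here is a minimal sketch of the same counting loop that does not modify the string. It assumes the goal is only to count matches and see the last capture; the variable `$last` is introduced here for illustration and is not part of the original post.

```perl
use strict;
use warnings;

my $X = "a1b2c3d4e5";   # sample input from the post above
my $m = qr/(\w\d)/;

# In scalar context, //g resumes from pos($X) on every iteration,
# so $X itself is left unmodified (unlike the s/// loop).
my $counter = 0;
my $last;
while ($X =~ /$m/g) {
    ++$counter;
    $last = $1;
}
print "matched $counter times, last capture was $last\n";
```

Because no quantifier is involved, the loop count is bounded only by the input length.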

Re^3: How to match more than 32766 times in regex?
by BrowserUk (Patriarch) on Dec 01, 2015 at 20:01 UTC

    No need to go to those lengths:

    $s = '0123456789' x 100000;;
    ( $m ) = $s =~ m[((?:(?:0123456789){32000}){3})];;
    print length $m;;
    960000

    But for any given application there's almost certainly a better way of tackling the problem.
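The nesting trick above can be generalised to other repeat counts. The following is only a sketch: `repeat_pattern` is a hypothetical helper name, the default cap of 32_000 per quantifier is simply the value used in the one-liner above, and only two nesting levels are handled.

```perl
use strict;
use warnings;

# Hypothetical helper: build a pattern that matches $atom exactly $n times
# while keeping every individual quantifier at or below $max.
sub repeat_pattern {
    my ($atom, $n, $max) = @_;
    $max //= 32_000;                       # assumed cap, taken from the one-liner
    return "(?:$atom){$n}" if $n <= $max;

    my $outer = int($n / $max);            # number of full blocks of $max
    my $rest  = $n % $max;                 # leftover repetitions
    die "count too large for two nesting levels\n" if $outer > $max;

    my $pat = "(?:(?:$atom){$max}){$outer}";
    $pat .= "(?:$atom){$rest}" if $rest;
    return $pat;
}

my $s   = '0123456789' x 100_000;
my $pat = repeat_pattern('0123456789', 96_000);
my ($m) = $s =~ /($pat)/;
print length($m), "\n";
```

For 96_000 repetitions the helper produces the same pattern as the one-liner, so it should report the same length.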


    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority". I knew I was on the right track :)
    In the absence of evidence, opinion is indistinguishable from prejudice.
Re^3: How to match more than 32766 times in regex?
by Anonymous Monk on Dec 01, 2015 at 20:27 UTC
    Hmmm, I thought the OP had problems with 'complex regex recursion limit exceeded'. If he just wanted to match something like (\w\d){32767}, sure.
        "...the algorithm needs to be coded"

        It seems like someone did it already: Algorithm::NeedlemanWunsch.

        Regards, Karl

        «The Crux of the Biscuit is the Apostrophe»

        Ah yes, in this node, Re^2: Complex regular subexpression recursion limit, I didn't get an answer :/ .
        Today I was solving another problem (and ran into the same limitation). The full problem was: given a string (up to 1e5 characters long) consisting of '0' and '1', find the length of the longest alternating subsequence if you may choose one substring and invert it. For example, given the string '100111', I can invert the substring from the 3rd to the 4th character ( substr $line, 2, 2, (substr $line, 2, 2) =~ y/01/10/r ); the string then becomes '101011', which has an alternating subsequence of length 5 (indexes: 0,1,2,3,4 or 0,1,2,3,5).
        I wanted to solve that problem with regexes (I knew I could solve it another way), so I tried to count the matches of /1+/ and /0+/ (that count is the length of the longest alternating subsequence if no inversion is made). I thought I could do:
        $line =~ y/1/,/; $len = split /\b/, $line;
        , but I decided to stay with zeroes and ones, and wrote  () = $line =~ /(.)\1*/g (as I showed). Then I add to $len:  /(.)\1\1|(.)\2.*(.)\3/ + /(.)\1/, because each of these regexes, if it succeeds, adds 1 to the possible length of the subsequence after one inversion.
        I often try to solve problems from competitive programming sites or sites like projecteuler.net, and I practise doing it with Perl.
        Afterwards I computed the whole sum in one expression:
        $len = + (() = /(.)\1*\1*\1*\1*/g) + /(.)\1\1|(.)\2.*.*.*.*(.)\3/ + /(.)\1/
        - it took too much time on the input line '01' x 5e4.

        upd: the example wrongly used reversion; now fixed to inversion.
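The "other way" mentioned in the post is not shown, so the following is only a guess at what a linear, backtracking-free version might look like. It assumes the three-term sum means "number of runs, plus 1 if there is any adjacent equal pair, plus 1 more if there are at least two such pairs"; `longest_after_one_inversion` is a hypothetical name.

```perl
use strict;
use warnings;
use List::Util qw(min);

# Hypothetical helper; $line is assumed to contain only '0' and '1'.
sub longest_after_one_inversion {
    my ($line) = @_;
    my $n = length $line;
    return 0 unless $n;

    # tr with /s squeezes each run of identical characters to one character,
    # so the squeezed length equals the number of runs.
    (my $squeezed = $line) =~ tr/01//s;
    my $runs = length $squeezed;

    # Of the n-1 adjacent pairs, runs-1 differ, so n-runs are equal.
    # An inverted substring has two ends, so it can add at most 2.
    my $equal_pairs = $n - $runs;
    return $runs + min(2, $equal_pairs);
}

print longest_after_one_inversion('100111'), "\n";      # the example from the post
print longest_after_one_inversion('01' x 5e4), "\n";    # the slow input line
```

For '100111' this gives 5, in line with the worked example, and the '01' x 5e4 line is handled in a single pass because no regex backtracking is involved.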