http://qs1969.pair.com?node_id=11122349


in reply to Re^3: substrings that consist of repeating characters
in thread substrings that consist of repeating characters

Oops, no, it doesn't, but I think it should!

Is that a bug in perl?

Re^5: substrings that consist of repeating characters
by choroba (Cardinal) on Sep 29, 2020 at 21:20 UTC
    I tried turning on
    use re 'debug';
    and comparing its output with the output for
    /((.)(?:\2(*SKIP)){$len,})/g
    which seems to work, but I don't understand the output well enough to explain why the behaviour is different.
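For anyone wanting to repeat that experiment, here is a minimal sketch of how the trace could be limited to a single match. The input `$sample`, the value of `$len`, and the variable `$found` are made up for illustration and are not from the thread; the trace itself is written to STDERR and is not reproduced here.

```perl
use strict;
use warnings;

my $len    = 2;         # hypothetical minimum repeat count
my $sample = "xaaab";   # hypothetical input string
my $found;

{
    # "use re 'debug'" is lexically scoped, so only regexes compiled
    # inside this block are traced; the trace goes to STDERR.
    use re 'debug';
    ($found) = $sample =~ /((.)(?:\2(*SKIP)){$len,})/;
}

print "found: ", (defined $found ? $found : "(no match)"), "\n";
```

Running the same block with the verb in the other position and diffing the two traces is one way to see where the engine gives up.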

    map{substr$_->[0],$_->[1]||0,1}[\*||{},3],[[]],[ref qr-1,-,-1],[{}],[sub{}^*ARGV,3]
      Aha, that has allowed me to see the problem!

      A (*SKIP) placed before the \2 inside the repetition always matches. Once the end of the same-character run is reached, the \2 fails, which makes the repetition try to backtrack. It can't, because of the (*SKIP), so the whole repetition fails!

      Moving the (*SKIP) after the \2 fixes the issue:

      my $string = "ATTTAGTTCTTAAGGCTGACATCGGTTTACGTCAGCGTTACCCCCCAAGTTTTTTTTTTTTTTTTTTATTGGGGACTTT";
      my $len = 0;
      my $best = "";
      while ($string =~ /((.)(\2(*SKIP)){$len,})/g) {
          $len = length $1;
          $best = $1
      }
      print "best: $best\n";
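As a cross-check on whatever the (*SKIP) loop reports, the longest run can also be found with a plain scan that uses no backtracking control verbs. This is only a sketch: the helper name `longest_run` and the shortened sample string are assumptions for illustration, not something from the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: return the longest run of one repeated character.
# It visits every run once and keeps the first one that is strictly longest.
sub longest_run {
    my ($string) = @_;
    my $best = "";
    while ($string =~ /((.)\2*)/g) {
        $best = $1 if length($1) > length($best);
    }
    return $best;
}

# Shortened, made-up sample: runs of 3 T, 2 T, 6 C, 2 A, 8 T and 3 T.
my $string = "ATTTAGTT" . ("C" x 6) . "AAG" . ("T" x 8) . "ACTTT";
print "best: ", longest_run($string), "\n";
```

Unlike the (*SKIP) version, this captures every run, including the short ones, so it does more work on long inputs; it is meant for checking the answer rather than replacing the loop.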