PerlMonks  

Re^2: substrings that consist of repeating characters

by salva (Canon)
on Sep 29, 2020 at 09:26 UTC ( [id://11122320]=note )


in reply to Re: substrings that consist of repeating characters
in thread substrings that consist of repeating characters

Note, though, that the regular expression in my comment above is pretty inefficient: it looks for the longest match at every character, instead of skipping the whole chunk of identical characters once the match fails at the character that starts it (the regular expression in the OP is much better in that regard).
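To illustrate what "skipping chunks of the same character" means, here is a minimal sketch. It is not the regular expression from the OP or from my earlier comment; the helper name runs_of is made up for this example. Because (.)\2* consumes a whole run in one match, the next /g attempt starts at the following run, never at the second character of the same run.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): split a string into its
# maximal runs of a single repeated character.
sub runs_of {
    my ($string) = @_;
    my @runs;
    # (.)\2* matches one character plus every immediate repeat of it,
    # so each successful match covers a complete run and the next /g
    # attempt resumes right after that run.
    while ($string =~ /((.)\2*)/g) {
        push @runs, $1;
    }
    return @runs;
}

print join(',', runs_of('AABBBBCCC')), "\n";   # prints AA,BBBB,CCC
```

A pattern that can fail partway through a run has no such guarantee: after the failure the engine retries from the next character, which is the inefficiency described above.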

We can use (*SKIP) to avoid that:

    my $len = 0;
    my $best = "";
    while ($string =~ /((.)(?:(*SKIP)\2){$len,})/g) {
        $len = length $1;
        $best = $1
    }
    print "best: $best\n"

But that is still not completely efficient: the regexp is recompiled at every loop iteration because of $len, so maybe the following simpler code could be faster:

    my $best = "";
    while ($string =~ /((.)\2+)/g) {
        $best = $1 if length $1 > length $best
    }
    print "best: $best\n"
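For anyone who wants to run the simpler version as is, here is a self-contained sketch of it. The wrapper name longest_run and the sample strings are assumptions added for illustration; the loop itself is the one above.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the simple loop: returns the first longest
# substring made of one repeated character, or "" if no character repeats.
sub longest_run {
    my ($string) = @_;
    my $best = "";
    # \2+ requires at least one repeat, so only runs of length >= 2 match.
    while ($string =~ /((.)\2+)/g) {
        # Strict '>' keeps the earliest run when several share the maximum.
        $best = $1 if length $1 > length $best;
    }
    return $best;
}

print "best: ", longest_run('AABBBBCCC'), "\n";   # prints best: BBBB
```

Note that because of the + quantifier, a string without any repeated character (for example 'abc') yields an empty result; change \2+ to \2* if single characters should count as runs.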

Or maybe this more convoluted variation:

    my $best = "";
    $best = $1 while $string =~ /((.)\2*)(*SKIP)(?(?{length $^N <= length +$best})(*FAIL))/g;
    print "best: $best\n"

Replies are listed 'Best First'.
Re^3: substrings that consist of repeating characters
by AnomalousMonk (Archbishop) on Sep 29, 2020 at 17:37 UTC

    Does that work?

    Win8 Strawberry 5.30.3.1 (64) Tue 09/29/2020 13:32:10
    C:\@Work\Perl\monks
    >perl
    use strict;
    use warnings;

    my $string = 'AABBBBCCC';

    my $len = 0;
    my $best = "";
    while ($string =~ /((.)(?:(*SKIP)\2){$len,})/g) {
        $len = length $1;
        $best = $1
    }
    print "best: '$best' \n"
    ^Z
    best: ''


    Give a man a fish:  <%-{-{-{-<

      Oops, no, it doesn't, but I think it should!

      Is that a bug in perl?

        I tried turning
        use re 'debug';
        on and comparing the output with
        /((.)(?:\2(*SKIP)){$len,})/g
        which seems to work, but I don't understand the output enough to be able to explain why the behaviour is different.

        map{substr$_->[0],$_->[1]||0,1}[\*||{},3],[[]],[ref qr-1,-,-1],[{}],[sub{}^*ARGV,3]
