Actually, there's a much simpler solution (codewise):

sub findCycle { my $str = @_ ? shift : ''; $str =~ /^(.*?)\1*$/; return wantarray ? ( length $1, $1 ) : length $1; }
Though I suspect that, for long strings, a solution like the one you proposed but testing only cycles whose lengths are divisors of the input length would be faster. I'll do some benchmarking when I get a chance.

Update:

Caveat: take my benchmarks with a barrel of salt; despite many burns, I still manage to botch benchmarks routinely.

OK, one could test many possible variants. Below I test the case of a string of length about 10_000, consisting of a repeating pattern whose length is given as an input to the benchmarking script. I test three solutions: (0) the original algorithm (plus the minor optimization of not testing cycles longer than half the original string); (1) the modification of the original that tests only cycles whose lengths are divisors of the input string's slength; (2) the regex solution above. For a cycle length of 101 (in a string of length 9999), the results are:

Rate 0 1 2 0 906/s -- -58% -85% 1 2143/s 137% -- -64% 2 5966/s 559% 178% --
So, contrary to my expectation, the regex solution wins handily. On the other hand, for a cycle-free string of comparable length (9973), solution 1 is the clear winner:

Rate 0 2 1 0 17.0/s -- -89% -99% 2 155/s 812% -- -94% 1 2465/s 14431% 1494% --

And for short cycles in a long string, your original solution cleans house:

Rate 2 1 0 2 222/s -- -96% -98% 1 5023/s 2161% -- -52% 0 10479/s 4615% 109% --
That's for a cycle of length 3 in a string of length 9999.

Full code below:

use strict; use warnings; use Math::Pari 'divisors'; use Benchmark 'cmpthese'; my $len = shift || 3; srand 0; { $::input = join '', map int( rand 9 ), 1..$len; last if length $::input == findCycle_0( $::input ); redo; } $::input x= int( 10_000 / $len ); cmpthese( -1, { 0 => 'scalar findCycle_0( $::input )', 1 => 'scalar findCycle_1( $::input )', 2 => 'scalar findCycle_2( $::input )', } ); sub findCycle_0 { my $str = shift; my $copy = $str; my $strLen = length $str; for ( 1 .. $strLen/2 ) { $copy .= substr $copy, 0, 1, ''; return wantarray ? ($_, substr $str, 0, $_) : $_ if $str eq $copy; } return wantarray ? ($strLen, $str) : $strLen; } sub findCycle_1 { my $str = shift; my $strLen = length $str; for ( @{ divisors( $strLen ) } ) { my $copy = $str; $copy .= substr( $copy, 0, $_, '' ); return wantarray ? ($_, substr $str, 0, $_) : $_ if $str eq $copy; } } sub findCycle_2 { my $str = @_ ? shift : ''; $str =~ /^(.*?)\1*$/; return wantarray ? ( length $1, $1 ) : length $1; }

the lowliest monk


In reply to Re: Find repeated patterns in strings by tlm
in thread Find repeated patterns in strings by GrandFather

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post, it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.