Not using regexps at all, but...
#!/usr/bin/env perl use 5.012; use Test::More; sub find_substring { my $input = shift; my $length = length $input; for my $i (1 .. $length) { my $possible = substr($input, 0, $i); my $repeated = $possible x (1 + int($length / $i)); return $possible if $input eq substr($repeated, 0, $length); } return ""; } my %eg = ( abcdabcdabcdabcdab => "abcd", abcdabcdabceabcdabcdabceab => "abcdabcdabce", aaaabaaaabaaaaabaaaab => "aaaabaaaaba", ); for my $input (sort keys %eg) { my $expected = $eg{$input}; my $got = find_substring($input); is($got, $expected, "result is '$expected' given input '$input'"); } done_testing;
Note that when there are multiple possible matches, this returns the shortest, because it doesn't make sense to return the longest - the longest is uninteresting.
For example, given the input abcabca, it could be that the answer is abc repeated two and a bit times, or abcabc repeated one and a bit times, or abcabca repeated exactly once. (Well, not really "repeated" but you know what I mean. The entire input string itself is always a valid and uninteresting answer.) Or, depending on how the problem is defined, the correct answer might be abcabcaxx repeated less than one time - i.e. the first repetition was truncated!
So the only interesting answer to return is the shortest possible one.
In reply to Re: Finding repeat sequences.
by tobyink
in thread Finding repeat sequences.
by BrowserUk
| For: | Use: | ||
| & | & | ||
| < | < | ||
| > | > | ||
| [ | [ | ||
| ] | ] |