in reply to Finding repeat sequences.
Not using regexps at all, but...
#!/usr/bin/env perl use 5.012; use Test::More; sub find_substring { my $input = shift; my $length = length $input; for my $i (1 .. $length) { my $possible = substr($input, 0, $i); my $repeated = $possible x (1 + int($length / $i)); return $possible if $input eq substr($repeated, 0, $length); } return ""; } my %eg = ( abcdabcdabcdabcdab => "abcd", abcdabcdabceabcdabcdabceab => "abcdabcdabce", aaaabaaaabaaaaabaaaab => "aaaabaaaaba", ); for my $input (sort keys %eg) { my $expected = $eg{$input}; my $got = find_substring($input); is($got, $expected, "result is '$expected' given input '$input'"); } done_testing;
Note that when there are multiple possible matches, this returns the shortest, because it doesn't make sense to return the longest - the longest is uninteresting.
For example, given the input abcabca, it could be that the answer is abc repeated two and a bit times, or abcabc repeated one and a bit times, or abcabca repeated exactly once. (Well, not really "repeated" but you know what I mean. The entire input string itself is always a valid and uninteresting answer.) Or, depending on how the problem is defined, the correct answer might be abcabcaxx repeated less than one time - i.e. the first repetition was truncated!
So the only interesting answer to return is the shortest possible one.
|
|---|
| Replies are listed 'Best First'. | |
|---|---|
|
Re^2: Finding repeat sequences.
by BrowserUk (Patriarch) on Jun 18, 2013 at 23:01 UTC | |
by hdb (Monsignor) on Jun 19, 2013 at 07:46 UTC |