1: =pod
2:
3: @longest = pinyan_UCS($string)
4:
5: This function returns the set of the longest substrings
6: in a given string. It seems rather efficient, even though
7: it calls C<index()> quite a bit. I found that using a hash
8: to figure if I'd seen a character had adverse effects.
9:
10: =cut
11:
12: sub pinyan_UCS {
13: my $str = shift;
14: my $len = length $str;
15: my ($diff,$biggest) = (0,0);
16: my ($jump,@ahead,@matches);
17:
18: for (my $i = 0; $i < $len; ) {
19: my $match = [ $i, $len ];
20: if ($len - $i >= $biggest) {
21: for (my $k = $i; $k < $match->[1]; $k++) {
22: $ahead[$k] ||= index($str, substr($str,$k,1), $k+1);
23: if ($ahead[$k] != -1 and $match->[1] > $ahead[$k]) {
24: $match->[1] = $ahead[$k];
25: $jump = $k;
26: }
27: }
28:
29: $diff = $match->[1] - $match->[0];
30:
31: if ($diff > $biggest) { ($biggest,@matches) = ($diff,$match) }
32: elsif ($diff == $biggest) { push @matches, $match; }
33: }
34: else { last }
35:
36: $i = ++$jump;
37: }
38:
39: return map substr($str, $_->[0], $_->[1] - $_->[0]), @matches;
40: }
|
|---|
| Replies are listed 'Best First'. | |
|---|---|
|
Re: Unique-Character Substring
by salvadors (Pilgrim) on Jan 20, 2001 at 19:59 UTC | |
by tilly (Archbishop) on Jan 20, 2001 at 21:22 UTC | |
by salvadors (Pilgrim) on Jan 20, 2001 at 21:40 UTC | |
by japhy (Canon) on Jan 20, 2001 at 20:07 UTC | |
|
Re (tilly) 1: Unique-Character Substring
by tilly (Archbishop) on Jan 20, 2001 at 14:16 UTC | |
by salvadors (Pilgrim) on Jan 20, 2001 at 19:40 UTC | |
by tilly (Archbishop) on Jan 20, 2001 at 20:24 UTC | |
|
Re: Unique-Character Substring
by dchetlin (Friar) on Jan 21, 2001 at 09:08 UTC | |
by salvadors (Pilgrim) on Jan 21, 2001 at 18:45 UTC |