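If you were to mask out 12312 right after finding it, you would remove the possibility of finding 23123. That's not so good. Count every window of the target length instead, overlaps included:

    #!/usr/bin/perl
    use strict;
    use warnings;

    my $string = "123123124";
    my $len    = 5;
    my %substrings;

    # Slide a window of $len characters across the string, one position at a time.
    for (my $i = 0; $i + $len <= length $string; $i++) {
        my $sub = substr($string, $i, $len);
        $substrings{$sub}++;
    }

    # Most frequent first; ties are broken alphabetically.
    print "$_ => $substrings{$_}\n"
        for sort { $substrings{$b} <=> $substrings{$a} || $a cmp $b } keys %substrings;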
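To make the masking problem concrete, here is a minimal sketch that compares the two scans. count_windows() is a hypothetical helper of mine, and stepping ahead by the full window length is only my approximation of what "masking out" a match would do.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# count_windows() is a hypothetical helper for illustration only.
# $step = 1 slides one character at a time (overlapping windows);
# $step = $len jumps past each window, roughly what masking would do.
sub count_windows {
    my ($string, $len, $step) = @_;
    my %count;
    for (my $i = 0; $i + $len <= length $string; $i += $step) {
        $count{ substr($string, $i, $len) }++;
    }
    return \%count;
}

my $overlapping = count_windows("123123124", 5, 1);
my $masked      = count_windows("123123124", 5, 5);

# prints "overlapping: 12312, 23123, 23124, 31231"
print "overlapping: ", join(", ", sort keys %$overlapping), "\n";
# prints "masked:      12312"
print "masked:      ", join(", ", sort keys %$masked), "\n";
```

The masked scan never sees 23123 at all, and it counts 12312 once instead of twice.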
By the way, the code above is a naive implementation of the method suggested by the first responder. If the hash grows too large for memory, one possible saving (trading a lot of time for RAM) would be to keep the hash in a file. But I would recommend trying it as-is first before attempting to optimize it.
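If you do end up needing the file-backed hash, a tied hash is one way to get it without changing the counting loop. This is only a sketch: count_substrings_on_disk() and the file name are hypothetical, and I picked SDBM_File because it ships with Perl, not because it is the best DBM for the job (it limits the size of each key/value pair, which is fine for short substrings).

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Fcntl;                      # O_RDWR, O_CREAT
use SDBM_File;                  # DBM implementation bundled with Perl
use File::Temp qw(tempdir);

# count_substrings_on_disk() is a hypothetical helper for illustration only.
# It counts in a hash tied to a DBM file and returns the most common
# substring with its count (ties go to the alphabetically first one).
sub count_substrings_on_disk {
    my ($string, $len, $dbfile) = @_;

    my %substrings;
    tie(%substrings, 'SDBM_File', $dbfile, O_RDWR | O_CREAT, 0666)
        or die "Cannot tie $dbfile: $!";

    for (my $i = 0; $i + $len <= length $string; $i++) {
        $substrings{ substr($string, $i, $len) }++;
    }

    # Walk the file with each() so the full key list is never built in RAM.
    my ($best, $best_count) = (undef, 0);
    while (my ($sub, $n) = each %substrings) {
        if ($n > $best_count || ($n == $best_count && $sub lt $best)) {
            ($best, $best_count) = ($sub, $n);
        }
    }

    untie %substrings;
    return ($best, $best_count);
}

my $dir = tempdir(CLEANUP => 1);
my ($sub, $n) = count_substrings_on_disk("123123124", 5, "$dir/substrings");
print "$sub => $n\n";           # prints "12312 => 2"
```

Note that the counts persist in the file, so reusing the same file for a second string would add to the old counts. Every increment is also a disk read and a disk write, which is where the "a lot of time" comes from.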
------
We are the carpenters and bricklayers of the Information Age.
Don't go borrowing trouble. For programmers, this means Worry only about what you need to implement.
Please remember that I'm crufty and crotchety. All opinions are purely mine and all code is untested, unless otherwise specified.
In reply to Re: Most common substring
by dragonchild
in thread Most common substring
by Anonymous Monk