Re: Seeking algorithm for finding common continous sub-patterns

So you mean subsequences of at least length n? Do the m occurrences have to be in different elements of @a? If so, do the elements have to be adjacent? (I'm trying to figure out what you mean by 'continuous'.)

Update: If you really meant at least, would you want (1,2,3,1,2,4,1,2,4) to show (1,2,4) and (1,2) or just (1,2,4)?

Update: assuming "continuous" only meant not considering (1,3,5) to be a subsequence of (1,2,3,4,5), and that you were just abbreviating the fact that there were matching (6,21) and (21,5) by saying (6,21,5), and assuming all your data are integers > 0:

use warnings;
use strict;
no warnings "utf8";

my @a = ( [2,5,10,5,12,6,21,5,10,12,23],
       [5,6,11,10,5,10,6,21,5,1,9],
       [6,5,10,15,21]
    );
my $m = 2;
my $n = 2;

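# Encode each sequence as a string, one character per integer, and join
# the sequences with "\0" as a separator.
my $big = join "\0", map join('', map chr, @$_), @a;
# Swap "\0" and "\n": the separator becomes "\n", which "." will not match,
# so a captured run cannot span two sequences; any value 10 becomes "\0".
$big =~ tr/\0\n/\n\0/;

my %uniq;
# Capture every run of $n characters that is followed by $m-1 further copies
# of itself, drop duplicates, then decode back to integers ("\0" is 10).
my @m2 = map [map ord||10, split //],
   grep !$uniq{$_}++,
   $big =~ /(?=(.{$n})(?s:.*?\1){${\($m-1)}})./g;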

use Data::Dumper;
print Data::Dumper->new([$_])->Terse(1)->Indent(0)->Dump(), "\n"
    for @m2;

Note that this doesn't scale well (though it may scale as well as any other solution).
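If the m occurrences do have to come from different elements of @a, a hash of fixed-length windows is one possible alternative. The following is only a rough sketch under that assumption, not a benchmarked replacement for the regex: common_runs is a hypothetical helper name, and it reports runs of exactly n adjacent values, so a longer common run such as (6,21,5) shows up as its overlapping windows (6,21) and (21,5).

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the code above): return every run of
# exactly $n adjacent values that occurs in at least $m different sequences.
# Runs are returned as comma-joined strings, sorted as strings.
sub common_runs {
    my ($seqs, $m, $n) = @_;
    my %seen_in;    # "a,b,..." => { index of sequence containing it => 1 }
    for my $i (0 .. $#$seqs) {
        my $seq = $seqs->[$i];
        # Slide a window of length $n over the sequence; the range is
        # empty when the sequence is shorter than $n.
        for my $start (0 .. @$seq - $n) {
            my $key = join ',', @{$seq}[$start .. $start + $n - 1];
            $seen_in{$key}{$i} = 1;   # repeats within one sequence count once
        }
    }
    my @hits = grep { scalar(keys %{ $seen_in{$_} }) >= $m } keys %seen_in;
    return sort @hits;
}

my @a = ( [2,5,10,5,12,6,21,5,10,12,23],
          [5,6,11,10,5,10,6,21,5,1,9],
          [6,5,10,15,21],
        );

# Prints (10,5), (21,5), (5,10) and (6,21), one per line.
print "($_)\n" for common_runs(\@a, 2, 2);
```

Each sequence is scanned once, so the work grows with the total number of values times n, at the cost of holding every distinct window in memory.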

Re^2: Seeking algorithm for finding common continous sub-patterns by johnnywang (Priest) on Dec 03, 2004 at 19:11 UTC
You're all correct: length at least n, in m different sequences, and adjacent (ordered).