in reply to Seeking algorithm for finding common continous sub-patterns

So you mean subsequences of at least length n? Do the m occurrences have to be in different elements of @a? If so, do the elements have to be adjacent? (I'm trying to figure out what you mean by 'continuous'.)

Update: If you really meant at least, would you want (1,2,3,1,2,4,1,2,4) to show (1,2,4) and (1,2) or just (1,2,4)?

Update: assuming "continuous" only meant not considering (1,3,5) to be a subsequence of (1,2,3,4,5), and that you were just abbreviating the fact that there were matching (6,21) and (21,5) by saying (6,21,5), and assuming all your data are integers > 0:

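use warnings;
use strict;
no warnings "utf8";

my @a = (
    [2,5,10,5,12,6,21,5,10,12,23],
    [5,6,11,10,5,10,6,21,5,1,9],
    [6,5,10,15,21]
);
my $m = 2;
my $n = 2;

# Encode each sequence as a string, one character per integer, joined by \0.
my $big = join "\0", map join('', map chr, @$_), @a;
# Swap \0 and \n: sequences are now separated by newlines, which "." will
# not match, and a value of 10 is stored as \0.
$big =~ tr/\0\n/\n\0/;

# The lookahead captures each run of $n characters that recurs at least
# $m-1 more times later in the string; (?s:...) lets that search cross
# the newline separators.
my %uniq;
my @m2 = map [map ord||10, split //],    # decode; \0 stands for 10
         grep !$uniq{$_}++,
         $big =~ /(?=(.{$n})(?s:.*?\1){${\($m-1)}})./g;

use Data::Dumper;
print Data::Dumper->new([$_])->Terse(1)->Indent(0)->Dump(), "\n" for @m2;

But it doesn't scale well (though it may scale as well as any other solution).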

Re^2: Seeking algorithm for finding common continous sub-patterns
by johnnywang (Priest) on Dec 03, 2004 at 19:11 UTC
    You're all correct: length at least n, in m different sequences, and adjacent (ordered).
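
Given that clarification (length at least n, in m different sequences, contiguous), the requirement can also be sketched without a regex by recording, for every contiguous run, which sequences contain it. This is only an illustration: common_runs is a hypothetical helper name, not something from the thread, and it is quadratic in the length of each sequence, so it does not scale well either.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): returns every contiguous run
# of length >= $n that occurs in at least $m different sequences.
sub common_runs {
    my ($seqs, $m, $n) = @_;
    my %seen_in;    # "6,21" => { index of each sequence containing that run }
    for my $i (0 .. $#$seqs) {
        my @s = @{ $seqs->[$i] };
        for my $start (0 .. $#s) {
            # The range is empty when fewer than $n elements remain.
            for my $end ($start + $n - 1 .. $#s) {
                $seen_in{ join ',', @s[$start .. $end] }{$i} = 1;
            }
        }
    }
    # Keep only runs seen in at least $m distinct sequences.
    my @hits = grep { scalar(keys %{ $seen_in{$_} }) >= $m } keys %seen_in;
    return [ map { [ split /,/ ] } sort @hits ];
}

my @a = (
    [2,5,10,5,12,6,21,5,10,12,23],
    [5,6,11,10,5,10,6,21,5,1,9],
    [6,5,10,15,21],
);
print join(',', @$_), "\n" for @{ common_runs(\@a, 2, 2) };
```

Joining with a comma keeps multi-digit values unambiguous, and counting distinct sequence indexes (rather than total occurrences) means a run repeated inside a single sequence does not qualify on its own.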