Re: Finding common substrings

Split each string on whitespace and look for duplicates:

$a = 'PF01389   6   218 1   255 430.09';
$b = 'PF00691   PF01389';

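my %counts;
# Mark every word of $a as seen once; plain assignment keeps repeats within $a at 1.
foreach ( split /\s+/, $a ) {
    $counts{$_} = 1;
}
# Bump the count only for words of $b that were already seen in $a.
foreach ( split /\s+/, $b ) {
    $counts{$_}++ if exists $counts{$_};
}
# Any word counted more than once appears in both strings.
my @common = grep $counts{$_} > 1, keys %counts;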

if ( @common ) {
    print "correct\n";
}

Or, less verbosely:

$a = 'PF01389   6   218 1   255 430.09';
$b = 'PF00691   PF01389';

my %in_a    = map  { $_ => 1          } split /\s+/, $a;
my @in_both = grep { exists $in_a{$_} } split /\s+/, $b;

if ( @in_both ) {
    print "correct\n";
}
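
If the shared words themselves are wanted rather than a yes/no answer, the second approach could be wrapped in a sub. This is only a sketch: `common_words` is a made-up name, the second input string is altered to contain a repeat, and the `%seen` filter is an addition so that a word repeated in the second string is reported once.

```perl
use strict;
use warnings;

# common_words is a hypothetical helper, not part of the original post.
# It returns each word of $second that also occurs in $first, once.
sub common_words {
    my ( $first, $second ) = @_;

    # split ' ' splits on runs of whitespace and ignores leading whitespace.
    my %in_first = map { $_ => 1 } split ' ', $first;

    my %seen;
    return grep { $in_first{$_} && !$seen{$_}++ } split ' ', $second;
}

# Illustrative inputs: PF01389 is deliberately repeated in the second string.
my @shared = common_words(
    'PF01389   6   218 1   255 430.09',
    'PF00691   PF01389 PF01389',
);

print "@shared\n";    # prints "PF01389"
```

Using `split ' '` instead of `split /\s+/` is a small design choice: it avoids an empty leading field when a string starts with whitespace.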

Re^2: Finding common substrings by ktsirig (Sexton) on Sep 20, 2006 at 22:46 UTC
Thank you all! You really helped me understand a lot of things just by this question I had!
Re^2: Finding common substrings by johngg (Canon) on Sep 21, 2006 at 09:19 UTC
I might be wrong, but I think your first method will give a false positive if one string contains a duplicated word that doesn't appear in the other string. `$counts{$_}` will be more than one, but only because the word appeared twice in the same string, not because it also appeared in the other string. Cheers, JohnGG
Re^3: Finding common substrings by mreece (Friar) on Sep 21, 2006 at 16:36 UTC
Actually, it won't: the first foreach only sets the count to 1 rather than incrementing it, and the second foreach only increments a count that already exists, which means the word was already found in `$a`.
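
A quick way to check this is to run the first method on input where a word is repeated within one string only. The sketch below uses made-up strings (`$x` and `$y` stand in for `$a` and `$b`); `dup` occurs twice in the first string and never in the second.

```perl
use strict;
use warnings;

# Hypothetical inputs: 'dup' is repeated in $x but absent from $y,
# and the two strings share no words at all.
my $x = 'dup dup PF01389';
my $y = 'PF00691 other';

my %counts;
foreach ( split /\s+/, $x ) {
    $counts{$_} = 1;    # assignment, so the second 'dup' leaves the count at 1
}
foreach ( split /\s+/, $y ) {
    $counts{$_}++ if exists $counts{$_};    # only touches words seen in $x
}
my @common = grep { $counts{$_} > 1 } keys %counts;

print @common ? "common: @common\n" : "no common words\n";
# prints "no common words"
```

The count for `dup` stays at 1, so it never passes the `> 1` filter.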
Re^4: Finding common substrings by johngg (Canon) on Sep 21, 2006 at 20:50 UTC
Doh! I must be going mad :( Sorry, JohnGG