If I correctly understand your problem, you want to find cases where two complete values are the same (the things separated by "~~"), not just any old substring.
To find "a value I've seen before", the quickest way to do it is usually to have that value set as the key of a hash; in this case you additionally want it keyed on the ID, so you probably want to build a hash-of-hashes structure something like:
    my %values_by_id;

    # for each line from the first file:
    my($id1, $values_string1) = split /:/, $line1;
    for my $value1 (split /~~/, $values_string1) {
        # set each unique value to be true in the hash
        $values_by_id{$id1}{$value1} = 1;
    }

    # and then for a line from the second file:
    my($id2, $values_string2) = split /:/, $line2;
    my @shared_values;
    for my $value2 (split /~~/, $values_string2) {
        push @shared_values, $value2 if $values_by_id{$id2}{$value2};
    }
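If it helps to see the whole thing run end to end, here is a minimal self-contained sketch of the same hash-of-hashes idea. The sample lines in @file1 and @file2 are made up for illustration, and I'm assuming each line looks like "id:value~~value~~..."; adjust the splitting if your real data differs. The %shared_by_id hash is just a name I picked to collect the results per ID.

```perl
use strict;
use warnings;

# Hypothetical sample data; the "id:value~~value" layout is an assumption.
my @file1 = ('1:apple~~banana~~cherry', '2:dog~~cat');
my @file2 = ('1:banana~~kiwi~~apple', '2:bird', '3:apple');

# Index the first file: $values_by_id{$id}{$value} is true for each value seen.
my %values_by_id;
for my $line (@file1) {
    # Limit of 2 so only the first colon separates the ID from the values.
    my ($id, $values_string) = split /:/, $line, 2;
    $values_by_id{$id}{$_} = 1 for split /~~/, $values_string;
}

# Look up each value from the second file under the same ID.
my %shared_by_id;
for my $line (@file2) {
    my ($id, $values_string) = split /:/, $line, 2;
    # Skip IDs that never appeared in the first file; this also keeps the
    # lookup below from autovivifying an empty entry for them.
    next unless exists $values_by_id{$id};
    my @shared = grep { $values_by_id{$id}{$_} } split /~~/, $values_string;
    $shared_by_id{$id} = \@shared if @shared;
}

# Report the shared values for each ID that has any.
for my $id (sort keys %shared_by_id) {
    print "$id: @{ $shared_by_id{$id} }\n";
}
```

With the sample data above this prints "1: banana apple". Note that "apple" under ID 3 is not reported, because the lookup is keyed on the ID first and ID 3 never appears in the first file.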
Hope this helps.
In reply to Re: Find common substrings by hv, in thread Find common substrings by Anonymous Monk