Re: Searching for a word that may only exist in part

i don't see any way to do it besides eating letters from the back, then starting over from the front.

here is one way, though index() would surely be faster than m// here.

my $sequence = "GAATGTTTTAGCAATCTCTTTCTGTCATGAATCCATGGCAGTGACCATACTAATGGTGACTGCCATTGATGGAGGGAGACACA";
my $find = "CTGGATAAGAATGTTTTAGCAATCTCTT";

my $found;

MATCH: {
    my $tail = $find;
    while ( length($tail) > 2 and not $found ) {
        ($found) = $sequence =~ /($tail)/  # find match
            or substr( $tail, 0, 1, '');   # or eat first letter

    }
    last MATCH if $found;

    my $head = $find;
    ## can chop first since exact match already failed
    while ( chop $head and length($head) > 2 and not $found ) {
        ($found) = $sequence =~ /($head)/;
    }
}

print "found? $found\n";

updated: to provide better(?) var names
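
for what it's worth, here is a rough sketch of the index() variant mentioned above. it keeps the same search order as the loops in the post (whole word first, then eat from the front, then chop from the back, never going below 3 letters), but wraps it in a sub. the name find_partial is made up for illustration and is not from the original code, and no timing has been done to confirm that it is actually faster.

```perl
use strict;
use warnings;

# Hypothetical helper (the name is an assumption, not from the post).
# Returns the longest piece of $find located by the same search order
# as the regex loops, or nothing if no piece of 3+ letters is found.
sub find_partial {
    my ( $sequence, $find ) = @_;

    # try the whole word, then keep eating the first letter
    my $tail = $find;
    while ( length($tail) > 2 ) {
        return $tail if index( $sequence, $tail ) >= 0;
        substr( $tail, 0, 1, '' );
    }

    # the exact match already failed, so chop before each test
    my $head = $find;
    while ( length($head) > 3 ) {
        chop $head;
        return $head if index( $sequence, $head ) >= 0;
    }

    return;    # nothing found
}

my $hit = find_partial( "xxabcxx", "zabc" );
print "found? ", ( defined $hit ? $hit : '(nothing)' ), "\n";
```

a side effect of index() is that the search string is taken literally, so regex metacharacters in $find need no escaping.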

Re^2: Searching for a word that may only exist in part by GrandFather (Saint) on Oct 19, 2006 at 01:41 UTC
For: `my $sequence = "111...1111...11"; my $find = "11111";` Prints: `found? 1111` whereas the OP says "if I do not find the whole word within a sequence and start truncating the word, then it can only match at either end of a sequence and not within".

DWIM is Perl's answer to Gödel
Re^3: Searching for a word that may only exist in part by mreece (Friar) on Oct 19, 2006 at 02:40 UTC
oh, i misunderstood. i thought the truncated word only had to be the beginning or end of the 'find' sequence (1111 is the beginning (..and end..) of 11111), not that it had to sit at either end of the sequence being searched. the regexes are fixable (`/(^$tail|$tail$)/` and same for `$head`?) easily enough.. i just wanted an excuse to use chop ;-)
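
a minimal sketch of that anchored fix, assuming the OP's rule is "the whole word may match anywhere, but a truncated word only counts at the very start or very end of the sequence". the sub name find_at_ends is hypothetical. it uses `\A` and `\z` rather than `^` and `$` so a trailing newline cannot sneak in, and `\Q...\E` so the word is matched literally; both are choices made here, not part of the suggestion above.

```perl
use strict;
use warnings;

# Hypothetical helper (name and exact rule are assumptions).
# Whole word: may match anywhere. Truncated word: only at either end.
sub find_at_ends {
    my ( $sequence, $find ) = @_;

    return $find if index( $sequence, $find ) >= 0;

    # eat letters from the front, never going below 3 letters
    my $tail = $find;
    while ( length($tail) > 3 ) {
        substr( $tail, 0, 1, '' );
        return $tail if $sequence =~ /\A\Q$tail\E|\Q$tail\E\z/;
    }

    # then chop letters from the back
    my $head = $find;
    while ( length($head) > 3 ) {
        chop $head;
        return $head if $sequence =~ /\A\Q$head\E|\Q$head\E\z/;
    }

    return;    # nothing found
}

my $hit = find_at_ends( "111...1111...11", "11111" );
print "found? ", ( defined $hit ? $hit : '(nothing)' ), "\n";
```

with the data from the reply above this gives `111` (the start of the sequence) instead of the interior `1111`.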