in reply to Re: Finding pattern in a file
in thread Finding pattern in a file

But IIUC, the pattern to be searched for may be broken across multiple lines. How would your code handle, e.g., the record:

>AAF88103.1 zinc finger protein 226 [Homo sapiens]
AAAAAAAAAAAAAACDECGKEFSQ
GAHLQTHQKVHZZZZZZZZZZZ
(assuming we're now dealing with kosher FASTA files)?
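To make the concern concrete, here is a minimal sketch, assuming the code in question tests one line at a time (that is a guess about its structure, not a quote of it). The variable names are made up for illustration. It contrasts a per-line search with a search over the joined sequence lines of the record above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# The record from the question; the pattern straddles the line break.
my $record = <<'END';
>AAF88103.1 zinc finger protein 226 [Homo sapiens]
AAAAAAAAAAAAAACDECGKEFSQ
GAHLQTHQKVHZZZZZZZZZZZ
END

my $pattern = 'CDECGKEFSQGAHLQTHQKVH';

# Hypothetical line-at-a-time search: every line is tested on its own.
# grep in scalar context returns the number of lines containing the pattern.
my $per_line_hits = grep { index($_, $pattern) >= 0 } split /\n/, $record;

# Same search after setting the header aside and joining the sequence lines.
my ($header, @seq_lines) = split /\n/, $record;
my $sequence      = join '', @seq_lines;
my $joined_offset = index($sequence, $pattern);   # -1 would mean "not found"

print "per-line hits: $per_line_hits\n";
print "0-based offset in joined sequence: $joined_offset\n";
```

The per-line search reports no hits because neither sequence line holds the whole pattern, while the joined sequence does contain it.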


Give a man a fish:  <%-{-{-{-<

Re^3: Finding pattern in a file
by tybalt89 (Monsignor) on Apr 14, 2020 at 23:08 UTC

    Here's one way (snicker).
    It shows the location of each occurrence, even if they are overlapping.

    #!/usr/bin/perl

    use strict; # https://perlmonks.org/?node_id=11115501
    use warnings;

    $_ = do { local $/; <DATA> }; # or however you want to read the file
    my $pattern = 'CDECGKEFSQGAHLQTHQKVH' =~ s/\B/\n?/gr;
    print lc($`) . $1 . lc(substr $', length $1) . "\n" while /(?=($pattern))/g;

    __DATA__
    >AAF88103.1 zinc finger protein 226 [Homo sapiens]
    MNMFKEAVTFKDVAVAFTEEELGLLGPAXRKLYRDVMVENFRNLLSVGHPPFKQDVSPIERNEQLWIMTT
    ATRRQGNLGEKNQSKLITVQDRESEEELSCWQIWQQIANDLTRCQDSMINNSQCHKQGDFPYQVGTELSI
    QISEDENYIVNKADGPNNTGNPEFPILRTQDSWRKTFLTESQRLNRDQQISIKNKLCQCKKGVDPIGWIS
    HHDGHRVHKSEKSYRPNDYEKDNMKILTFDHNSMIHTGQKSYQCNECKKPFSDLSSFDLHQQLQSGEKSL
    TCVERGKGFCYSPVLPVHQKVHVGEKLKCDECGKEFSQGAHLQTHQKVHVIEKPYKCKQCGKGFSRRSAL
    NVHCKVHTAEKPYNCEECGRAFSQASHLQDHQRLHTGEKPFKCDACGKSFSRNSHLQSHQRVHTGEKPYK
    CEECGKGFICSSNLYIHQRVHTGEKPYKCEECGKGFSRPSSLQAHQGVHTGEKSYICTVCGKGFTLSSNL
    QAHQRVHTGEKPYKCNECGKSFRRNSHYQVHLVVHTGEKPYKCEICGKGFSQSSYLQIHQKAHSIEKPFK
    CEECGQGFNQSSRLQIHQLIHTGEKPYKCEECGKGFSRRADLKIHCRIHTGEKPYNCEECGKVFRQASNL
    LAHQRVHSGEKPFKCEECGKSFGRSAHLQAHQKVHTGDKPYKCDECGKGFKWSLNLDMHQRVHTGEKPYK
    CGECGKYFSQASSLQLHQSVHTGEKPYKCDVCGKVFSRSSQLQSHQRVHTGEKPYKCEICGKSFSWRSNL
    TVHHRIHVGDKSYKSNRGGKNIRESTQEKKSIK.
    AAAAAAAAAAAAAACDECGKEFSQ
    GAHLQTHQKVHZZZZZZZZZZZ
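For anyone puzzling over why that works, here is a small sketch of the same two ideas on a made-up input (the string and the 'ABA' pattern are invented for illustration, and it reports offsets instead of printing the whole file). `s/\B/\n?/gr` puts an optional newline between every pair of adjacent letters in the pattern, and wrapping the capture in a lookahead `(?=...)` means no text is consumed, so overlapping occurrences are found too. It needs Perl 5.14 or later for the `/r` flag.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Made-up input: 'ABA' overlaps itself on the first line and is also
# split across the break between the second and third lines.
my $text = "XABABAX\nYYAB\nAZZ\n";

# \B matches between two word characters, so this turns 'ABA' into a
# regex that allows an optional newline between any two letters.
my $pattern = 'ABA' =~ s/\B/\n?/gr;

my @hits;
# The lookahead matches zero characters, so the next search starts one
# position later instead of after the match; overlaps are not skipped.
while ($text =~ /(?=($pattern))/g) {
    # $-[1] is the 0-based start of capture group 1 in $text;
    # tr/\n//dr returns a copy of the match with any newline removed.
    push @hits, [ $-[1], $1 =~ tr/\n//dr ];
}

printf "offset %d: %s\n", @$_ for @hits;
# prints:
# offset 1: ABA
# offset 3: ABA
# offset 10: ABA
```

Note that these offsets count every character of the input, newlines included, so with a real FASTA record they would also count the header line unless it is stripped first.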