Re^3: Regexp matching on a multiline file: dealing with line breaks

As Laurent_R says, this is an excellent strategy. Have a look at the entry for $INPUT_RECORD_SEPARATOR (usually spelled just $/) in perlvar. For example:

#! perl
use strict;
use warnings;

my $target = 'kitten';
my $count  =  0;

{
    local $/ = ">Header\n";

    while (my $string = <DATA>)
    {
        $string =~ s/\n//g;
        print "string is '$string'\n";
        $count += () = $string =~ /\Q$target/g;
    }
}

print "The target string '$target' occurs $count times in the file\n";

__DATA__
>Header
sushikitten
ilovethekit
tensushithe
kittenisthe
>Header
sushikittAn
ilovethekit
tensushithe
kittBnisthe

Output:

23:11 >perl 1474_SoPW.pl
string is '>Header'
string is 'sushikittenilovethekittensushithekittenisthe>Header'
string is 'sushikittAnilovethekittensushithekittBnisthe'
The target string 'kitten' occurs 4 times in the file

23:11 >

Hope that helps,

Athanasius <°(((>< contra mundum Iustus alius egestas vitae, eros Piratica,

Re^4: Regexp matching on a multiline file: dealing with line breaks by BlueStarry (Novice) on Dec 06, 2015 at 14:02 UTC
Thank you very, very much. I REALLY appreciate your help and dedication on my matter.
Re^4: Regexp matching on a multiline file: dealing with line breaks by BlueStarry (Novice) on Dec 10, 2015 at 17:19 UTC
Can I ask you a question? I'm having trouble fitting your code to mine because in real life my ">Header" changes every time. It is something like />(.)+?\n/. I've tried to put the regular expression inside $\ but it doesn't seem to work. I also need to save info from the header, which complicates things further.
Re^5: Regexp matching on a multiline file: dealing with line breaks by choroba (Cardinal) on Dec 10, 2015 at 17:26 UTC
`$/` and `$\` are two different variables: input ≠ output. Also, read the perlvar entry for `$/`: "Remember: the value of $/ is a string, not a regex. awk has to be better for something. :-)" What might work, though, is `$/ = "\n>";` You'll need to remove the rest of the header from the block, though.

($q=q:Sq=~/;[c](.)(.)/;chr(-\|\|-\|5+lengthSq)`"S\|oS2"`map{chr \|+ord }map{substrSq`S_+\|`\|}3E\|-\|`7**2-3:)=~y+S\|`+$1,++print+eval$q,q,a,
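To illustrate the `$/ = "\n>"` idea, here is a minimal sketch. It assumes every record starts with a line beginning with `>` and that no other line starts with `>`. The sample data, the header names and the `read_records` helper are made up for illustration; they are not from the original poster's file.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample input: each record has a different ">..." header line.
my $input = <<'END';
>Seq1 first sample
sushikitten
ilovethekit
tensushithe
>Seq2 second sample
sushikittAn
kittBnisthe
END

# Read records separated by "\n>" and return a list of [header, body] pairs.
sub read_records {
    my ($fh) = @_;
    local $/ = "\n>";          # a plain string, not a regex
    my @records;
    while (my $record = <$fh>) {
        chomp $record;         # strips a trailing "\n>" (the current $/), if present
        $record =~ s/^>//;     # only the first record still carries its leading ">"
        # The first line is the header; everything after it is the body.
        my ($header, $body) = split /\n/, $record, 2;
        $body = '' unless defined $body;
        $body =~ s/\n//g;      # join the lines so matches can span line breaks
        push @records, [$header, $body];
    }
    return @records;
}

# Read from an in-memory filehandle so the example is self-contained.
open my $fh, '<', \$input or die "Cannot open in-memory file: $!";
for my $rec (read_records($fh)) {
    my ($header, $body) = @$rec;
    my $count = () = $body =~ /kitten/g;
    print "$header: $count match(es) in '$body'\n";
}
close $fh;
```

Because `chomp` removes whatever `$/` currently holds, the `"\n>"` separator disappears from the end of each record, and the header is then simply the first line of what remains.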
Re^6: Regexp matching on a multiline file: dealing with line breaks by Anonymous Monk on Dec 10, 2015 at 21:55 UTC
Can you elaborate on your last sentence, please? What I want to achieve is: read the header, which starts with >; store the header info in a CSV file or something similar; load the whole chunk up to the next header into memory; search for a complex regexp match; store the match and its position in a CSV file; loop.
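The "store the match and the position" step could be sketched roughly as follows, assuming the header and the newline-free body of one record are already in hand. The `match_rows` and `csv_line` helpers, the sample header and the `qr/kit+en/` pattern are placeholders for illustration only. The quoting here is deliberately simple; a CPAN module such as Text::CSV would be a more robust choice for real data.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Build one CSV line: quote every field and double any embedded quotes.
sub csv_line {
    my @fields = @_;
    return join(',', map { (my $f = $_) =~ s/"/""/g; qq("$f") } @fields) . "\n";
}

# Return one CSV line per match: header, matched text, 0-based offset in the body.
sub match_rows {
    my ($header, $body, $regex) = @_;
    my @rows;
    while ($body =~ /($regex)/g) {
        # $1 is the whole match; $-[0] is the offset where the match starts.
        push @rows, csv_line($header, $1, $-[0]);
    }
    return @rows;
}

# Hypothetical record: a header plus its body with the newlines already removed.
my $header = 'Seq1';
my $body   = 'sushikittenilovethekittensushithe';

print match_rows($header, $body, qr/kit+en/);
```

Printing to STDOUT keeps the sketch short; in a real script the rows would go to a filehandle opened for writing on the CSV file, once per record inside the reading loop.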
Re^7: Regexp matching on a multiline file: dealing with line breaks by Athanasius (Cardinal) on Dec 11, 2015 at 09:58 UTC
Re^7: Regexp matching on a multiline file: dealing with line breaks by choroba (Cardinal) on Dec 10, 2015 at 22:08 UTC