in reply to Re^2: Regexp matching on a multiline file: dealing with line breaks
in thread Regexp matching on a multiline file: dealing with line breaks

As Laurent_R says, this is an excellent strategy. Have a look at the entry for $INPUT_RECORD_SEPARATOR (usually spelled just $/) in perlvar. For example:

#! perl
use strict;
use warnings;

my $target = 'kitten';
my $count = 0;
$/ = ">Header\n";

{
    local $/ = ">Header\n";

    while (my $string = <DATA>)
    {
        $string =~ s/\n//g;
        print "string is '$string'\n";
        $count += () = $string =~ /\Q$target/g;
    }
}

print "The target string '$target' occurs $count times in the file\n";

__DATA__
>Header
sushikitten
ilovethekit
tensushithe
kittenisthe
>Header
sushikittAn
ilovethekit
tensushithe
kittBnisthe

Output:

23:11 >perl 1474_SoPW.pl
string is '>Header'
string is 'sushikittenilovethekittensushithekittenisthe>Header'
string is 'sushikittAnilovethekittensushithekittBnisthe'
The target string 'kitten' occurs 4 times in the file

23:11 >

Hope that helps,

Athanasius <°(((><contra mundum Iustus alius egestas vitae, eros Piratica,

Re^4: Regexp matching on a multiline file: dealing with line breaks
by BlueStarry (Novice) on Dec 06, 2015 at 14:02 UTC
    Thank you very, very much. I REALLY appreciate your help and dedication on my matter.
Re^4: Regexp matching on a multiline file: dealing with line breaks
by BlueStarry (Novice) on Dec 10, 2015 at 17:19 UTC
    Can I ask you a question? I'm having trouble fitting your code to mine because in real life my ">Header" changes every time. It is something like />(.)+?\n/

    I've tried to put the regular expression inside $\ but it doesn't seem to work.

    I also need to save info from the header, which complicates things further.
      $/ and $\ are two different variables. Input ≠ output.
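
      To make the difference concrete, here is a minimal sketch. The sample text and the variable names are made up for illustration. $/ decides where a read ends a record; $\ is appended to whatever print writes.

```perl
use strict;
use warnings;

my $text = "a\nb>c\nd>e\n";      # made-up sample input

# $/ is the INPUT record separator: readline ends each record at it.
my @records;
{
    local $/ = '>';
    open my $in, '<', \$text or die "Cannot open in-memory file: $!";
    @records = <$in>;
    close $in;
}

# $\ is the OUTPUT record separator: print appends it after its arguments.
my $printed = '';
{
    local $\ = "!\n";
    open my $out, '>', \$printed or die "Cannot open in-memory file: $!";
    print {$out} 'x';
    print {$out} 'y';
    close $out;
}

print "read ", scalar(@records), " records\n";
print "wrote: $printed";
```

      Setting $\ therefore has no effect on how the file is split while reading.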

      Also, read the perlvar entry for $/:

      Remember: the value of $/ is a string, not a regex. awk has to be better for something. :-)
      What might work, though, is
      $/ = "\n>";

      You'll need to remove the rest of the header from the block, though.
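
      One way this could look, as a rough sketch rather than a drop-in solution: the sample data and the sub name read_records are invented here, and the code assumes that every header occupies exactly one line starting with ">".

```perl
use strict;
use warnings;

# Made-up sample data; real headers would vary from record to record.
my $data = <<'END';
>Header one
sushikitten
ilovethekit
tensushithe
>Header two
kittBnisthe
END

# Hypothetical helper: returns a list of [header, body] pairs.
sub read_records {
    my ($fh) = @_;
    local $/ = "\n>";                 # a new record starts at "\n>"
    my @records;
    while (my $chunk = <$fh>) {
        chomp $chunk;                 # chomp removes the current $/, i.e. "\n>"
        $chunk =~ s/\A>//;            # the very first record still has its ">"
        # Everything up to the first newline is the header, the rest is the body.
        my ($header, $body) = split /\n/, $chunk, 2;
        $body = '' unless defined $body;
        $body =~ s/\n//g;             # join the body lines into one string
        push @records, [ $header, $body ];
    }
    return @records;
}

open my $fh, '<', \$data or die "Cannot open in-memory file: $!";
my @records = read_records($fh);
close $fh;

print "header '$_->[0]', body '$_->[1]'\n" for @records;
```

      Because the separator is a fixed string, the header text is free to change from record to record; only the leading ">" at the start of a line has to stay the same.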

      ($q=q:Sq=~/;[c](.)(.)/;chr(-||-|5+lengthSq)`"S|oS2"`map{chr |+ord }map{substrSq`S_+|`|}3E|-|`7**2-3:)=~y+S|`+$1,++print+eval$q,q,a,

        Can you elaborate more on your last sentence, please?

        What I want to achieve is:

        Read the header, which starts with >. Store the header info in a CSV file or something similar. Load the whole chunk up to the next header into memory. Search for a complex regexp match, and store the match and its position in a CSV file. Loop.
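
        A possible shape for that loop, sketched under several assumptions: the sample data, the pattern and the csv_line helper are all invented for illustration, positions are reported as 0-based offsets into the joined chunk, and the rows are collected in arrays instead of being written to real files.

```perl
use strict;
use warnings;

# Made-up sample input; in real use, open the actual file instead.
my $data = <<'END';
>seq1 first sample
sushikitten
ilovethekit
tensushithe
>seq2 second sample
kittBnisthe
END

my $pattern = qr/kitt[a-z]n/i;        # placeholder for the "complex regexp"

# Hypothetical helper doing minimal CSV quoting:
# wrap each field in double quotes and double any inner quotes.
sub csv_line {
    my @fields = @_;
    for my $f (@fields) {
        $f =~ s/"/""/g;
        $f = qq{"$f"};
    }
    return join(',', @fields) . "\n";
}

my (@header_rows, @match_rows);

open my $fh, '<', \$data or die "Cannot open in-memory file: $!";
{
    local $/ = "\n>";                 # each record runs up to the next header
    while (my $chunk = <$fh>) {
        chomp $chunk;                 # drop the trailing "\n>"
        $chunk =~ s/\A>//;            # only the first record keeps its ">"
        my ($header, $body) = split /\n/, $chunk, 2;
        $body = '' unless defined $body;
        $body =~ tr/\n//d;            # whole chunk as one string in memory

        push @header_rows, csv_line($header);
        while ($body =~ /($pattern)/g) {
            # $-[0] is the offset at which the current match starts
            push @match_rows, csv_line($header, $1, $-[0]);
        }
    }
}
close $fh;

print "header rows:\n", @header_rows;
print "match rows:\n",  @match_rows;
```

        To produce real files, the two arrays could be printed to filehandles opened for writing. For anything beyond simple fields, a dedicated module such as Text::CSV from CPAN is likely a more robust choice than the hand-rolled quoting used here.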