in reply to Re^5: Regexp matching on a multiline file: dealing with line breaks
in thread Regexp matching on a multiline file: dealing with line breaks

Can you elaborate more on your last sentence, please?

What I want to achieve is:

Read the header, which starts with >, and store the header info in a CSV file or something similar. Load the whole chunk up to the next header into memory. Search it for a complex regexp match, and store the match and its position in a CSV file. Loop.
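The loop described above could be sketched roughly as follows. This is only a minimal illustration, not a definitive implementation: the helper name `scan_records`, the sample data, and the plain `kitten` pattern are all hypothetical, and it assumes the file starts with a header line and that headers contain no commas (otherwise a CSV module such as Text::CSV would be the safer choice for output).

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns one [header, match, offset] row per regex hit.
sub scan_records {
    my ($fh, $regex) = @_;
    my @rows;
    local $/ = "\n>";                 # one record = header line + sequence lines
    while (my $record = <$fh>) {
        chomp $record;                # strip the trailing "\n>" separator, if present
        $record =~ s/^>//;            # only the very first record still has its '>'
        my ($header, $seq) = split /\n/, $record, 2;
        next unless defined $seq;     # header without any data
        $seq =~ s/\n//g;              # join wrapped lines so matches can span them
        while ($seq =~ /$regex/g) {
            # $-[0] and $+[0] are the 0-based start and end of the last match
            my $match = substr $seq, $-[0], $+[0] - $-[0];
            push @rows, [$header, $match, $-[0]];
        }
    }
    return @rows;
}

# Sample data in an in-memory filehandle (an assumption for the demo);
# a real file would be opened with open(my $fh, '<', $path).
my $data = ">Header1\nsushikitten\nilovethekit\ntensushithe\n"
         . ">Header2\nkittBn\nkitten\n";
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";
my @rows = scan_records($fh, qr/kitten/);
close $fh;

print "header,match,position\n";
print join(',', @$_), "\n" for @rows;
```

Positions here are offsets into the sequence after the line breaks have been removed, so they count residues rather than bytes in the file. Only one record is held in memory at a time.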

Replies are listed 'Best First'.
Re^7: Regexp matching on a multiline file: dealing with line breaks
by Athanasius (Archbishop) on Dec 11, 2015 at 09:58 UTC

    I think choroba is proposing a solution along these lines:

        #! perl
        use strict;
        use warnings;

        my $target = 'kitten';
        my $count  = 0;

        {
            local $/ = "\n>";
            my $first = 1;

            while (my $string = <DATA>)
            {
                if ($first)
                {
                    next unless $string =~ /^>/;
                }

                (my $header, $string) = split /\n/, $string, 2;
                printf "Header: '%s%s'\n", ($first ? '' : '>'), $header;
                $string =~ s/\n//g;
                print "string is '$string'\n";
                $count += () = $string =~ /\Q$target/g;
            }
            continue
            {
                $first = 0;
            }
        }

        print "The target string '$target' occurs $count times in the file\n";

        __DATA__
        not a header
        kittens
        >Header1
        sushikitten
        ilovethekit
        tensushithe
        kittenisthe
        >Header2
        sushikittAn
        ilovethekit
        tensushithe
        kittBnisthe

    Output:

        19:57 >perl 1474a_SoPW.pl
        Header: '>Header1'
        string is 'sushikittenilovethekittensushithekittenisthe>'
        Header: '>Header2'
        string is 'sushikittAnilovethekittensushithekittBnisthe'
        The target string 'kitten' occurs 4 times in the file

        19:57 >
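Note the trailing `>` in the first string of the output above: with `$/` set to `"\n>"`, each record except the last ends with that separator, and removing the newlines leaves the `>` behind. One way to avoid it is to `chomp` the record while `$/` is still localized, as in this minimal sketch (the shortened sample data and the in-memory filehandle are assumptions for the demo, not part of the code above):

```perl
use strict;
use warnings;

my $data = ">Header1\nsushikitten\nkittenisthe\n>Header2\nsushikittAn\n";
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";

my @strings;
{
    local $/ = "\n>";               # same record separator as in the code above
    while (my $record = <$fh>) {
        chomp $record;              # removes a trailing "\n>" (the current $/), if present
        my (undef, $string) = split /\n/, $record, 2;   # discard the header line
        $string =~ s/\n//g;         # the last record still ends in a plain "\n"
        push @strings, $string;
    }
}
close $fh;

print "string is '$_'\n" for @strings;
```

Because `chomp` removes exactly the current value of `$/`, it has to be called inside the block where `$/` is localized.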

    Hope that helps,

    Athanasius <°(((><contra mundum Iustus alius egestas vitae, eros Piratica,

Re^7: Regexp matching on a multiline file: dealing with line breaks
by choroba (Cardinal) on Dec 10, 2015 at 22:08 UTC
    Oh, sorry, I lost track, I thought you were interested in the data, not the header.
    ($q=q:Sq=~/;[c](.)(.)/;chr(-||-|5+lengthSq)`"S|oS2"`map{chr |+ord }map{substrSq`S_+|`|}3E|-|`7**2-3:)=~y+S|`+$1,++print+eval$q,q,a,