Hello BlueStarry, and welcome to the Monastery!
If the entire file will fit in memory, a variation on kennethk’s solution is to simply delete the newlines before searching:
#! perl
use strict;
use warnings;

my $target = 'kitten';
my $string = do { local $/; <DATA>; };
$string =~ s/\n//g;
my $count = () = $string =~ /\Q$target/g;
print "The target string '$target' occurs $count times in the file\n";

__DATA__
sushikitten
ilovethekit
tensushithe
kittenisthe
Output:
14:28 >perl 1474_SoPW.pl
The target string 'kitten' occurs 3 times in the file

14:28 >
However, as your input file is 5 GB, this approach is probably impractical, in which case you’re going to have to bite the bullet and implement a solution with “strange buffers”, such as a sliding window technique. Maybe have a look at Data::Iterator::SlidingWindow.
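For what it's worth, here is a rough sketch of the sliding-window idea using only the built-in read(), not Data::Iterator::SlidingWindow (whose interface I haven't checked). The sub name count_in_stream, the chunk-size parameter, and the 1 MiB default are all my own assumptions. The file is read one chunk at a time, the newlines are deleted from each chunk, and the last length($target) - 1 characters are carried over so that a match straddling two chunks is still seen exactly once. One caveat: for a target that can overlap itself (e.g. 'aa'), the count may differ from the slurp-everything version.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): count occurrences of
# $target in the stream $fh, ignoring newlines, without slurping the file.
sub count_in_stream {
    my ($fh, $target, $chunk_size) = @_;
    return 0 unless length $target;
    $chunk_size ||= 1 << 20;            # assumed default: 1 MiB per read

    my $keep   = length($target) - 1;   # longest tail that could begin a match
    my $buffer = '';
    my $count  = 0;

    while (read($fh, my $chunk, $chunk_size)) {
        $chunk =~ tr/\n//d;             # delete newlines, as in the in-memory version
        $buffer .= $chunk;

        my $found = () = $buffer =~ /\Q$target\E/g;
        $count += $found;

        # Keep only the tail; it is too short to hold a complete match,
        # so nothing already counted can be counted again.
        $buffer = substr($buffer, length($buffer) - $keep)
            if length($buffer) > $keep;
    }
    return $count;
}

# Small demonstration on an in-memory filehandle, with a deliberately tiny
# chunk size so that matches straddle chunk boundaries.
my $data = "sushikitten\nilovethekit\ntensushithe\nkittenisthe\n";
open my $fh, '<', \$data or die "Cannot open in-memory handle: $!";
my $count = count_in_stream($fh, 'kitten', 4);
close $fh;
print "The target string 'kitten' occurs $count times\n";
```

For the real 5 GB file you would open it by name and let the chunk size default (or tune it); memory use then stays at roughly one chunk plus the short tail.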
Hope that helps,
| Athanasius <°(((>< contra mundum | Iustus alius egestas vitae, eros Piratica, |
In reply to Re: Regexp matching on a multiline file: dealing with line breaks
by Athanasius
in thread Regexp matching on a multiline file: dealing with line breaks
by BlueStarry