abcd has asked for the wisdom of the Perl Monks concerning the following question:

Hello, I started using Perl (or any language, for that matter) a couple of days ago and am stuck on a problem. I am trying to find a word in a string and print the matched word along with the 2 characters preceding and following the match. E.g., if searching for the word “perl” in the string 123perl456perl789perl10, the output should be 23perl45 56perl78 89perl10. Here is the code I have written: while ($string=~m/(.{2}perl.{2})/g) {print results "$1\n"} The problem is that I get 23perl45 and 89perl10 but not 56perl78. I am assuming that I need to somehow change the place where the g modifier resumes after the match, but I don't know how to do that. Thanks. Thanks for the replies everyone! That solved it. (Dont know why my posts lose all formatting and paragraphs though)

Replies are listed 'Best First'.
Re: Bgeinner regex question
by choroba (Cardinal) on Mar 27, 2016 at 19:04 UTC
    You're on the right track. The global matching starts at the position where the last match ended. You can use \G to mark the position, but you can't move it backwards. For overlapping matches, you need look around assertions:
    my $string = '123perl456perl789perl10';
    while ($string =~ /(?<=(..))perl(?=(..))/g) {
        print "$1perl$2\n";
    }

    Where (?<=...) means "is immediately preceded by", and (?=...) stands for "is immediately followed by". The look around assertions are zero-width: the text they examine is not part of the match (that's why you need to add capturing parentheses inside the assertions), and they don't advance the position of the match.
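To see where the next search starts in each case, here is a small sketch that records pos() after every match, first with the pattern from the question and then with the look around version. The array names are made up for illustration, and the output is collected in arrays rather than printed to the poster's filehandle.

```perl
use strict;
use warnings;

my $string = '123perl456perl789perl10';

# Pattern from the question: the two trailing characters are consumed,
# so the next search starts after them and skips the middle occurrence.
my @consuming;
while ($string =~ /(.{2}perl.{2})/g) {
    push @consuming, sprintf '%s (pos=%d)', $1, pos $string;
}

# Look around version: only "perl" itself is consumed, so the characters
# after it are still available as context for the next match.
my @lookaround;
while ($string =~ /(?<=(..))perl(?=(..))/g) {
    push @lookaround, sprintf '%s (pos=%d)', "$1perl$2", pos $string;
}

print "consuming:  @consuming\n";
print "lookaround: @lookaround\n";
```

With the consuming pattern the position after the first match is already past the "56" that the second match would need, which is why that occurrence is never reported.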

    ($q=q:Sq=~/;[c](.)(.)/;chr(-||-|5+lengthSq)`"S|oS2"`map{chr |+ord }map{substrSq`S_+|`|}3E|-|`7**2-3:)=~y+S|`+$1,++print+eval$q,q,a,
Re: Bgeinner regex question
by AnomalousMonk (Archbishop) on Mar 27, 2016 at 19:23 UTC

    This is what I think of as the standard approach to this sort of problem: it's my reflexive First Thought whenever I hear or read "overlapping match":

    c:\@Work\Perl>perl -wMstrict -le
    "my $s = '123perl456perl789perl10';
     ;;
     my @caps = $s =~ m{ (?= (.. perl ..)) }xmsg;
     ;;
     print qq{'$_'} for @caps;
    "
    '23perl45'
    '56perl78'
    '89perl10'
    It has the advantages of using only a single capture group, and of needing no look-behind, which has the limitation in Perl regex of being fixed width.
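The same idea can be wrapped in a small script. This is only a sketch: the sub name overlapping_context and its parameters are hypothetical, not something from the thread or from any module.

```perl
use strict;
use warnings;

# Hypothetical helper: return every occurrence of $word together with
# exactly $n characters of context on each side, overlaps included.
sub overlapping_context {
    my ($string, $word, $n) = @_;
    # The whole pattern sits inside a lookahead, so each match is
    # zero-width and the engine retries from the next character.
    # \Q...\E quotes any regex metacharacters in $word.
    return $string =~ m{ (?= ( .{$n} \Q$word\E .{$n} ) ) }xmsg;
}

my @caps = overlapping_context('123perl456perl789perl10', 'perl', 2);
print "'$_'\n" for @caps;
```

Because the match is evaluated in list context with /g, it returns the contents of the capture group for every position where the lookahead succeeds.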

    Update: Please see perlre, perlretut (especially Looking ahead and looking behind), and perlrequick.


    Give a man a fish:  <%-{-{-{-<

Re: Bgeinner regex question
by Anonymous Monk on Mar 27, 2016 at 19:06 UTC
    Normally, regex matches don't overlap. Here's one article that explains how to solve it with a lookahead assertion: http://linuxshellaccount.blogspot.com/2008/09/finding-overlapping-matches-using-perls.html

    my $string = "123perl456perl789perl10";
    while ($string=~m/(.{2}perl(?=(.{2})))/gp) {
        print "$1$2\n"
    }
Re: Bgeinner regex question
by AnomalousMonk (Archbishop) on Mar 28, 2016 at 15:16 UTC
    (Dont know why my posts lose all formatting and paragraphs though)

    That's because you haven't put any formatting into your posts! PerlMonks posts are HTML formatted by the poster (or by a janitor if you've been really naughty). Please see Writeup Formatting Tips and at the very least use  <p> ... </p> (paragraph) tags for text and  <c> ... </c> (code) tags for code/data/input/output.


    Give a man a fish:  <%-{-{-{-<