mdunnbass has asked for the wisdom of the Perl Monks concerning the following question:

Hi all,

Here's a quick question for you. I have looked at perlre, perlretut, and done some web-searching, and I am still unclear what to do.

If I have the following code:

$_ = 'TATATATATATA';
while (/TATA/g) {
    print "Matched 'TATA' at position ", pos, "\n";
}
The output I get is:
Matched 'TATA' at position 4
Matched 'TATA' at position 8
Matched 'TATA' at position 12
So, obviously, the /g leaves me positioned at the end of each successive match. But, if I wanted to also match at positions 6 and 10, would I need to respecify pos to be the beginning of the previous match +1 character? Is there a better, more straightforward way to do this? Am I just missing something simple?

Thanks
Matt

Re: position after global matches?
by BrowserUk (Patriarch) on May 29, 2007 at 16:32 UTC

    You can achieve that by using lookahead or lookbehind assertions. The former will give you the start positions of the matches; the latter gives the ends (as you've requested).

    $_ = 'TATATATATATA';
    while (/(?=TATA)/g) {
        print "Matched 'TATA' at position ", pos, "\n";
    }

    Matched 'TATA' at position 0
    Matched 'TATA' at position 2
    Matched 'TATA' at position 4
    Matched 'TATA' at position 6
    Matched 'TATA' at position 8

    $_ = 'TATATATATATA';
    while (/(?<=TATA)/g) {
        print "Matched 'TATA' at position ", pos, "\n";
    }

    Matched 'TATA' at position 4
    Matched 'TATA' at position 6
    Matched 'TATA' at position 8
    Matched 'TATA' at position 10
    Matched 'TATA' at position 12
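If both ends of each overlapping match are wanted at once, one possible sketch is to put a capture group inside the lookahead and read its offsets from Perl's built-in `@-` and `@+` arrays. The variable names `$dna` and `@spans` are made up for illustration; this is not from the reply above.

```perl
use strict;
use warnings;

my $dna = 'TATATATATATA';   # same sample string as in the question
my @spans;

# The lookahead consumes nothing, so /g advances one character at a time,
# while the capture group inside it still records where 'TATA' sits.
while ($dna =~ /(?=(TATA))/g) {
    push @spans, [ $-[1], $+[1] ];   # [start, end] offsets of group 1
}

print join(' ', map { $_->[0] . '-' . $_->[1] } @spans), "\n";
# prints: 0-4 2-6 4-8 6-10 8-12
```

The start offsets agree with the lookahead version and the end offsets with the lookbehind version.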

Re: position after global matches?
by FunkyMonk (Bishop) on May 29, 2007 at 16:26 UTC
    Two methods spring to mind

    while (/TATA/g) {
        print "Matched 'TATA' at position ", pos, "\n";
        pos() -= 2;
    }

    and

    while (/TA(?=TA)/g) {
        print "Matched 'TATA' at position ", pos, "\n";
    }

    (?=TA) is a zero-width positive look-ahead assertion and is documented in perlre.

    Both of these techniques also find the match that starts at 2, which might not be what you're after.
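To see how the two techniques differ in what `pos` reports, here is a small side-by-side sketch. It uses a named variable `$dna` instead of `$_`, and the array names are invented for the comparison.

```perl
use strict;
use warnings;

my $dna = 'TATATATATATA';

# Technique 1: match the full 'TATA', then step pos back by 2.
my @step_back;
while ($dna =~ /TATA/g) {
    push @step_back, pos($dna);
    pos($dna) -= 2;
}

# Technique 2: consume only the first 'TA' and look ahead for the second.
# pos was reset by the failed match that ended the loop above.
my @lookahead;
while ($dna =~ /TA(?=TA)/g) {
    push @lookahead, pos($dna);
}

print "step back: @step_back\n";   # step back: 4 6 8 10 12
print "lookahead: @lookahead\n";   # lookahead: 2 4 6 8 10
```

The first reports the end of each full 'TATA'. The second reports the end of the consumed 'TA', so its numbers are shifted by 2 and it never prints 12.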

Re: position after global matches?
by blazar (Canon) on May 29, 2007 at 16:28 UTC
    So, obviously, the /g leaves me positioned at the end of each successive match. But, if I wanted to also match at positions 6 and 10, would I need to respecify pos to be the beginning of the previous match +1 character? Is there a better, more straightforward way to do this? Am I just missing something simple?

    Judging from code like GrandFather's at finding number of contiguous letters, there may be other, somewhat terser ways. But personally I'd do it the way you suggested, since that is clearer to me. Of course, it will be enough to step back by 2 from the current pos.
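The approach from the question (restart one character past the beginning of the previous match) could look roughly like this. It is only a sketch: `$dna` and `@ends` are illustrative names, and `$-[0]` / `$+[0]` are Perl's built-in start and end offsets of the last successful match.

```perl
use strict;
use warnings;

my $dna = 'TATATATATATA';
my @ends;

while ($dna =~ /TATA/g) {
    push @ends, $+[0];        # end of this match, the same value pos reports
    pos($dna) = $-[0] + 1;    # resume one character past the match start
}

print "Matched 'TATA' at position $_\n" for @ends;
# prints positions 4, 6, 8, 10 and 12, one per line
```

Stepping back by a fixed 2 works here because 'TATA' can only overlap itself at that offset; restarting at start + 1 does not rely on knowing that about the pattern.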

    Fortunately, IIRC, Perl 6 will provide simple single modifiers to do this kind of thing out of the box. (In spite of those who claim it's "too much complexity". ;-)

Re: position after global matches?
by duff (Parson) on May 29, 2007 at 18:56 UTC

    Since I didn't see anyone mention this (now watch someone submit a node that does while I'm busy typing): if you don't really need a regular expression, you might be better off just using index.

    #!/usr/bin/perl
    $_ = 'TATATATATATA';
    $str = "TATA";
    $p = -1;
    while (1) {
        $p = index($_, $str, $p + 1);
        last if $p < 0;
        print "Found $str at position $p\n";
    }
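The same index loop can be wrapped in a small reusable sub. `overlapping_starts` is a hypothetical name, not part of any module; since index returns start offsets, the pattern length is added to get the end positions the question asked about.

```perl
use strict;
use warnings;

# Hypothetical helper: every start offset of $needle in $haystack,
# overlapping occurrences included.
sub overlapping_starts {
    my ($haystack, $needle) = @_;
    return unless length $needle;    # an empty needle would loop forever
    my @starts;
    my $p = -1;
    while (($p = index($haystack, $needle, $p + 1)) >= 0) {
        push @starts, $p;
    }
    return @starts;
}

my $needle = 'TATA';
my $len    = length $needle;
my @starts = overlapping_starts('TATATATATATA', $needle);
my @ends   = map { $_ + $len } @starts;

print "starts: @starts\n";   # starts: 0 2 4 6 8
print "ends:   @ends\n";     # ends:   4 6 8 10 12
```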