belg4mit has asked for the wisdom of the Perl Monks concerning the following question:

I apologize if the answer to this is obvious or readily found; it does not seem to be. I searched QandASection: regular expressions, perlfunc, and Super Search, though its limit of 4+ characters made searching for the obvious impossible.

Brother smgfc asked in the chatterbox for help concerning an infinite loop involving

# substr no good, this is a simplified regexp
while( ($b, $c) = $a =~ /(.)(.)/g ){
  #do stuff
}
I surmised that we needed to track pos for ourselves. Fine. Printing it from within the loop works just fine. However, attempting to set a scalar to the value returned by pos doesn't. Trying to be clever, I also tried to side-step the issue with
$p = pos($a) ? 1 : 0;
which also proceeded to do a whole lot of nothing. So I inquire: what is this bizarre behavior, and why?

--
perl -pe "s/\b;([st])/'\1/mg"

(jeffa) Re: puzzled by pos
by jeffa (Bishop) on Mar 30, 2002 at 01:02 UTC
    Well, since i don't know the answer to your pos question, allow me to do another side step and solve the original while loop/regex combo.

    The problem is that in order for the while loop to finish, there has to be nothing left to match. Try this instead:

    my $a = '12345';
    while($a =~ s/(.)(.)//) {
        my ($b,$c) = ($1,$2);
        print "$b - $c\n";
    }
    But notice that '5' gets skipped. Better than getting caught in an infinite loop i suppose. I still recommend testing for an even-length string first. Another consequence is that $a is completely consumed if it has an even number of chars, and is left holding only the last char if the length was odd.
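
    A minimal sketch of one way around both consequences, assuming it is acceptable to let s/// consume a copy rather than the original. The names $orig, $work and @pairs are illustrative only and do not come from the thread:

```perl
use strict;
use warnings;

my $orig = '12345';     # sample input with an odd length
my $work = $orig;       # s/// eats this copy, so $orig survives
my @pairs;

# Same substitution as above: strip two characters per iteration.
while ( $work =~ s/(.)(.)// ) {
    push @pairs, "$1 - $2";
}

print "$_\n" for @pairs;
# Whatever could not form a pair is still sitting in $work.
print "left over: '$work'\n" if length $work;
```

    Whether the leftover character should be reported, padded, or treated as an error depends on the data, so the sketch only prints it.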

    UPDATE:
    Your CB comment is absolutely correct - i was wrong in assuming that you have to have s/// to solve this one:

    while($a =~ /(.)(.)/g){
        my ($b,$c) = ($1,$2);
        print "$b - $c\n";
    }

    jeffa

    L-LL-L--L-LL-L--L-LL-L--
    -R--R-RR-R--R-RR-R--R-RR
    B--B--B--B--B--B--B--B--
    H---H---H---H---H---H---
    (the triplet paradiddle with high-hat)
    
Re: puzzled by pos
by zengargoyle (Deacon) on Mar 30, 2002 at 01:07 UTC

    Ditto on jeffa's response, but no need to resort to s//.

    perl -e '
      $a="hello world!";
      while ( $a=~/\G(.)(.)/g ) {
        ($b,$c)=($1,$2);
        print"$b:$c\n";
      }
    '
    h:e
    l:l
    o: 
    w:o
    r:l
    d:!

      Yes yes :-), but the CB comment jeffa points out is that smgfc was aware of this (I got sidetracked). And that's all fine and dandy for solving the matter at hand, but it's not nice to avoid things simply because you don't understand them :-D.

      UPDATE: Of course a) jeffa had not made his update by the time you posted (kind of) and b) I left part of the story out in the original post.

      --
      perl -pe "s/\b;([st])/'\1/mg"

Re: puzzled by pos
by belg4mit (Prior) on Mar 30, 2002 at 02:50 UTC
    crazyinsomniac pointed me at japhy's book, which shows it's possible. For a while it then seemed it could only be done with $_. (Though one could sidestep the issue with @+.) Zaxo and crazyinsomniac then proceeded to set me straight. I'm not sure what was wrong when I played around with it earlier... But then again I have had odd things happen today, such as forms on a page filling in values and submitting themselves...
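
    For the record, a small sketch of what does work, assuming the match is evaluated in scalar context. The variable names ($text, $p, $end, @saved) are made up for illustration:

```perl
use strict;
use warnings;

my $text = 'hello world!';   # sample input
my @saved;

# Scalar-context //g: one match per iteration, pos() advances each time.
while ( $text =~ /(.)(.)/g ) {
    my $p   = pos($text);    # offset just past the current match
    my $end = $+[0];         # the same offset, read from @+ instead
    push @saved, [ $p, $end ];
    print "$1:$2 pos=$p end=$end\n";
}
```

    Both pos and $+[0] report the end offset of the last successful match here, so either can be stored in a scalar from inside the loop.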

    --
    perl -pe "s/\b;([st])/'\1/mg"

Re: puzzled by pos
by gmax (Abbot) on Mar 30, 2002 at 14:31 UTC
    jeffa's and zengargoyle's solutions both avoid an infinite loop, but miss the last character if the length is odd.
    Here's my try.
    #!/usr/bin/perl -w
    use strict;
    my $x="hello world"; # 11 chars
    while ( $x =~ /(.)(.)?/g ) {
        my ( $y, $z ) = ( $1, $2 );
        $z = "" unless defined $z; # prevents the warning
                                   # "use of uninitialized ..."
                                   # (defined keeps a literal "0")
        print"<$y:$z>\n";
    }
    __END__
    <h:e>
    <l:l>
    <o: >
    <w:o>
    <r:l>
    <d:>
    "hello world!" is 12 characters long, and the plain two-capture loop works fine on it. Remove the bang at the end and it becomes 11 characters, and that loop misses the final character.
    But with the second capture made optional, the loop runs to the end, and the only other thing you need to do is check whether the second capture is defined.

    I don't see the need of bothering with either pos or the inchworm method (/\G   /g) in this case.
     _  _ _  _
    (_|| | |(_|><
     _|

Re: puzzled by pos
by YuckFoo (Abbot) on Mar 30, 2002 at 22:03 UTC
    belg4mit,

    This appeared to me to be odd behavior too. I guess the answer is context. According to the Camel, Chapter 5: Pattern Matching, /g in list context returns a list of all matches. Only /g in scalar context performs a progressive match.

    So attempting to assign values in the while expression:

    while( ($b, $c) = $a =~ /(.)(.)/g ){

    caused list context, thus no progressive match and no pos.
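
    A minimal sketch of the difference, assuming a sample string 'abcdef' (all variable names are illustrative, not from the original loop). In list context //g hands back every capture at once and leaves pos undefined, so the list assignment in the while condition keeps yielding a non-zero count and the loop never ends; in scalar context each evaluation advances pos by one match:

```perl
use strict;
use warnings;

my $str = 'abcdef';    # sample input, chosen for illustration

# List context: one evaluation returns all captures, and pos() ends up undef.
my @all = $str =~ /(.)(.)/g;
my $pos_after_list = pos($str);

# A list assignment evaluated in scalar context (as in a while condition)
# yields the number of elements on the right-hand side, the same every time.
my $count = ( my ( $first, $second ) = $str =~ /(.)(.)/g );

# Scalar context: each evaluation matches once and moves pos() forward.
my @positions;
while ( $str =~ /(.)(.)/g ) {
    push @positions, pos($str);
}

print "all captures: @all\n";
print "pos after list match: ",
    ( defined $pos_after_list ? $pos_after_list : 'undef' ), "\n";
print "condition value: $count\n";
print "pos after each scalar match: @positions\n";
```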

    YuckFoo