BrassMonkey has asked for the wisdom of the Perl Monks concerning the following question:

Here is my text:

>>>cat temp.txt
my dog is (red1)
my cat is (blue1)
my dog is (red2)
my cat is (blue2)
my dog is (red3)
my cat is (blue3)
my dog is (red4)
my cat is (blue4)
my dog is (red5)
my cat is (blue5)
my dog is (red6)
my cat is (blue6)

Here is my code:

#!/usr/bin/perl
$mystring = join("", <>);
#print $&;
print "-----------mystring--------------------------\n";
print $mystring;
print "------------variables----------------------------\n";
if ($mystring =~ m/my.*?\((\S*?)\)/g) {
#if ($mystring =~ s/my.*?\((\S*?)\)/xxxxxxx/g) {
    print "$1 \n";
    print "-----------mystring again--------------------------\n";
    print $mystring;
    print "-----------------------------------------\n";
}

Here is the output:

>>>>cat temp.txt | perl temp.pl
-----------mystring--------------------------
my dog is (red1)
my cat is (blue1)
my dog is (red2)
my cat is (blue2)
my dog is (red3)
my cat is (blue3)
my dog is (red4)
my cat is (blue4)
my dog is (red5)
my cat is (blue5)
my dog is (red6)
my cat is (blue6)
------------variables----------------------------
red1
-----------mystring again--------------------------
my dog is (red1)
my cat is (blue1)
my dog is (red2)
my cat is (blue2)
my dog is (red3)
my cat is (blue3)
my dog is (red4)
my cat is (blue4)
my dog is (red5)
my cat is (blue5)
my dog is (red6)
my cat is (blue6)
-----------------------------------------

What I can't figure out is how to print each match as it is found. If I do a global replace (the commented-out line), it replaces all of them, so the regex is able to match each one. I'm sure I'm missing an easy Perl function or special variable. I've looked through the Perl bookshelf and also in the frequent questions of perlmonks.com.

Basically I need to iterate over the matches. I tried a while statement and that didn't work either.

Matching each line separately with a while <> loop won't work either. This is just a sample, but the pattern I'm really matching spans multiple lines. The join is necessary for that.

Replies are listed 'Best First'.
Re: variables in global match in regex
by davido (Cardinal) on Aug 11, 2004 at 16:07 UTC

    In scalar context, a regexp with the /g modifier finds the next (in your case, the first) match and returns true. Each further evaluation against the same string resumes where the previous match ended, so evaluating the regexp again would find the second match, and so on.

    Thus, this sort of construct might be what you're looking for:

    while ( $mystring =~ /(regexp)/g ) { do_something_with($1); }

    Alternatively, evaluating the regexp with a /g modifier in list context will return a list of all matches without explicitly looping.

    if ( my( @matches ) = $mystring =~ m/(regexp)/g ) { do_something_with( @matches ); }

    Hope this helps. ...see perlrequick for a brief description of using /g in list and scalar context while capturing matches.
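
    As a rough sketch of the two forms above, applied to a shortened version of the sample text. The `$text` variable and the `captures_scalar` and `captures_list` helpers are names made up for this illustration, not part of the original post:

```perl
use strict;
use warnings;

# Shortened sample text; the variable and sub names are illustrative only.
my $text = "my dog is (red1)\nmy cat is (blue1)\nmy dog is (red2)\n";

# Scalar context: each evaluation of m//g resumes where the previous match ended.
sub captures_scalar {
    my ($string) = @_;
    my @found;
    while ( $string =~ m/my.*?\((\S*?)\)/g ) {
        push @found, $1;
    }
    return @found;
}

# List context: a single evaluation returns every captured group at once.
sub captures_list {
    my ($string) = @_;
    my @found = $string =~ m/my.*?\((\S*?)\)/g;
    return @found;
}

print join( ",", captures_scalar($text) ), "\n";
print join( ",", captures_list($text) ), "\n";
```

    Both helpers should collect the same captures; the list form simply holds them all in memory at once.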

    Also, you could slurp the file instead of joining all the lines together by setting the special variable $/ to undef. Then you might use the /s modifier on your regexp in addition to the /g modifier if your regexp uses the . (dot) metacharacter and you want it to accept \n newlines the same as any other character. Just a thought. See perlvar for a description of $/.
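
    A minimal sketch of that suggestion, assuming a made-up input in which each record spans two lines. The `$data` string and the in-memory filehandle are stand-ins for the real file, chosen only so the example is self-contained:

```perl
use strict;
use warnings;

# Hypothetical input in which a single record spans two lines.
my $data = "my dog\nis (red1)\nmy cat\nis (blue1)\n";

# An in-memory filehandle stands in for the real file or STDIN.
open my $fh, '<', \$data or die "cannot open in-memory handle: $!";

# With $/ undefined, one read returns the whole input ("slurp" mode).
# local restores the previous value of $/ when the block exits.
my $slurped = do { local $/; <$fh> };
close $fh;

# /s lets . match "\n", so "my ... (" may cross a line break.
my @with_s = $slurped =~ m/my.*?\((\S*?)\)/gs;

# Without /s the dot stops at newlines, so nothing matches this input.
my @without_s = $slurped =~ m/my.*?\((\S*?)\)/g;

print "with /s: @with_s\n";
print "without /s: ", scalar(@without_s), " matches\n";
```

    Using local inside a do block keeps the change to $/ from leaking into the rest of the program, which a bare $/ = undef would not.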


    Dave

Re: variables in global match in regex
by ikegami (Patriarch) on Aug 11, 2004 at 16:09 UTC
    Does this help:
    while ($mystring =~ m/my.*?\((\S*?)\)/g) {
        print "$1 \n";
    }

    output:
    red1
    blue1
    red2
    blue2
    red3
    blue3
    red4
    blue4
    red5
    blue5
    red6
    blue6
Re: variables in global match in regex
by Roger (Parson) on Aug 11, 2004 at 16:14 UTC
    A simple trick is to do a dummy replacement, ie, replace the matching text with itself, while doing an evaluation on every single match with the 'e' modifier.
    #!/usr/bin/perl -w
    use strict;

    my $text = do { local $/; <DATA> };

    # $& is the whole match, so each match is replaced with itself;
    # the print runs once per match as a side effect of /e.
    $text =~ s/my.*?\((\w+)\)/print "$1\n"; $&/ge;

    __DATA__
    my dog is (red1)
    my cat is (blue1)
    my dog is (red2)
    my cat is (blue2)
    my dog is (red3)
    my cat is (blue3)
    my dog is (red4)
    my cat is (blue4)
    my dog is (red5)
    my cat is (blue5)
    my dog is (red6)
    my cat is (blue6)

    And the output is
    red1
    blue1
    red2
    blue2
    red3
    blue3
    red4
    blue4
    red5
    blue5
    red6
    blue6

Re: variables in global match in regex
by diotalevi (Canon) on Aug 11, 2004 at 16:24 UTC

    While you are at it, be sure to get into the habit of including the lines use strict; use warnings;. They'll make you do a little bit of work up front in code-cleanliness but they pay off in spades when they prevent a number of common bugs from occurring.

Re: variables in global match in regex
by BrassMonkey (Initiate) on Aug 11, 2004 at 16:18 UTC
    Thanks much for all the help. I swear I tried the while and it didn't work. I must have messed it up. Thanks much Monks, That funky monkey
Re: variables in global match in regex
by GreyGlass (Sexton) on Aug 11, 2004 at 19:48 UTC
    In order for m/<some stuff>/g to return multiple matches, you need to either evaluate it in list context or repeatedly evaluate the same match in scalar context.

    The scalar match can be done with a while loop, as shown in other replies. Note that the same string variable must be matched each time, so if you used

        sub foo { $mystring }
        while (foo() =~ m/blah../g) { ... }

    you would get the first match over and over, since each call matches against a fresh copy of the string, and a fresh copy has no remembered match position. The list-context match can be iterated with foreach, e.g.:

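        $/ = undef;
        # The whole list of captures is built before the loop starts, so each
        # one arrives in $_; $1 would hold only the last capture by then.
        for (<> =~ /\((.*?)\)/g) { print $_, "\n" }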

    while _repeatedly_ evaluates its condition in scalar context, foreach repeatedly executes its body for values of the list obtained by evaluating the argument _once_ in list context. Btw, this means you do not want to use foreach if your list is way too big for your memory...
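
    The point about matching the same string can be made visible with pos(), which reports where the next /g match on a variable will resume. This is only a sketch: the sample string is made up, and the loop over foo() is capped at three attempts because an uncapped while loop would never finish.

```perl
use strict;
use warnings;

my $mystring = "(a) (b) (c)";

# Matching the same variable: pos() records where the next /g match resumes.
my @positions;
while ( $mystring =~ /\((.*?)\)/g ) {
    push @positions, pos($mystring);
}
print "resume offsets: @positions\n";

# A sub that returns the string hands back a fresh copy on each call,
# and a fresh copy has no saved position, so the match restarts at offset 0.
sub foo { return $mystring }

my @first_matches;
for my $attempt ( 1 .. 3 ) {    # bounded here; a while loop would never end
    if ( foo() =~ /\((.*?)\)/g ) {
        push @first_matches, $1;
    }
}
print "matches via foo(): @first_matches\n";
```

    Copying the string into a variable once, before the loop, and matching against that variable avoids the problem.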