goff has asked for the wisdom of the Perl Monks concerning the following question:

Hi everyone! i haven't used Perl in a couple of years so there's probably a really obvious answer to this that i'm missing...

what i'm trying to do is match occurrences like 'test zzzzzzzzzzzzzzz test' in a sample string and print them out. the first match prints, but none of the others - any clues as to why?

(oh i'm running this on windows xp, using activestate perl if that helps)
thanks!

#!perl
my $text = "test1 zzzzzzzzzzzzzzz test2 test3 test4 test5 zzzzzzzzzzzzzzz test6 test7 zzzzzzzzzzzzzzz test8 test9 test10";

# search for a-z0-9_ followed by whitespace then zzzzzzzzz then whitespace then a-z0-9_
if ( $text =~ /([a-zA-Z0-9]+\s+[z]{15}\s+[a-z0-9]+)/ ) {
    print "string \$1: " . $1 . "!\n";
    print "string \$2: " . $2 . "!\n";
}
else {
    print "string not found...!\n";
}
print "text is " . $text . "\n";

output is
C:\perlcode>perl regex.pl

string $1: test1 zzzzzzzzzzzzzzz test2!

string $2: !

text is : test1 zzzzzzzzzzzzzzz test2 test3 test4 test5 zzzzzzzzzzzzzzz test6 test7 zzzzzzzzzzzzzzz test8 test9 test10

Re: regex question
by bobf (Monsignor) on Jul 22, 2006 at 19:56 UTC

    You're close. I added strict and warnings, moved the regex into a while loop, added the /g modifier to get all matches, and deleted $2 since you only have one set of capturing parentheses (see perlre). I didn't touch the regex itself, but please note that \w matches "word" characters (alphanumeric plus "_"). If this is sufficient, you could improve the readability of the regex by using it instead of the character classes.

    use strict;
    use warnings;

    my $text = "test1 zzzzzzzzzzzzzzz test2 test3 test4 test5 zzzzzzzzzzzzzzz test6 test7 zzzzzzzzzzzzzzz test8 test9 test10";

    while ( $text =~ /([a-zA-Z0-9]+\s+[z]{15}\s+[a-z0-9]+)/g ) {
        print "string: " . $1 . "!\n";
    }

    Output:
    string: test1 zzzzzzzzzzzzzzz test2!
    string: test5 zzzzzzzzzzzzzzz test6!
    string: test7 zzzzzzzzzzzzzzz test8!
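
    To illustrate the \w suggestion, here is a minimal sketch of the same loop with \w+ in place of the character classes. The $z helper and the @found array are not in the original; they are only there to keep the sample string readable and to collect the results. Keep in mind that \w also accepts "_" and uppercase letters on both sides, so this is slightly looser than the original regex.

```perl
use strict;
use warnings;

# Hypothetical helper: build the run of 15 z's instead of typing it out.
my $z    = "z" x 15;
my $text = "test1 $z test2 test3 test4 test5 $z test6 test7 $z test8 test9 test10";

# \w+ replaces [a-zA-Z0-9]+ and [a-z0-9]+; /g lets the loop visit every match.
my @found;
while ( $text =~ /(\w+\s+z{15}\s+\w+)/g ) {
    push @found, $1;
    print "string: $1!\n";
}
```

    With the sample string above this should find the same three matches as the character-class version.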

Re: regex question
by Hue-Bond (Priest) on Jul 22, 2006 at 19:55 UTC

    You're missing the /g modifier. You have to change the if to a while too, in order to be able to match the same string more than one time:

    my $text = "test1 zzzzzzzzzzzzzzz test2 test3 test4 test5 zzzzzzzzzzzzzzz test6 test7 zzzzzzzzzzzzzzz test8 test9 test10";

    while ($text =~ /([a-zA-Z0-9]+\s+[z]{15}\s+[a-z0-9]+)/g) {
        print "\$1 <$1>\n";
    }
    __END__
    $1 <test1 zzzzzzzzzzzzzzz test2>
    $1 <test5 zzzzzzzzzzzzzzz test6>
    $1 <test7 zzzzzzzzzzzzzzz test8>
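
    If it helps to see why the while plus /g combination works, here is a rough sketch that prints pos() after each match. In scalar context /g remembers where the last match ended and resumes from there; once a match fails, the position is reset. The $z helper, @ends and $pos_after are names made up for this illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: build the run of 15 z's instead of typing it out.
my $z    = "z" x 15;
my $text = "test1 $z test2 test3 test4 test5 $z test6 test7 $z test8 test9 test10";

my @ends;
while ( $text =~ /([a-zA-Z0-9]+\s+z{15}\s+[a-z0-9]+)/g ) {
    # pos($text) is the offset where the next /g match attempt will start.
    push @ends, pos($text);
    print "matched <$1>, next search starts at offset ", pos($text), "\n";
}

# The failed match that ended the loop resets pos() to undef.
my $pos_after = pos($text);
print "after the loop pos() is ", ( defined $pos_after ? $pos_after : "undef" ), "\n";
```

    Without /g the match always starts at the beginning of the string, so a while loop would find the first match over and over.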

    --
    David Serrano

Re: regex question
by GrandFather (Saint) on Jul 22, 2006 at 20:48 UTC

    Or if you want to grab all the matches in an array you can:

    use strict;
    use warnings;

    my $text = "test1 zzzzzzzzzzzzzzz test2 test3 test4 test5 zzzzzzzzzzzzzzz test6 test7 zzzzzzzzzzzzzzz test8 test9 test10";

    my @matches = $text =~ /([a-zA-Z0-9]+\s+[z]{15}\s+[a-z0-9]+)/g;
    print "string: $_!\n" for @matches;

    Prints:

    string: test1 zzzzzzzzzzzzzzz test2!
    string: test5 zzzzzzzzzzzzzzz test6!
    string: test7 zzzzzzzzzzzzzzz test8!

    Note that in this case the regex is evaluated in list context, so it returns the list of captures as its result. bobf's version evaluates the regex in scalar context, which yields a success/fail result and (because of the /g switch) steps through successive matches until all matches are found. Note also that bobf's version retrieves the captured text from the capture variable ($1 in this case), whereas the list version gets the captures put into the list. Consider:

    ...
    my @cap = $text =~ /([a-zA-Z0-9]+)\s+[z]{15}\s+([a-z0-9]+)/g;

    while (@cap) {
        print "string: $cap[0] ... $cap[1]\n";
        splice @cap, 0, 2;
    }

    which prints:

    string: test1 ... test2
    string: test5 ... test6
    string: test7 ... test8

    Note the two capture groups in the regex now. To generate the output shown in this case bobf's print line would become:

    print "string: $1 ... $2\n";
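
    Put together, the scalar-context variant with two capture groups might look like the sketch below. The $z helper and the @pairs array are not part of the original code; they are only assumptions made to keep the example short and checkable.

```perl
use strict;
use warnings;

# Hypothetical helper: build the run of 15 z's instead of typing it out.
my $z    = "z" x 15;
my $text = "test1 $z test2 test3 test4 test5 $z test6 test7 $z test8 test9 test10";

# Two capture groups: $1 is the word before the z's, $2 the word after.
my @pairs;
while ( $text =~ /([a-zA-Z0-9]+)\s+z{15}\s+([a-z0-9]+)/g ) {
    push @pairs, [ $1, $2 ];
    print "string: $1 ... $2\n";
}
```

    Here each pass through the loop sees one match's captures in $1 and $2, so there is no need to splice a flat list apart afterwards.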

    DWIM is Perl's answer to Gödel
Re: regex question
by swampyankee (Parson) on Jul 22, 2006 at 20:13 UTC

    "If" doesn't loop, so when the test returns true, it prints $1. You'll need to replace the if with a looping construct or (my preference) use the regex in list context:

    my @text = ( $text =~ /([a-zA-Z0-9]+\s+[z]{15}\s+[a-z0-9]+)/g );

    if (@text) {
        print "first: " . shift(@text) . "\n" if (@text);
        print "second: " . shift(@text) . "\n" if (@text);
    }
    else {
        print "Search failed\n";
    }

    emc

    Login incorrect.
    Only perfect spellers may
    enter this system.

    Jason Axley