First off, I think you've mistyped that code. I'm pretty sure that both of those regexes would error out. Assuming that they should actually have the first unmatched "(" removed, the regexes don't match the explanation next to them, so assuming you've copied them verbatim it's not hard to see why you are confused. :-) Let's look at them again (ignoring whitespace in the description):
    /(fred|wilma) (flintstone) \2/   # match "fred" or "wilma" and put them in bucket 1
                                     # followed by "flintstone" and put that in bucket 2
                                     # then match whatever is in bucket 2. Which in this
                                     # case must be "flintstone" so the regex is
                                     # equivalent to
                                     # /(fred|wilma) (flintstone) flintstone/

    /(fred|wilma) (flintstone) \1/   # match "fred" or "wilma" and put them in bucket 1
                                     # followed by "flintstone" and put that in bucket 2
                                     # then match whatever is in bucket 1. Which in this
                                     # case could be "fred" or "wilma" so the regex is
                                     # equivalent to one of the following:
                                     # /(fred) (flintstone) fred/
                                     # /(wilma) (flintstone) wilma/
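If you want to convince yourself of the difference, here is a minimal sketch that runs both regexes over a few sample strings. The strings and the %result hash are made up for illustration; only the two patterns come from the discussion above.

```perl
use strict;
use warnings;

# Sample strings, invented for this illustration.
my @tests = (
    'fred flintstone flintstone',
    'fred flintstone fred',
    'wilma flintstone wilma',
    'wilma flintstone fred',
);

# %result is a hypothetical helper: it records which pattern matched which string.
my %result;
for my $s (@tests) {
    $result{$s} = {
        with_2 => ($s =~ /(fred|wilma) (flintstone) \2/ ? 1 : 0),   # repeat bucket 2
        with_1 => ($s =~ /(fred|wilma) (flintstone) \1/ ? 1 : 0),   # repeat bucket 1
    };
    printf "%-28s \\2: %-3s \\1: %s\n", $s,
        $result{$s}{with_2} ? 'yes' : 'no',
        $result{$s}{with_1} ? 'yes' : 'no';
}
```

Only the first string should match the \2 version, and only the second and third should match the \1 version. The last string matches neither, because \1 has to repeat the same name that bucket 1 captured.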
In chapter 10 it goes back into memory variables using $1..whatever which I understand, but it no longer uses the numbers in \1/ or \2/ as it did earlier. Why is that?
The thing to remember about backreferences is that, unlike using a capture after the match has occurred, the contents of the capture are used as part of the pattern and are evaluated before the entire regex has completed. This means we can match AXAZ or BXBZ with /(A|B)X\1Z/ instead of /(AXA|BXB)Z/. $1, on the other hand, is the text captured by the last successful match, not a capture from this match.
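As a rough check of the AXAZ/BXBZ claim, the following sketch compares the backreference form with the spelled-out form on four inputs. The input list and the variable names are my own; the two patterns are the ones just mentioned.

```perl
use strict;
use warnings;

my $backref = qr/(A|B)X\1Z/;      # \1 must repeat whatever (A|B) matched
my $spelled = qr/(AXA|BXB)Z/;     # the same thing written out by hand

# %seen is a hypothetical helper recording the result of each pattern.
my %seen;
for my $s (qw(AXAZ BXBZ AXBZ BXAZ)) {
    my $via_backref = $s =~ $backref ? 1 : 0;
    my $via_spelled = $s =~ $spelled ? 1 : 0;
    $seen{$s} = [ $via_backref, $via_spelled ];
    print "$s: backref=$via_backref spelled=$via_spelled\n";
}
```

Both patterns should accept AXAZ and BXBZ and reject the mixed strings AXBZ and BXAZ.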
    #!perl -l
    print $_="the thing that that thing does";
    /(\w+) that/    and printf '$&=%-20s $1=%-10s %s',$&,$1,$/;
    /($1) (\w+)/    and printf '$&=%-20s $1=%-10s $2=%-10s %s',$&,$1,$2,$/;
    /$1 (\w+) (\1)/ and printf '$&=%-20s $1=%-10s $2=%-10s %s',$&,$1,$2,$/;
    __END__
    the thing that that thing does
    $&=thing that           $1=thing
    $&=thing that           $1=thing      $2=that
    $&=thing that that      $1=that       $2=that
As we can see, first we do a "normal" match. We grab the word in front of "that" and put it in $1. We then match again, but now we are going to find the word following what is now in $1 ("thing"), and just to be crafty we put whatever matches $1 into capture bucket 1, while the word that follows it goes into bucket 2. We then match a third time: this time we match the contents of $1 followed by a word which we capture into bucket 1, and then we match whatever is in bucket 1 again and capture it into bucket 2. So you can use $1 and \1 in the same regex and they mean _very_ different things.
So a capture from the previous match can be used in a new match. But we also need to be able to talk about captures from this match and use them in the pattern as well. So basically \1 and friends are captures from this match, available before the entire match has completed. (The match may still fail later on in the pattern, but by that point \1 means whatever _might_ have been captured had the overall pattern succeeded.)
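To make that last point a bit more concrete, here is a small sketch using the same sentence as above. The variable names and the " nothing" suffix are my own, chosen so that the second pattern is guaranteed to fail after \1 has already been tried.

```perl
use strict;
use warnings;

my $text = "the thing that that thing does";

# A plain successful match first, so that $1 has a known value ("the").
$text =~ /(\w+)/ or die "unexpected: no word found";
my $before = $1;

# This attempt fails overall. While it was being tried, \1 took on tentative
# values (including "that"), but no capture from it survives the failure.
my $matched       = $text =~ /\b(\w+) \1 nothing/ ? 1 : 0;
my $after_failure = $1;    # still the capture from the last successful match

# Here \1 refers to group 1 of this very match: it finds the doubled word.
my ($doubled) = $text =~ /\b(\w+) \1\b/;

print "before=$before matched=$matched after_failure=$after_failure doubled=$doubled\n";
```

The failed match leaves $1 alone, which is why $after_failure should still be "the", while the final pattern picks up "that" through \1 without any help from an earlier match.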
HTH
In reply to Re: memory variables (chp. 9 & 10 Learning Perl 3)
by demerphq
in thread memory variables (chp. 9 & 10 Learning Perl 3)
by sulfericacid