Hello bsherkhane,
The escape sequences \1, \2, \3, etc., are backreferences: used inside a regex, they match the text already captured by the corresponding group of that same regex. The special variables $1, $2, $3, etc., refer to the same captures, but from outside the regex: they hold the text captured by the most recent successful match. $1 refers to the first capture, $2 to the second capture, and so on. Captures are numbered by counting left parentheses from the left. See perlre#Capture-groups.
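As a minimal sketch of the distinction (the sample strings and variable names here are made up purely for illustration), the following shows \1 used inside a regex, $1 and $2 read after the match, and how nested groups are numbered by their left parentheses:

```perl
#! perl
use strict;
use warnings;

# Hypothetical sample string, chosen only for illustration.
my $text = 'abc abc 42';
my ($word, $number);

# \1 is used inside the regex: it must match the same text that
# capture group 1 matched earlier in this same match attempt.
if ($text =~ / (\w+) \s \1 \s (\d+) /x) {
    # $1 and $2 are read after the match has succeeded.
    ($word, $number) = ($1, $2);
    print "word: $word, number: $number\n";
}

# Nested groups are numbered by the position of their left parenthesis:
# the outer group opens first, so it is capture 1; the inner is capture 2.
my ($outer, $inner);
if ('2024-05' =~ / ( (\d+) - \d+ ) /x) {
    ($outer, $inner) = ($1, $2);
    print "outer: $outer, inner: $inner\n";
}
```

Copying $1 and $2 into named variables straight after the match is a common habit, since a later successful match overwrites them.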
The module YAPE::Regex::Explain is a useful tool for understanding regular expressions. Here is the explanation it gives for the left-hand side (i.e., the regex part) of the substitution in question:
#! perl
use strict;
use warnings;
use YAPE::Regex::Explain;

print YAPE::Regex::Explain->new
(
    qr{ \b ( (\d+) \s \S+ ) (.*?) \s \2 \s (\S+) }x
)->explain();
Output:
17:26 >perl 1526_SoPW.pl
The regular expression:

(?x-ims: \b ( (\d+) \s \S+ ) (.*?) \s \2 \s (\S+) )

matches as follows:

NODE                     EXPLANATION
----------------------------------------------------------------------
(?x-ims:                 group, but do not capture (disregarding
                         whitespace and comments) (case-sensitive)
                         (with ^ and $ matching normally) (with . not
                         matching \n):
----------------------------------------------------------------------
  \b                       the boundary between a word char (\w) and
                           something that is not a word char
----------------------------------------------------------------------
  (                        group and capture to \1:
----------------------------------------------------------------------
    (                        group and capture to \2:
----------------------------------------------------------------------
      \d+                      digits (0-9) (1 or more times
                               (matching the most amount possible))
----------------------------------------------------------------------
    )                        end of \2
----------------------------------------------------------------------
    \s                       whitespace (\n, \r, \t, \f, and " ")
----------------------------------------------------------------------
    \S+                      non-whitespace (all but \n, \r, \t, \f,
                             and " ") (1 or more times (matching the
                             most amount possible))
----------------------------------------------------------------------
  )                        end of \1
----------------------------------------------------------------------
  (                        group and capture to \3:
----------------------------------------------------------------------
    .*?                      any character except \n (0 or more times
                             (matching the least amount possible))
----------------------------------------------------------------------
  )                        end of \3
----------------------------------------------------------------------
  \s                       whitespace (\n, \r, \t, \f, and " ")
----------------------------------------------------------------------
  \2                       what was matched by capture \2
----------------------------------------------------------------------
  \s                       whitespace (\n, \r, \t, \f, and " ")
----------------------------------------------------------------------
  (                        group and capture to \4:
----------------------------------------------------------------------
    \S+                      non-whitespace (all but \n, \r, \t, \f,
                             and " ") (1 or more times (matching the
                             most amount possible))
----------------------------------------------------------------------
  )                        end of \4
----------------------------------------------------------------------
)                        end of grouping
----------------------------------------------------------------------
17:26 >
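To see those four captures on real text, here is a small sketch that applies the same regex to an input line. The input string is invented for illustration and is not taken from the original data, and the replacement side of the substitution is deliberately left out:

```perl
#! perl
use strict;
use warnings;

# Hypothetical input, made up for illustration only.
my $line = '10 apples and 10 oranges';

# Same regex as in the explanation; \2 forces the second number
# in the string to equal the one captured by group 2.
my $re = qr{ \b ( (\d+) \s \S+ ) (.*?) \s \2 \s (\S+) }x;

# In list context a successful match returns ($1, $2, $3, $4).
my @captures = $line =~ $re;

if (@captures) {
    printf "\$%d = '%s'\n", $_ + 1, $captures[$_] for 0 .. $#captures;
}
else {
    print "no match\n";
}
```

With this input, $1 is '10 apples', $2 is '10', $3 is ' and' (including the leading space) and $4 is 'oranges'. Changing the second number to, say, 11 makes the match fail, because \2 can then no longer match.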
Hope that helps,
| Athanasius <°(((>< contra mundum | Iustus alius egestas vitae, eros Piratica, |
In reply to Re^2: print identical keys once along with their values
by Athanasius
in thread print identical keys once along with their values
by bsherkhane