in reply to Re^2: Regular expression Problem
in thread Regular expression Problem
    #!/usr/bin/perl
    use strict;
    use warnings;
    use YAPE::Regex::Explain;

    my $re = qr/(%TABLE{.*?name\=\"History[^"]*"[^}]*}%\s*(\|[^\|]*){3}\|\s)((\|[^\|]*){3}\|\s)*/o;
    print YAPE::Regex::Explain->new($re)->explain();

outputs

    The regular expression:

    (?-imsx:(%TABLE{.*?name="History[^"]*"[^}]*}%\s*(\|[^\|]*){3}\|\s)((\|[^\|]*){3}\|\s)*)

    matches as follows:

    NODE                     EXPLANATION
    ----------------------------------------------------------------------
    (?-imsx:                 group, but do not capture (case-sensitive)
                             (with ^ and $ matching normally) (with . not
                             matching \n) (matching whitespace and #
                             normally):
    ----------------------------------------------------------------------
      (                        group and capture to \1:
    ----------------------------------------------------------------------
        %TABLE{                  '%TABLE{'
    ----------------------------------------------------------------------
        .*?                      any character except \n (0 or more times
                                 (matching the least amount possible))
    ----------------------------------------------------------------------
        name="History            'name="History'
    ----------------------------------------------------------------------
        [^"]*                    any character except: '"' (0 or more
                                 times (matching the most amount
                                 possible))
    ----------------------------------------------------------------------
        "                        '"'
    ----------------------------------------------------------------------
        [^}]*                    any character except: '}' (0 or more
                                 times (matching the most amount
                                 possible))
    ----------------------------------------------------------------------
        }%                       '}%'
    ----------------------------------------------------------------------
        \s*                      whitespace (\n, \r, \t, \f, and " ") (0
                                 or more times (matching the most amount
                                 possible))
    ----------------------------------------------------------------------
        (                        group and capture to \2 (3 times):
    ----------------------------------------------------------------------
          \|                       '|'
    ----------------------------------------------------------------------
          [^\|]*                   any character except: '\|' (0 or more
                                   times (matching the most amount
                                   possible))
    ----------------------------------------------------------------------
        ){3}                     end of \2 (NOTE: because you are using a
                                 quantifier on this capture, only the LAST
                                 repetition of the captured pattern will
                                 be stored in \2)
    ----------------------------------------------------------------------
        \|                       '|'
    ----------------------------------------------------------------------
        \s                       whitespace (\n, \r, \t, \f, and " ")
    ----------------------------------------------------------------------
      )                        end of \1
    ----------------------------------------------------------------------
      (                        group and capture to \3 (0 or more times
                               (matching the most amount possible)):
    ----------------------------------------------------------------------
        (                        group and capture to \4 (3 times):
    ----------------------------------------------------------------------
          \|                       '|'
    ----------------------------------------------------------------------
          [^\|]*                   any character except: '\|' (0 or more
                                   times (matching the most amount
                                   possible))
    ----------------------------------------------------------------------
        ){3}                     end of \4 (NOTE: because you are using a
                                 quantifier on this capture, only the LAST
                                 repetition of the captured pattern will
                                 be stored in \4)
    ----------------------------------------------------------------------
        \|                       '|'
    ----------------------------------------------------------------------
        \s                       whitespace (\n, \r, \t, \f, and " ")
    ----------------------------------------------------------------------
      )*                       end of \3 (NOTE: because you are using a
                               quantifier on this capture, only the LAST
                               repetition of the captured pattern will be
                               stored in \3)
    ----------------------------------------------------------------------
    )                        end of grouping
    ----------------------------------------------------------------------

The match and the substitution have identical bodies, so they will match the same thing. Let us know if the above is unclear.
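The NOTE about quantified captures in the explanation can be seen on a small example. This is only a sketch: the sample row and the variable names are made up for illustration and do not come from the original question.

```perl
use strict;
use warnings;

# Hypothetical sample row, not taken from the thread.
my $row = "|a|b|c|\n";

# Quantifier on the capture itself: only the LAST of the three
# repetitions is kept in the capture.
my ($last) = $row =~ /(\|[^\|]*){3}\|\s/;

# Quantifier on a non-capturing group, with a single capture wrapped
# around the whole run: all three cells are kept.
my ($all) = $row =~ /((?:\|[^\|]*){3})\|\s/;

print "last: $last\n";   # last: |c
print "all:  $all\n";    # all:  |a|b|c
```

If you need every cell rather than just the last one, the second form (or a split on the matched row) is usually what you want.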
As JavaFan points out below, you should likely omit the o and e modifiers. The discussion between bart and AnomalousMonk below is also very useful: you can likely get wholly equivalent behavior in your code with only one instance of this long and fragile regex.
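One way to keep a single copy of the pattern is to compile it once with qr// and reuse that object in both the match and the substitution. The following is a minimal sketch under assumptions: $history_re, $text, $found and $trimmed are hypothetical names, the sample text is invented, and the replacement (keep the header, drop the data rows) is only a stand-in for whatever your real substitution does. The braces are escaped here because newer perls reject some unescaped literal braces in patterns.

```perl
use strict;
use warnings;

# The pattern from the post, compiled once. qr// already gives you a
# precompiled regex object, so no /o modifier is used.
my $history_re = qr/(%TABLE\{.*?name="History[^"]*"[^}]*\}%\s*(\|[^|]*){3}\|\s)((\|[^|]*){3}\|\s)*/;

# Hypothetical sample text, not taken from the thread.
my $text = '%TABLE{ name="History" }%' . "\n"
         . "| Date | Who | What |\n"
         . "| 2011 | me | edit |\n";

# The same object drives the test ...
my $found = $text =~ $history_re ? 1 : 0;

# ... and the substitution. The replacement is a plain string, so no
# /e modifier is needed.
(my $trimmed = $text) =~ s/$history_re/$1/;

print "matched\n" if $found;
print $trimmed;
```

Any later change to the pattern then only has to be made in one place.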
Re^4: Regular expression Problem
by archimca (Novice) on Jan 18, 2011 at 14:40 UTC
by kennethk (Abbot) on Jan 18, 2011 at 16:00 UTC