Before I demonstrate how to use YAPE::Regex::Explain, I need to point out a few mistakes you're consistently making.
.* is rarely necessary at the beginning of an RE. You probably don't need it as the first token of your REs unless you are later using $&, or unless you're wrapping it in parens and using a $1 (etc.) capturing variable. See Death to Dot Star for additional reading on this subject.
? is not a non-greedy substitute for *. On its own, ? means "zero or one". *? is the non-greedy zero-or-more quantifier.
The * quantifier will allow empty strings to match. In other words, "\w*" will match one, two, hundreds, thousands of word characters, but it will also match no characters at all. Is this what you want? Maybe you want the + quantifier instead.
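To make the quantifier points above concrete, here is a minimal sketch. The sample strings 'aaa' and '!!!' and the variable names are made up purely for illustration; they are not taken from your code.

```perl
use strict;
use warnings;

my $text = 'aaa';    # hypothetical sample input

# A bare ? means "zero or one", so a? can take at most one 'a'.
my ($optional) = $text =~ /^(a?)/;

# * is greedy zero-or-more; *? is the non-greedy form of the same quantifier.
my ($greedy) = $text =~ /^(a*)/;
my ($lazy)   = $text =~ /^(a*?)/;

# \w* succeeds even when there are no word characters at all;
# \w+ insists on at least one.
my $star_matches = ( '!!!' =~ /^\w*/ ) ? 1 : 0;
my $plus_matches = ( '!!!' =~ /^\w+/ ) ? 1 : 0;

printf "a?  captured '%s'\n", $optional;    # 'a'
printf "a*  captured '%s'\n", $greedy;      # 'aaa'
printf "a*? captured '%s'\n", $lazy;        # '' (empty string)
printf "\\w* matched '!!!': %d, \\w+ matched '!!!': %d\n",
    $star_matches, $plus_matches;           # 1 and 0
```

Note that the non-greedy a*? is happy with nothing at all when nothing after it forces it to take more, which is the same "empty match" trap as \w*.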
Ok, here we go again with the deciphering. This time I'm not going to do it by hand, but rather will demonstrate effective use of a great module:
use strict;
use warnings;
use YAPE::Regex::Explain;
#my $exp = YAPE::Regex::Explain->new($REx)->explain;
my $rex1 = qr/.*(\[*\w*\@*\-*\w*[$ #\%>~]\]|\\\[\\e\[0m\\\] \[0m)\s?/;
my $rex2 = qr/.*([$ #\%>~]|\[*\w*\@*\-*\w*\%\]*|\[*\w*\@*\-*\w*#\]*|\[*\w*\@*\-*\w*\$\]*|\[*\w*\@*\-*\w*>\]*|\\\[\\e\[0m\\\] \[0m)\s?/;
print YAPE::Regex::Explain->new($rex1)->explain;
print YAPE::Regex::Explain->new($rex2)->explain;
__OUTPUT__
The regular expression:
(?-imsx:.*(\[*\w*\@*\-*\w*[$ #%>~]\]|\\\[\\e\[0m\\\] \[0m)\s?)
matches as follows:
NODE EXPLANATION
----------------------------------------------------------------------
(?-imsx: group, but do not capture (case-sensitive)
(with ^ and $ matching normally) (with . not
matching \n) (matching whitespace and #
normally):
----------------------------------------------------------------------
.* any character except \n (0 or more times
(matching the most amount possible))
----------------------------------------------------------------------
( group and capture to \1:
----------------------------------------------------------------------
\[* '[' (0 or more times (matching the most
amount possible))
----------------------------------------------------------------------
\w* word characters (a-z, A-Z, 0-9, _) (0 or
more times (matching the most amount
possible))
----------------------------------------------------------------------
\@* '@' (0 or more times (matching the most
amount possible))
----------------------------------------------------------------------
\-* '-' (0 or more times (matching the most
amount possible))
----------------------------------------------------------------------
\w* word characters (a-z, A-Z, 0-9, _) (0 or
more times (matching the most amount
possible))
----------------------------------------------------------------------
[$ #%>~] any character of: '$', ' ', '#', '%',
'>', '~'
----------------------------------------------------------------------
\] ']'
----------------------------------------------------------------------
| OR
----------------------------------------------------------------------
\\ '\'
----------------------------------------------------------------------
\[ '['
----------------------------------------------------------------------
\\ '\'
----------------------------------------------------------------------
e 'e'
----------------------------------------------------------------------
\[ '['
----------------------------------------------------------------------
0m '0m'
----------------------------------------------------------------------
\\ '\'
----------------------------------------------------------------------
\] ']'
----------------------------------------------------------------------
' '
----------------------------------------------------------------------
\[ '['
----------------------------------------------------------------------
0m '0m'
----------------------------------------------------------------------
) end of \1
----------------------------------------------------------------------
\s? whitespace (\n, \r, \t, \f, and " ")
(optional (matching the most amount
possible))
----------------------------------------------------------------------
) end of grouping
----------------------------------------------------------------------
The regular expression:
(?-imsx:.*([$ #%>~]|\[*\w*\@*\-*\w*%\]*|\[*\w*\@*\-*\w*#\]*|\[*\w*\@*\-*\w*\$\]*|\[*\w*\@*\-*\w*>\]*|\\\[\\e\[0m\\\] \[0m)\s?)
matches as follows:
NODE EXPLANATION
----------------------------------------------------------------------
(?-imsx: group, but do not capture (case-sensitive)
(with ^ and $ matching normally) (with . not
matching \n) (matching whitespace and #
normally):
----------------------------------------------------------------------
.* any character except \n (0 or more times
(matching the most amount possible))
----------------------------------------------------------------------
( group and capture to \1:
----------------------------------------------------------------------
[$ #%>~] any character of: '$', ' ', '#', '%',
'>', '~'
----------------------------------------------------------------------
| OR
----------------------------------------------------------------------
\[* '[' (0 or more times (matching the most
amount possible))
----------------------------------------------------------------------
\w* word characters (a-z, A-Z, 0-9, _) (0 or
more times (matching the most amount
possible))
----------------------------------------------------------------------
\@* '@' (0 or more times (matching the most
amount possible))
----------------------------------------------------------------------
\-* '-' (0 or more times (matching the most
amount possible))
----------------------------------------------------------------------
\w* word characters (a-z, A-Z, 0-9, _) (0 or
more times (matching the most amount
possible))
----------------------------------------------------------------------
% '%'
----------------------------------------------------------------------
\]* ']' (0 or more times (matching the most
amount possible))
----------------------------------------------------------------------
| OR
----------------------------------------------------------------------
\[* '[' (0 or more times (matching the most
amount possible))
----------------------------------------------------------------------
\w* word characters (a-z, A-Z, 0-9, _) (0 or
more times (matching the most amount
possible))
----------------------------------------------------------------------
\@* '@' (0 or more times (matching the most
amount possible))
----------------------------------------------------------------------
\-* '-' (0 or more times (matching the most
amount possible))
----------------------------------------------------------------------
\w* word characters (a-z, A-Z, 0-9, _) (0 or
more times (matching the most amount
possible))
----------------------------------------------------------------------
# '#'
----------------------------------------------------------------------
\]* ']' (0 or more times (matching the most
amount possible))
----------------------------------------------------------------------
| OR
----------------------------------------------------------------------
\[* '[' (0 or more times (matching the most
amount possible))
----------------------------------------------------------------------
\w* word characters (a-z, A-Z, 0-9, _) (0 or
more times (matching the most amount
possible))
----------------------------------------------------------------------
\@* '@' (0 or more times (matching the most
amount possible))
----------------------------------------------------------------------
\-* '-' (0 or more times (matching the most
amount possible))
----------------------------------------------------------------------
\w* word characters (a-z, A-Z, 0-9, _) (0 or
more times (matching the most amount
possible))
----------------------------------------------------------------------
\$ '$'
----------------------------------------------------------------------
\]* ']' (0 or more times (matching the most
amount possible))
----------------------------------------------------------------------
| OR
----------------------------------------------------------------------
\[* '[' (0 or more times (matching the most
amount possible))
----------------------------------------------------------------------
\w* word characters (a-z, A-Z, 0-9, _) (0 or
more times (matching the most amount
possible))
----------------------------------------------------------------------
\@* '@' (0 or more times (matching the most
amount possible))
----------------------------------------------------------------------
\-* '-' (0 or more times (matching the most
amount possible))
----------------------------------------------------------------------
\w* word characters (a-z, A-Z, 0-9, _) (0 or
more times (matching the most amount
possible))
----------------------------------------------------------------------
> '>'
----------------------------------------------------------------------
\]* ']' (0 or more times (matching the most
amount possible))
----------------------------------------------------------------------
| OR
----------------------------------------------------------------------
\\ '\'
----------------------------------------------------------------------
\[ '['
----------------------------------------------------------------------
\\ '\'
----------------------------------------------------------------------
e 'e'
----------------------------------------------------------------------
\[ '['
----------------------------------------------------------------------
0m '0m'
----------------------------------------------------------------------
\\ '\'
----------------------------------------------------------------------
\] ']'
----------------------------------------------------------------------
' '
----------------------------------------------------------------------
\[ '['
----------------------------------------------------------------------
0m '0m'
----------------------------------------------------------------------
) end of \1
----------------------------------------------------------------------
\s? whitespace (\n, \r, \t, \f, and " ")
(optional (matching the most amount
possible))
----------------------------------------------------------------------
) end of grouping
----------------------------------------------------------------------
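It doesn't look to me like they're completely equivalent. The first alternative of the first regex requires a closing ']' right after the prompt character, while the second regex will accept a bare prompt character such as '$' with no bracket at all.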
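A quick way to check this is to run both patterns against the same input. The sketch below copies the two regexes from above; the two sample prompt strings are hypothetical, picked only to show one case where the patterns disagree and one where they agree.

```perl
use strict;
use warnings;

# The two patterns under discussion, copied from the code above.
my $rex1 = qr/.*(\[*\w*\@*\-*\w*[$ #\%>~]\]|\\\[\\e\[0m\\\] \[0m)\s?/;
my $rex2 = qr/.*([$ #\%>~]|\[*\w*\@*\-*\w*\%\]*|\[*\w*\@*\-*\w*#\]*|\[*\w*\@*\-*\w*\$\]*|\[*\w*\@*\-*\w*>\]*|\\\[\\e\[0m\\\] \[0m)\s?/;

# Hypothetical prompt strings, for illustration only.
my @samples = ( 'user@host$ ', '[user@host$]' );

my %result;
for my $s (@samples) {
    # Record whether each pattern matches this sample at all.
    $result{$s} = {
        rex1 => ( $s =~ $rex1 ? 1 : 0 ),
        rex2 => ( $s =~ $rex2 ? 1 : 0 ),
    };
    printf "%-14s rex1: %-8s rex2: %s\n", "'$s'",
        ( $result{$s}{rex1} ? 'match' : 'no match' ),
        ( $result{$s}{rex2} ? 'match' : 'no match' );
}
```

The unbracketed prompt has no ']' after the '$', so only the second regex matches it; the bracketed one satisfies both. One disagreeing input is enough to show the patterns aren't interchangeable.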
Also, please take an hour or so and read through perlrequick, perlretut and perlre. Until you've devoured those PODs you're going to be grasping at straws with regular expressions. If you really want to learn them inside and out, beg, buy, or borrow (but don't steal) the Owl book, Mastering Regular Expressions by Jeffrey Friedl. It's an O'Reilly book, and probably the best book ever written on regexps.
Updated: Added link, suggested by broquaint.