I think you might find it worthwhile to learn about /x at this point, as these regular expressions could certainly do with some commenting. /x isn't hard or scary at all. All you have to do is remember to escape the whitespace and the #s that you want matched literally. It makes regular expressions much easier to explain.
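To make that concrete, here is a minimal sketch of what /x changes. The patterns and the 'ticket #42' strings are made up purely for illustration and have nothing to do with your prompts; $forgot shows what happens when the space is left unescaped.

```perl
use strict;
use warnings;

# Without /x: the space and the # are ordinary literal characters.
my $plain = qr/\w+ #\d+/;

# With /x: unescaped whitespace and # comments are ignored,
# so the literal space and the literal # have to be escaped.
my $commented = qr/
    \w+     # one or more word characters
    \       # exactly 1 space, kept because it is escaped
    \#      # a literal hash sign, escaped so it does not start a comment
    \d+     # one or more digits
/x;

# With /x but with the space left unescaped: the space is dropped,
# so this behaves like \w+\#\d+ instead.
my $forgot = qr/\w+ \#\d+/x;

for my $str ('ticket #42', 'ticket#42') {
    my $p = $str =~ $plain     ? 'match' : 'no match';
    my $c = $str =~ $commented ? 'match' : 'no match';
    my $f = $str =~ $forgot    ? 'match' : 'no match';
    print "'$str': plain=$p, commented=$c, forgot=$f\n";
}
```

'ticket #42' is matched by $plain and $commented but not by $forgot, and 'ticket#42' is matched only by $forgot.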
Just to make things more confusing ;) I'm going to swap the order of these two expressions, so my first one will be the longer of the two. I'll work on the shorter (my second, your first), as I think that's the one you wanted to focus on.
To determine if your two regular expressions are sufficiently equivalent, we need to compare them.
This is the longer one:
/.* # Stuff
( # START capturing to $1
[\$\ \#\%>~] # Any single space, $, #, %, > or ~
| # OR
\[* # 0 or more [s
\w* # 0 or more word characters (a-zA-Z0-9_)
\@* # 0 or more @s
\-* # 0 or more -s
\w* # more word characters
\% # Exactly 1 %
\]* # 0 or more ]s
| # OR
\[*\w*\@*\-*\w*\#\]* # As above, but with a # instead of %
| # OR
\[*\w*\@*\-*\w*\$\]* # As above, but with a $
| # OR
\[*\w*\@*\-*\w*>\]* # As above, but with a >
| # OR
\\\[\\e\[0m\\\]\ \[0m # the sequence: \[\e[0m\] [0m
) # END of $1
\s? # 0 or 1 spaces
/x
and this the shorter:
/.* # Stuff
( # START capturing to $1
\[* # 0 or more [s
\w* # 0 or more word characters
\@* # 0 or more @s
\-* # 0 or more -s
\w* # more word characters
[\$\ \#\%>~] # exactly 1 space, $, #, %, > or ~
\] # exactly 1 ] (are you missing a * ?)
| # OR
\\\[\\e\[0m\\\]\ \[0m # the sequence: \[\e[0m\] [0m
) # END of $1
\s? # 0 or 1 spaces
/x
Now we need to consider what patterns will match one, but not the other... (I'm going to assume you are missing a * up there next to your ], if not, then these aren't very equivalent at all).
- Any 1 space, $, #, %, > or ~ will be matched by both.
- The escape sequence: \[\e[0m\] [0m is allowed by both.
- Each pattern: [w@-w$], [w@-w#], [w@-w%], [w@-w~] is allowed by both.
- [w@-w ] is (as you have shown) allowed by the second but not the first (this is easy to fix)
Like you, I can only spot this one significant difference between the two regular expressions (once you fix your typo).
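If you'd rather check this than take my word for it, a small harness along these lines might help. It's only a sketch: the sample strings and the capture() helper are invented, I've assumed the missing * after your ], and I've written \$ inside the character classes so that Perl can't read $\ as a variable. Since both expressions start with .* and aren't anchored, the difference shows up in what lands in $1 rather than in whether the match succeeds.

```perl
use strict;
use warnings;

# The longer expression (with \$ written inside the class).
my $longer = qr/.*
    (
        [\$\ \#\%>~]
      | \[*\w*\@*\-*\w*\%\]*
      | \[*\w*\@*\-*\w*\#\]*
      | \[*\w*\@*\-*\w*\$\]*
      | \[*\w*\@*\-*\w*>\]*
      | \\\[\\e\[0m\\\]\ \[0m
    )
    \s?
/x;

# The shorter expression, assuming the closing bracket was meant
# to be followed by a star.
my $shorter = qr/.*
    (
        \[*\w*\@*\-*\w*
        [\$\ \#\%>~]
        \]*
      | \\\[\\e\[0m\\\]\ \[0m
    )
    \s?
/x;

# Hypothetical helper: returns $1 on a match, undef otherwise.
sub capture {
    my ($re, $str) = @_;
    return $str =~ $re ? $1 : undef;
}

for my $prompt ('root#', '[user@host-box ]', 'abc') {
    for my $pair (['longer', $longer], ['shorter', $shorter]) {
        my ($name, $re) = @$pair;
        my $got = capture($re, $prompt);
        if (defined $got) {
            print "$name matched '$prompt', \$1 = '$got'\n";
        } else {
            print "$name did not match '$prompt'\n";
        }
    }
}
```

For '[user@host-box ]' the longer expression should capture just the space, while the shorter one captures the space and the ]. 'root#' gives '#' from both, and 'abc' matches neither.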
This is easily fixed:
/.* # Stuff
( # START capturing to $1
\ # exactly 1 space
| # OR
\[* # 0 or more [s
\w* # 0 or more word characters
\@* # 0 or more @s
\-* # 0 or more -s
\w* # more word characters
[\$\#\%>~] # exactly 1 of $, #, %, > or ~
\]* # 0 or more ]s
| # OR
\\\[\\e\[0m\\\]\ \[0m # the sequence: \[\e[0m\] [0m
) # END of $1
\s? # 0 or 1 spaces
/x
Note that this equivalence won't necessarily hold if you change your quantifiers, in particular if you change all of your *s to ?s. If you want my opinion, I suspect you're actually looking for a regular expression more like this:
/.* # Stuff
( # START capturing to $1
\ # exactly 1 space
| # OR
\[? # 0 or 1 [
\w* # 0 or more word characters
\@? # 0 or 1 @
[-\w.]* # 0 or more word chars, dots and hyphens, e.g. w-w.w-.w
[\$\#\%>~] # exactly 1 of $, #, %, > or ~
\]? # 0 or 1 ]
| # OR
\\\[\\e\[0m\\\]\ \[0m # the sequence: \[\e[0m\] [0m
) # END of $1
\s? # 0 or 1 spaces
/x
But I may be wrong - you may not be interested in the dot at all. ;) I'm not 100% certain that you want the .* at the front though. Do you have some sample data for us?
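On the .* question, here is a rough sketch of why I'm unsure about it. It takes the expression I suggested above (with \$ written inside the class so Perl can't take $\ for a variable) and tries it with and without the leading .*. The prompt string and the capture() helper are invented for the example, so treat the result as an illustration rather than a verdict on your data.

```perl
use strict;
use warnings;

# Body of the suggested expression, shared by both variants below.
my $body = qr/
    (
        \                       # exactly 1 space
      | \[? \w* \@? [-\w.]*     # optional bracket, user, optional at-sign, host
        [\$\#\%>~]              # exactly 1 prompt character
        \]?                     # optional closing bracket
      | \\\[\\e\[0m\\\]\ \[0m   # the literal escape sequence
    )
    \s?
/x;

my $with_prefix    = qr/.*$body/;   # as suggested, with the leading .*
my $without_prefix = $body;         # the same thing without it

# Hypothetical helper: returns $1 on a match, undef otherwise.
sub capture {
    my ($re, $str) = @_;
    return $str =~ $re ? $1 : undef;
}

my $prompt = 'user@my-host.example.com$ ';
printf "with .*:    '%s'\n", capture($with_prefix, $prompt);
printf "without .*: '%s'\n", capture($without_prefix, $prompt);
```

With the .* in place, the greedy match swallows the whole prompt and $1 ends up holding only the trailing space. Without it, $1 holds user@my-host.example.com$.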
I hope you recognise that both expressions will match any string with a space in it... which will be most strings.
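A quick way to see this for yourself, using compact versions of both original expressions (again assuming the extra * after the ], and with \$ written inside the class). The sample sentences are made up:

```perl
use strict;
use warnings;

# Compact forms of the two original expressions; whitespace is
# ignored under /x, so the escaped spaces are the only literal ones.
my $longer = qr/ .* ( [\$\ \#\%>~]
    | \[*\w*\@*\-*\w*\%\]* | \[*\w*\@*\-*\w*\#\]*
    | \[*\w*\@*\-*\w*\$\]* | \[*\w*\@*\-*\w*>\]*
    | \\\[\\e\[0m\\\]\ \[0m ) \s? /x;

my $shorter = qr/ .* ( \[*\w*\@*\-*\w* [\$\ \#\%>~] \]*
    | \\\[\\e\[0m\\\]\ \[0m ) \s? /x;

for my $text ('hello world', 'Just another Perl hacker,', 'helloworld') {
    my $l = $text =~ $longer  ? 'yes' : 'no';
    my $s = $text =~ $shorter ? 'yes' : 'no';
    print "'$text': longer=$l shorter=$s\n";
}
```

The two ordinary sentences match both expressions simply because they contain a space; only 'helloworld' is rejected.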
I hope this helps.
jarich