note
jarich
I think you might find it worthwhile to learn about /x at this point, as these regular expressions could certainly do with some commenting. /x isn't hard or scary at all. All you have to do is remember to escape the whitespace you want and the #s. It makes regular expressions much easier to explain.
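<p>
As a minimal sketch of what that escaping looks like (the pattern and the sample strings here are made up for illustration, they are not from your post):

```perl
use strict;
use warnings;

# Without /x: compact, but there is nowhere to put a comment.
my $plain = qr/^\w+ \#\d+$/;

# With /x: unescaped whitespace and comments are ignored, so the
# literal space and the literal hash sign must both be escaped.
my $commented = qr/
    ^
    \w+     # one or more word characters
    \       # exactly 1 literal space (escaped)
    \#      # a literal hash sign (escaped)
    \d+     # one or more digits
    $
/x;

for my $string ('ticket #42', 'ticket#42') {
    printf "%-12s plain=%s commented=%s\n",
        $string,
        ($string =~ $plain     ? 'match' : 'no match'),
        ($string =~ $commented ? 'match' : 'no match');
}
```

Both forms should agree on every input; the /x version just leaves room for the explanation.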
<p>
Just to make things more confusing ;) I'm going to swap the order of these two expressions, so my first one will be the longer of the two, and I'll work on the shorter (my second, your first) as I think that's the one you wanted to focus on.
<p>
To determine whether your two regular expressions are sufficiently equivalent, we need to compare them.
<readmore>
This is the longer one:
<p>
<code>
/.* # Stuff
( # START capturing to $1
[$\ \#\%>~] # Any single space, $, #, %, > or ~
| # OR
\[* # 0 or more [s
\w* # 0 or more word characters (a-zA-Z0-9_)
\@* # 0 or more @s
\-* # 0 or more -s
\w* # more word characters
\% # Exactly 1 %
\]* # 0 or more ]s
| # OR
\[*\w*\@*\-*\w*\#\]* # As above, but with a # instead of %
| # OR
\[*\w*\@*\-*\w*\$\]* # As above, but with a $
| # OR
\[*\w*\@*\-*\w*>\]* # As above, but with a >
| # OR
\\\[\\e\[0m\\\]\ \[0m # the sequence: \[\e[0m\] [0m
) # END of $1
\s? # 0 or 1 spaces
/x
</code>
and this is the shorter:
<code>
/.* # Stuff
( # START capturing to $1
\[* # 0 or more [s
\w* # 0 or more word characters
\@* # 0 or more @s
\-* # 0 or more -s
\w* # more word characters
[$\ \#\%>~] # exactly 1 space, $, #, %, > or ~
\] # exactly 1 ] (are you missing a * ?)
| # OR
\\\[\\e\[0m\\\]\ \[0m # the sequence: \[\e[0m\] [0m
) # END of $1
\s? # 0 or 1 spaces
/x
</code>
</readmore>
Now we need to consider what patterns will match one but not the other. (I'm going to assume you are missing a * up there next to your ]; if not, then these aren't equivalent at all.)
<ul>
<li>Any 1 space, $, #, %, > or ~ will be matched by both.</li>
<li>The escape sequence: <code>\[\e[0m\] [0m</code> is allowed by both.</li>
<li>Each pattern: <code>[w@-w$], [w@-w#], [w@-w%], [w@-w>]</code> is allowed by both.</li>
<li><code>[w@-w ]</code> is (as you have shown) allowed by the second but not the first (this is easy to fix).</li>
</ul>
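<p>
If you'd rather check cases like these mechanically than by eye, here is a minimal sketch. It assumes the shorter expression's final \] was meant to be \]*, it drops the leading .* and trailing \s?, and it anchors each alternation with \A and \z so that a sample only counts when the alternation consumes all of it. The <code>whole_match</code> helper and the sample strings are my own inventions, not anything from your post.

```perl
use strict;
use warnings;

# The alternation from the longer expression.
my $longer_alt = qr/
      [\$\ \#%>~]
    | \[*\w*\@*-*\w*%\]*
    | \[*\w*\@*-*\w*\#\]*
    | \[*\w*\@*-*\w*\$\]*
    | \[*\w*\@*-*\w*>\]*
    | \\\[\\e\[0m\\\]\ \[0m
/x;

# The alternation from the shorter expression (with the assumed fix).
my $shorter_alt = qr/
      \[*\w*\@*-*\w*[\$\ \#%>~]\]*
    | \\\[\\e\[0m\\\]\ \[0m
/x;

# Returns 1 if the alternation matches the entire string, else 0.
sub whole_match {
    my ($alt, $string) = @_;
    return $string =~ /\A(?:$alt)\z/ ? 1 : 0;
}

my @samples = (
    ' ', '$', '[w@-w$]', '[w@-w#]', '[w@-w%]', '[w@-w>]', '[w@-w ]',
    '\[\e[0m\] [0m',
);

for my $sample (@samples) {
    my $long  = whole_match($longer_alt,  $sample);
    my $short = whole_match($shorter_alt, $sample);
    printf "%-18s longer=%d shorter=%d%s\n",
        "'${sample}'", $long, $short,
        ($long == $short ? '' : '  <-- differs');
}
```

Adding more strings to <code>@samples</code> is a cheap way to hunt for further differences.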
<p>
Like you, I can only spot this one significant difference between the two regular expressions (once you fix your typo).
<readmore>
This is easily fixed:
<code>
/.* # Stuff
( # START capturing to $1
\ # exactly 1 space
| # OR
\[* # 0 or more [s
\w* # 0 or more word characters
\@* # 0 or more @s
\-* # 0 or more -s
\w* # more word characters
[$\#\%>~] # exactly 1 of $, #, %, > or ~
\]* # 0 or more ]s
| # OR
\\\[\\e\[0m\\\]\ \[0m # the sequence: \[\e[0m\] [0m
) # END of $1
\s? # 0 or 1 spaces
/x
</code>
</readmore>
Note that this equivalence won't necessarily hold if you change your quantifiers, in particular if you change all of your *s to ?s. If you want my opinion, I suspect you're actually looking for a regular expression more like this:
<code>
/.* # Stuff
( # START capturing to $1
\ # exactly 1 space
| # OR
\[? # 0 or 1 [
\w* # 0 or more word characters
\@? # 0 or 1 @
[-\w.]* # 0 or more word chars, dots and hyphens eg w-w.w-.w
[$\#\%>~] # exactly 1 of $, #, %, > or ~
\]? # 0 or 1 ]
| # OR
\\\[\\e\[0m\\\]\ \[0m # the sequence: \[\e[0m\] [0m
) # END of $1
\s? # 0 or 1 spaces
/x
</code>
<p>
But I may be wrong - you may not be interested in the dot at all. ;) I'm not 100% certain that you want the .* at the front though. Do you have some sample data for us?
<p>
I hope you recognise that both expressions will match any string with a single space in it... which will be most strings....
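<p>
To see that in action, here is a rough sketch that runs both expressions, unanchored and complete with the leading .* and trailing \s?, over a couple of ordinary strings. As before it assumes the shorter one's \] should be \]*; the <code>captured</code> helper and the sample strings are hypothetical.

```perl
use strict;
use warnings;

my $longer = qr/
    .*
    (
        [\$\ \#%>~]
      | \[*\w*\@*-*\w*%\]*
      | \[*\w*\@*-*\w*\#\]*
      | \[*\w*\@*-*\w*\$\]*
      | \[*\w*\@*-*\w*>\]*
      | \\\[\\e\[0m\\\]\ \[0m
    )
    \s?
/x;

my $shorter = qr/
    .*
    (
        \[*\w*\@*-*\w*[\$\ \#%>~]\]*
      | \\\[\\e\[0m\\\]\ \[0m
    )
    \s?
/x;

# Returns the text of the first capture group, or undef on no match.
sub captured {
    my ($re, $string) = @_;
    return $string =~ $re ? $1 : undef;
}

for my $string ('hello world', 'no_spaces_here') {
    for my $pair ([longer => $longer], [shorter => $shorter]) {
        my ($name, $re) = @$pair;
        my $got = captured($re, $string);
        printf "%-16s %-8s %s\n", $string, $name,
            (defined $got ? "matches, captured '${got}'" : 'no match');
    }
}
```

An ordinary sentence matches simply because it contains a space, and the greedy .* leaves only that lone space for the capture group.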
<p>
I hope this helps.
<p>
jarich