http://qs1969.pair.com?node_id=326526


in reply to regex logical equivalence?

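Replacing all the * with ? is not necessarily what you want to do. For example, \d* will match 1234567, 123, 1, etc., consuming every digit it finds. \d? matches at most one digit, so against 1234567 it will only match the leading 1. Both will also match nothing at all: * means "zero or more" and ? means "zero or one" (i.e. optional).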
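To make the difference concrete, here is a small sketch. It uses Python's re module rather than Perl, purely as an illustration; the quantifiers behave the same way for this case, and the sample strings are made up.

```python
import re

text = "1234567"

# \d* is greedy: it consumes every consecutive digit it can.
star = re.match(r"\d*", text).group()

# \d? consumes at most one digit.
optional = re.match(r"\d?", text).group()

# Both quantifiers allow zero digits, so both still "succeed"
# (with an empty match) on input that starts with no digit at all.
star_empty = re.match(r"\d*", "abc").group()
optional_empty = re.match(r"\d?", "abc").group()

print(repr(star), repr(optional), repr(star_empty), repr(optional_empty))
```

The last two results are the part that tends to bite: a pattern built only from * and ? pieces can match successfully without consuming anything.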

If you are talking about clarity and maintenance, I would go for the first one because it is more concise. However, I would encourage you to rewrite it. The second one has way too many optional pieces and will require a LOT of backtracking if it doesn't match on the first attempt.

While I don't know your exact setup, I can try to help guide you in rewriting this horrible mess of a regex. First, do you need to match multiple square brackets at the beginning? Odds are you only need one. Is it optional? If it's like my prompt, there is exactly one and it's always there, so I could just put a \ in front of it (without the *). If it is optional, you could follow it with "?".

Now to \w*: can it be nothing? If not, it's probably better to use \w+ (one or more word characters). As for \@*, can you have @@@@@? If not, I'd use \@?.

What might be better at the beginning would be something like /[\[\w\@\-]+[$\s#\%>~]\]/. Other things to think about are spaces and other characters that may be in the prompt. Also, if you are not capturing the prompt, you can try replacing the terminal (ANSI) escape sequences with // (nothing) before you parse it, to help clean it up a bit.
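Here is a rough sketch of that advice, again in Python's re module for illustration only. The prompt string, the two patterns and the escape-stripping regex are all my own assumptions, not your actual regex or prompt, so adjust them to your setup. (In Perl you would write \@ inside the pattern to avoid array interpolation; Python doesn't need the backslash.)

```python
import re

# Hypothetical prompt wrapped in color codes (an assumption, not the OP's prompt).
raw = "\x1b[1;32m[user@host]$ \x1b[0m"

# Strip CSI-style escape sequences (ESC [ parameters letter) before matching.
# This covers common color codes only, not every possible terminal sequence.
ANSI_CSI = re.compile(r"\x1b\[[0-9;]*[A-Za-z]")
clean = ANSI_CSI.sub("", raw)

# "Everything is optional" style: every piece uses *.
loose = re.compile(r"^\[*\w*@*\w*\]*[$#%>~]\s*$")

# Tighter style: at most one bracket, at least one word character, at most one @.
tight = re.compile(r"^\[?\w+@?\w*\]?[$#%>~]\s*$")

for candidate in (clean, "user@host> ", "@@@@$ ", "$ "):
    print(repr(candidate), bool(loose.match(candidate)), bool(tight.match(candidate)))
```

Both patterns accept the real-looking prompts, but only the loose one also accepts junk like "@@@@$ " or a bare "$ ", which is the practical cost of making every piece optional.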

One final note: * isn't "evil", it's just used improperly *a lot*. If you find yourself using it, think, "Is there another way I can do this?" Usually there is.