PerlMonks  

Before I get into demonstrating how to use YAPE::Regex::Explain, I need to point out a few mistakes you're consistently making.

.* is rarely necessary at the beginning of a RE, because an unanchored pattern can already match anywhere in the string. You probably don't need it as the first token of your REs unless you are later using $&, or unless you're wrapping it in parens and using a $1 (etc.) capturing variable. See Death to Dot Star for additional reading on this subject.
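To make that concrete, here's a minimal sketch. The sample string and the sub names are made up for illustration; they are not from your data. Both patterns succeed on the same input, and the leading .* only changes what ends up in $&.

```perl
use strict;
use warnings;

# Hypothetical sample line, invented for this example.
my $line = 'user@host-01 ~]$ ls';

# Whether each pattern matches at all (1 or 0).
sub matches_with_dotstar    { return $_[0] =~ /.*\]\$/ ? 1 : 0 }
sub matches_without_dotstar { return $_[0] =~ /\]\$/   ? 1 : 0 }

# What the overall match ($&) holds; this is where the leading .* matters.
sub matched_text_with_dotstar    { return $_[0] =~ /.*\]\$/ ? $& : undef }
sub matched_text_without_dotstar { return $_[0] =~ /\]\$/   ? $& : undef }

printf "with .*   : matched '%s'\n", matched_text_with_dotstar($line);
printf "without .*: matched '%s'\n", matched_text_without_dotstar($line);
```

If all you test is success or failure, the two are interchangeable here and the shorter pattern is easier to read.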

? is not a non-greedy substitute for *. On its own, ? means zero-or-one; *? is the non-greedy zero-or-more quantifier.

The * quantifier will allow empty strings to match. In other words, "\w*" will match one, two, hundreds, thousands of word characters, but it will also match no characters at all. Is this what you want? Maybe you want the + quantifier instead.
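The last two points are easy to check for yourself. This is only a sketch with invented sample strings and variable names, not anything from your prompts; it captures what each quantifier actually grabs.

```perl
use strict;
use warnings;

# Hypothetical sample text, chosen only for illustration.
my $text = 'aaa<b>bold</b>';

# a? is "zero or one a"; it is not a lazy form of a*.
my ($optional) = $text =~ /^(a?)/;     # takes at most one 'a'
my ($greedy)   = $text =~ /^(a*)/;     # takes as many 'a' as it can
my ($lazy)     = $text =~ /^(a*?)/;    # takes as few as it can (none here)

# Greedy versus lazy when something has to match afterwards.
my ($greedy_tag) = $text =~ /<(.*)>/;
my ($lazy_tag)   = $text =~ /<(.*?)>/;

# \w* succeeds even when there are no word characters; \w+ does not.
my $no_words     = '---';
my $star_matches = $no_words =~ /\w*/ ? 1 : 0;
my $plus_matches = $no_words =~ /\w+/ ? 1 : 0;

printf "a?  -> '%s'\na*  -> '%s'\na*? -> '%s'\n", $optional, $greedy, $lazy;
printf "<(.*)>  -> '%s'\n<(.*?)> -> '%s'\n", $greedy_tag, $lazy_tag;
printf "\\w* on '---': %d, \\w+ on '---': %d\n", $star_matches, $plus_matches;
```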

Ok, here we go again with the deciphering. This time I'm not going to do it by hand, but rather will demonstrate effective use of a great module:

use strict;
use warnings;
use YAPE::Regex::Explain;

#my $exp = YAPE::Regex::Explain->new($REx)->explain;

my $rex1 = qr/.*(\[*\w*\@*\-*\w*[$ #\%>~]\]|\\\[\\e\[0m\\\] \[0m)\s?/;
my $rex2 = qr/.*([$ #\%>~]|\[*\w*\@*\-*\w*\%\]*|\[*\w*\@*\-*\w*#\]*|\[*\w*\@*\-*\w*\$\]*|\[*\w*\@*\-*\w*>\]*|\\\[\\e\[0m\\\] \[0m)\s?/;

print YAPE::Regex::Explain->new($rex1)->explain;
print YAPE::Regex::Explain->new($rex2)->explain;

__OUTPUT__
The regular expression:

(?-imsx:.*(\[*\w*\@*\-*\w*[$ #%>~]\]|\\\[\\e\[0m\\\] \[0m)\s?)

matches as follows:

NODE                     EXPLANATION
----------------------------------------------------------------------
(?-imsx:                 group, but do not capture (case-sensitive)
                         (with ^ and $ matching normally) (with . not
                         matching \n) (matching whitespace and #
                         normally):
----------------------------------------------------------------------
  .*                       any character except \n (0 or more times
                           (matching the most amount possible))
----------------------------------------------------------------------
  (                        group and capture to \1:
----------------------------------------------------------------------
    \[*                      '[' (0 or more times (matching the most
                             amount possible))
----------------------------------------------------------------------
    \w*                      word characters (a-z, A-Z, 0-9, _) (0 or
                             more times (matching the most amount
                             possible))
----------------------------------------------------------------------
    \@*                      '@' (0 or more times (matching the most
                             amount possible))
----------------------------------------------------------------------
    \-*                      '-' (0 or more times (matching the most
                             amount possible))
----------------------------------------------------------------------
    \w*                      word characters (a-z, A-Z, 0-9, _) (0 or
                             more times (matching the most amount
                             possible))
----------------------------------------------------------------------
    [$ #%>~]                 any character of: '$', ' ', '#', '%',
                             '>', '~'
----------------------------------------------------------------------
    \]                       ']'
----------------------------------------------------------------------
   |                        OR
----------------------------------------------------------------------
    \\                       '\'
----------------------------------------------------------------------
    \[                       '['
----------------------------------------------------------------------
    \\                       '\'
----------------------------------------------------------------------
    e                        'e'
----------------------------------------------------------------------
    \[                       '['
----------------------------------------------------------------------
    0m                       '0m'
----------------------------------------------------------------------
    \\                       '\'
----------------------------------------------------------------------
    \]                       ']'
----------------------------------------------------------------------
                             ' '
----------------------------------------------------------------------
    \[                       '['
----------------------------------------------------------------------
    0m                       '0m'
----------------------------------------------------------------------
  )                        end of \1
----------------------------------------------------------------------
  \s?                      whitespace (\n, \r, \t, \f, and " ")
                           (optional (matching the most amount
                           possible))
----------------------------------------------------------------------
)                        end of grouping
----------------------------------------------------------------------

The regular expression:

(?-imsx:.*([$ #%>~]|\[*\w*\@*\-*\w*%\]*|\[*\w*\@*\-*\w*#\]*|\[*\w*\@*\-*\w*\$\]*|\[*\w*\@*\-*\w*>\]*|\\\[\\e\[0m\\\] \[0m)\s?)

matches as follows:

NODE                     EXPLANATION
----------------------------------------------------------------------
(?-imsx:                 group, but do not capture (case-sensitive)
                         (with ^ and $ matching normally) (with . not
                         matching \n) (matching whitespace and #
                         normally):
----------------------------------------------------------------------
  .*                       any character except \n (0 or more times
                           (matching the most amount possible))
----------------------------------------------------------------------
  (                        group and capture to \1:
----------------------------------------------------------------------
    [$ #%>~]                 any character of: '$', ' ', '#', '%',
                             '>', '~'
----------------------------------------------------------------------
   |                        OR
----------------------------------------------------------------------
    \[*                      '[' (0 or more times (matching the most
                             amount possible))
----------------------------------------------------------------------
    \w*                      word characters (a-z, A-Z, 0-9, _) (0 or
                             more times (matching the most amount
                             possible))
----------------------------------------------------------------------
    \@*                      '@' (0 or more times (matching the most
                             amount possible))
----------------------------------------------------------------------
    \-*                      '-' (0 or more times (matching the most
                             amount possible))
----------------------------------------------------------------------
    \w*                      word characters (a-z, A-Z, 0-9, _) (0 or
                             more times (matching the most amount
                             possible))
----------------------------------------------------------------------
    %                        '%'
----------------------------------------------------------------------
    \]*                      ']' (0 or more times (matching the most
                             amount possible))
----------------------------------------------------------------------
   |                        OR
----------------------------------------------------------------------
    \[*                      '[' (0 or more times (matching the most
                             amount possible))
----------------------------------------------------------------------
    \w*                      word characters (a-z, A-Z, 0-9, _) (0 or
                             more times (matching the most amount
                             possible))
----------------------------------------------------------------------
    \@*                      '@' (0 or more times (matching the most
                             amount possible))
----------------------------------------------------------------------
    \-*                      '-' (0 or more times (matching the most
                             amount possible))
----------------------------------------------------------------------
    \w*                      word characters (a-z, A-Z, 0-9, _) (0 or
                             more times (matching the most amount
                             possible))
----------------------------------------------------------------------
    #                        '#'
----------------------------------------------------------------------
    \]*                      ']' (0 or more times (matching the most
                             amount possible))
----------------------------------------------------------------------
   |                        OR
----------------------------------------------------------------------
    \[*                      '[' (0 or more times (matching the most
                             amount possible))
----------------------------------------------------------------------
    \w*                      word characters (a-z, A-Z, 0-9, _) (0 or
                             more times (matching the most amount
                             possible))
----------------------------------------------------------------------
    \@*                      '@' (0 or more times (matching the most
                             amount possible))
----------------------------------------------------------------------
    \-*                      '-' (0 or more times (matching the most
                             amount possible))
----------------------------------------------------------------------
    \w*                      word characters (a-z, A-Z, 0-9, _) (0 or
                             more times (matching the most amount
                             possible))
----------------------------------------------------------------------
    \$                       '$'
----------------------------------------------------------------------
    \]*                      ']' (0 or more times (matching the most
                             amount possible))
----------------------------------------------------------------------
   |                        OR
----------------------------------------------------------------------
    \[*                      '[' (0 or more times (matching the most
                             amount possible))
----------------------------------------------------------------------
    \w*                      word characters (a-z, A-Z, 0-9, _) (0 or
                             more times (matching the most amount
                             possible))
----------------------------------------------------------------------
    \@*                      '@' (0 or more times (matching the most
                             amount possible))
----------------------------------------------------------------------
    \-*                      '-' (0 or more times (matching the most
                             amount possible))
----------------------------------------------------------------------
    \w*                      word characters (a-z, A-Z, 0-9, _) (0 or
                             more times (matching the most amount
                             possible))
----------------------------------------------------------------------
    >                        '>'
----------------------------------------------------------------------
    \]*                      ']' (0 or more times (matching the most
                             amount possible))
----------------------------------------------------------------------
   |                        OR
----------------------------------------------------------------------
    \\                       '\'
----------------------------------------------------------------------
    \[                       '['
----------------------------------------------------------------------
    \\                       '\'
----------------------------------------------------------------------
    e                        'e'
----------------------------------------------------------------------
    \[                       '['
----------------------------------------------------------------------
    0m                       '0m'
----------------------------------------------------------------------
    \\                       '\'
----------------------------------------------------------------------
    \]                       ']'
----------------------------------------------------------------------
                             ' '
----------------------------------------------------------------------
    \[                       '['
----------------------------------------------------------------------
    0m                       '0m'
----------------------------------------------------------------------
  )                        end of \1
----------------------------------------------------------------------
  \s?                      whitespace (\n, \r, \t, \f, and " ")
                           (optional (matching the most amount
                           possible))
----------------------------------------------------------------------
)                        end of grouping
----------------------------------------------------------------------

It doesn't look to me like they're completely equivalent.
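One way to check that suspicion is to run both patterns over a handful of strings and list the ones where only one of them matches. The sketch below assumes the two patterns read the way the node-by-node explanation lists them; the sample strings and the disagreements helper are invented for illustration and are not from the original thread.

```perl
use strict;
use warnings;

# The two patterns from the post, written as the node-by-node
# explanation reads them.
my $rex1 = qr/.*(\[*\w*\@*\-*\w*[$ #\%>~]\]|\\\[\\e\[0m\\\] \[0m)\s?/;
my $rex2 = qr/.*([$ #\%>~]|\[*\w*\@*\-*\w*\%\]*|\[*\w*\@*\-*\w*#\]*|\[*\w*\@*\-*\w*\$\]*|\[*\w*\@*\-*\w*>\]*|\\\[\\e\[0m\\\] \[0m)\s?/;

# Hypothetical prompt-like strings, made up for this comparison.
my @samples = ('[user@host-1 ~]', 'user@host$', 'plain');

# Returns the strings on which exactly one of the two patterns matches.
sub disagreements {
    my ($re_a, $re_b, @strings) = @_;
    return grep { ($_ =~ $re_a ? 1 : 0) != ($_ =~ $re_b ? 1 : 0) } @strings;
}

my @diff = disagreements($rex1, $rex2, @samples);
printf "patterns disagree on: %s\n", $_ for @diff;
```

A single string in @diff is enough to show the patterns aren't interchangeable. An empty list proves nothing beyond the samples you tried.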

Also, please take an hour or so and read through perlrequick, perlretut and perlre. Until you've devoured those PODs you're going to be grasping at straws with regular expressions. If you really want to learn them inside and out, beg, buy, borrow, or steal (ok, don't steal) the Owl book, Mastering Regular Expressions by Jeffrey Friedl. It's an O'Reilly book, and probably the best book ever written on regexps.

Updated: Added link, suggested by broquaint.


Dave


In reply to Re: regex logical equivalence? by davido
in thread regex logical equivalence? by Anonymous Monk
