Understanding Regex

hareesh has asked for the wisdom of the Perl Monks concerning the following question:

Re: Understanding Regex by Discipulus (Canon) on Nov 18, 2015 at 07:53 UTC
They are in fact different. When playing with regexes I always suggest two tools and the use of YAPE::Regex::Explain:

    perl -MYAPE::Regex::Explain -e " print YAPE::Regex::Explain->new(qr/$ARGV[0]/)->explain();" ^\s*[^\s\=]+\s*=

    The regular expression:

    (?-imsx:\s*[\s\=]+\s*=)

    matches as follows:

    NODE                     EXPLANATION
    ----------------------------------------------------------------------
    (?-imsx:                 group, but do not capture (case-sensitive)
                             (with ^ and $ matching normally) (with . not
                             matching \n) (matching whitespace and #
                             normally):
    ----------------------------------------------------------------------
      \s*                      whitespace (\n, \r, \t, \f, and " ") (0 or
                               more times (matching the most amount
                               possible))
    ----------------------------------------------------------------------
      [\s\=]+                  any character of: whitespace (\n, \r, \t,
                               \f, and " "), '\', '=' (1 or more times
                               (matching the most amount possible))
    ----------------------------------------------------------------------
      \s*                      whitespace (\n, \r, \t, \f, and " ") (0 or
                               more times (matching the most amount
                               possible))
    ----------------------------------------------------------------------
      =                        '='
    ----------------------------------------------------------------------
    )                        end of grouping
    ----------------------------------------------------------------------

    perl -MYAPE::Regex::Explain -e " print YAPE::Regex::Explain->new(qr/$ARGV[0]/)->explain();" ^\s*\w+\s*=

    The regular expression:

    (?-imsx:\s*\w+\s*=)

    matches as follows:

    NODE                     EXPLANATION
    ----------------------------------------------------------------------
    (?-imsx:                 group, but do not capture (case-sensitive)
                             (with ^ and $ matching normally) (with . not
                             matching \n) (matching whitespace and #
                             normally):
    ----------------------------------------------------------------------
      \s*                      whitespace (\n, \r, \t, \f, and " ") (0 or
                               more times (matching the most amount
                               possible))
    ----------------------------------------------------------------------
      \w+                      word characters (a-z, A-Z, 0-9, _) (1 or
                               more times (matching the most amount
                               possible))
    ----------------------------------------------------------------------
      \s*                      whitespace (\n, \r, \t, \f, and " ") (0 or
                               more times (matching the most amount
                               possible))
    ----------------------------------------------------------------------
      =                        '='
    ----------------------------------------------------------------------
    )                        end of grouping
    ----------------------------------------------------------------------

HtH
L*

UPDATE: see the wise advice from Lotus1 and my next post: `^` was vaporized by the command line processor.

L*
There are no rules, there are no thumbs..
Reinvent the wheel, then learn The Wheel; may be one day you reinvent one of THE WHEELS.
Re^2: Understanding Regex by Lotus1 (Vicar) on Nov 18, 2015 at 16:19 UTC
It looks like you missed the caret, '^', inside the character group when you ran the explain function the first time. It should be `any character except: whitespace...`
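
The effect of the missing caret can be checked directly in Perl. What follows is only a minimal sketch: the sample lines and the `matches` helper are made up for illustration and are not from the thread, and both patterns keep the leading `^` anchor so that the caret inside the character class is the only difference.

```perl
use strict;
use warnings;

# Character class WITHOUT the caret (what the explain tool was actually given)
# versus WITH the caret (what was intended). Both keep the leading anchor.
my $without_caret = qr/^\s*[\s\=]+\s*=/;
my $with_caret    = qr/^\s*[^\s\=]+\s*=/;

# Hypothetical helper: returns 1 if $text matches $re, otherwise 0.
sub matches {
    my ($re, $text) = @_;
    return $text =~ $re ? 1 : 0;
}

# Made-up sample lines, chosen only to show the difference.
my @samples = ('name = value', '  = value');

for my $line (@samples) {
    printf "%-16s without caret: %d   with caret: %d\n",
        "'$line'",
        matches($without_caret, $line),
        matches($with_caret,    $line);
}
```

With these samples, 'name = value' matches only the negated class, while '  = value' (no key at all) matches only the class without the caret, which is almost certainly not what anyone wants.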
Re^3: Understanding Regex by Discipulus (Canon) on Nov 19, 2015 at 07:54 UTC
Well spotted, and thanks Lotus1! It was only partially my fault: the code was correct but, as I work on an unfriendly OS, I need to put double quotes around the argument. So the actual command line became:

    perl -MYAPE::Regex::Explain -e " print YAPE::Regex::Explain->new(qr/$ARGV[0]/)->explain();" "^\s*[^\s\=]+\s*="

which gives the right output, and the second one:

    perl -MYAPE::Regex::Explain -e " print YAPE::Regex::Explain->new(qr/$ARGV[0]/)->explain();" "^\s*\w+\s*="

which gives its right output too.

For the OP it would probably have been best to highlight only the part that differs:

    perl -MYAPE::Regex::Explain -e " print YAPE::Regex::Explain->new(qr/$ARGV[0]/)->explain();" "[^\s\=]"
    ..
    ----------------------------------------------------------------------
      [^\s\=]                  any character except: whitespace (\n, \r,
                               \t, \f, and " "), '\', '='
    ----------------------------------------------------------------------

    perl -MYAPE::Regex::Explain -e " print YAPE::Regex::Explain->new(qr/$ARGV[0]/)->explain();" "\w"
    ..
    ----------------------------------------------------------------------
      \w                       word characters (a-z, A-Z, 0-9, _)
    ----------------------------------------------------------------------

L*
Re: Understanding Regex by Anonymous Monk on Nov 18, 2015 at 08:02 UTC
Both regexes are very basic, and both are covered in perlintro. Your regex limits the "key" to word characters; his regex only forbids whitespace and a literal =. There is a principle: be liberal in what you accept, and conservative in what you send.
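
To make the difference concrete, here is a small sketch. It assumes the two patterns under discussion are `^\s*\w+\s*=` and `^\s*[^\s\=]+\s*=`; the capture groups, the `key_of` helper and the sample lines are additions for illustration only and do not come from the thread.

```perl
use strict;
use warnings;

# Strict: the key may contain only word characters (a-z, A-Z, 0-9, _).
my $strict_key  = qr/^\s*(\w+)\s*=/;
# Liberal: the key may contain anything except whitespace and '='.
my $liberal_key = qr/^\s*([^\s\=]+)\s*=/;

# Hypothetical helper: returns the captured key, or undef when the line
# does not match the given pattern.
sub key_of {
    my ($re, $line) = @_;
    return $line =~ $re ? $1 : undef;
}

# Made-up sample lines.
for my $line ('user_name = bob', 'user-name = bob', 'db.host=localhost') {
    my $strict  = key_of($strict_key,  $line);
    my $liberal = key_of($liberal_key, $line);
    printf "%-20s strict: %-12s liberal: %s\n",
        $line,
        defined $strict  ? $strict  : '(no match)',
        defined $liberal ? $liberal : '(no match)';
}
```

Only 'user_name = bob' yields a key under the strict pattern; the liberal pattern also accepts 'user-name' and 'db.host' as keys. Which one is preferable depends on how much variety the input is expected to have.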