hareesh has asked for the wisdom of the Perl Monks concerning the following question:

Hello Monks, I am a newbie to Perl. I would be happy if someone could help clear up my confusion regarding a regex. I have a Perl script written by a senior colleague who is no longer in our group.

To match a string something like this

   sbcd  = 1.3456

he used the following regex

 /^\s*[^\s\*=]+\s*=/

But I could not understand why he had to write such a complex regex for that, since the following regex could have done the job

 /^\s*\w+\s*=/

Any insights from anyone? Thanks.

Re: Understanding Regex
by Discipulus (Canon) on Nov 18, 2015 at 07:53 UTC
    They are in fact different. When playing with regexes I always suggest two tools and the use of YAPE::Regex::Explain

    perl -MYAPE::Regex::Explain -e " print YAPE::Regex::Explain->new(qr/$ARGV[0]/)->explain();" ^\s*[^\s\*=]+\s*=

    The regular expression:

    (?-imsx:\s*[\s\*=]+\s*=)

    matches as follows:

    NODE                     EXPLANATION
    ----------------------------------------------------------------------
    (?-imsx:                 group, but do not capture (case-sensitive)
                             (with ^ and $ matching normally) (with . not
                             matching \n) (matching whitespace and #
                             normally):
    ----------------------------------------------------------------------
      \s*                      whitespace (\n, \r, \t, \f, and " ") (0 or
                               more times (matching the most amount
                               possible))
    ----------------------------------------------------------------------
      [\s\*=]+                 any character of: whitespace (\n, \r, \t,
                               \f, and " "), '\*', '=' (1 or more times
                               (matching the most amount possible))
    ----------------------------------------------------------------------
      \s*                      whitespace (\n, \r, \t, \f, and " ") (0 or
                               more times (matching the most amount
                               possible))
    ----------------------------------------------------------------------
      =                        '='
    ----------------------------------------------------------------------
    )                        end of grouping
    ----------------------------------------------------------------------

    perl -MYAPE::Regex::Explain -e " print YAPE::Regex::Explain->new(qr/$ARGV[0]/)->explain();" ^\s*\w+\s*=

    The regular expression:

    (?-imsx:\s*\w+\s*=)

    matches as follows:

    NODE                     EXPLANATION
    ----------------------------------------------------------------------
    (?-imsx:                 group, but do not capture (case-sensitive)
                             (with ^ and $ matching normally) (with . not
                             matching \n) (matching whitespace and #
                             normally):
    ----------------------------------------------------------------------
      \s*                      whitespace (\n, \r, \t, \f, and " ") (0 or
                               more times (matching the most amount
                               possible))
    ----------------------------------------------------------------------
      \w+                      word characters (a-z, A-Z, 0-9, _) (1 or
                               more times (matching the most amount
                               possible))
    ----------------------------------------------------------------------
      \s*                      whitespace (\n, \r, \t, \f, and " ") (0 or
                               more times (matching the most amount
                               possible))
    ----------------------------------------------------------------------
      =                        '='
    ----------------------------------------------------------------------
    )                        end of grouping
    ----------------------------------------------------------------------


    HtH
    L*
    UPDATE: see the wise advice from Lotus1 and my next post: ^ was vaporized by the command line processor.
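
One way to see what the update describes is to keep the patterns in single-quoted Perl strings, which never pass through a shell, and compare them with a caret-free copy. This is only a sketch: count_carets and strip_carets are hypothetical helpers written for this illustration, and strip_carets merely simulates a command line processor that swallows every unquoted ^ (as Windows cmd.exe is known to do; that this is the OS involved here is an assumption).

```perl
use strict;
use warnings;

# Hypothetical helper: count the carets present in a pattern string.
sub count_carets {
    my ($pattern) = @_;
    my $count = () = $pattern =~ /\^/g;
    return $count;
}

# Hypothetical helper: simulate a shell that swallows every caret.
# In Perl's tr/// the ^ is an ordinary character, so this deletes carets only.
sub strip_carets {
    my ($pattern) = @_;
    (my $stripped = $pattern) =~ tr/^//d;
    return $stripped;
}

# The two patterns from the thread, safe from the shell in single quotes.
my @patterns = ('^\s*[^\s\*=]+\s*=', '^\s*\w+\s*=');

for my $pattern (@patterns) {
    printf "intended: %-20s carets: %d  after stripping: %s\n",
        $pattern, count_carets($pattern), strip_carets($pattern);
}
```

The stripped form of the first pattern is exactly the regex that appears in the first explain output, which is why the character class was reported as "any character of" instead of "any character except".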

    L*
    There are no rules, there are no thumbs..
    Reinvent the wheel, then learn The Wheel; may be one day you reinvent one of THE WHEELS.

      It looks like you missed the caret, '^', inside the character group when you ran the explain function the first time. It should be any character except: whitespace...

        Well spotted, and thanks Lotus1.
        It was only partially my fault: the code was correct but, as I work on an unfriendly OS, I need to put double quotes around the argument.
        So the actual command line became:
        perl -MYAPE::Regex::Explain -e " print YAPE::Regex::Explain->new(qr/$ARGV[0]/)->explain();" "^\s*[^\s\*=]+\s*="
        with the right output, and the second one:
        perl -MYAPE::Regex::Explain -e " print YAPE::Regex::Explain->new(qr/$ARGV[0]/)->explain();" "^\s*\w+\s*="
        with its right output.
        For the OP, the best approach was probably to highlight only the part that differs:

        perl -MYAPE::Regex::Explain -e " print YAPE::Regex::Explain->new(qr/$ARGV[0]/)->explain();" "[^\s\*=]"
        ..
        ----------------------------------------------------------------------
          [^\s\*=]                 any character except: whitespace (\n, \r,
                                   \t, \f, and " "), '\*', '='
        ----------------------------------------------------------------------

        perl -MYAPE::Regex::Explain -e " print YAPE::Regex::Explain->new(qr/$ARGV[0]/)->explain();" "\w"
        ..
        ----------------------------------------------------------------------
          \w                       word characters (a-z, A-Z, 0-9, _)
        ----------------------------------------------------------------------
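
The practical effect of that difference can be checked by running both regexes over a few lines. This is a minimal sketch, not the senior's actual test data: only the first sample line comes from the question, the others are hypothetical inputs chosen to probe the character class, and verdict is a helper name made up for the illustration.

```perl
use strict;
use warnings;

my $original = qr/^\s*[^\s\*=]+\s*=/;   # regex from the inherited script
my $proposed = qr/^\s*\w+\s*=/;         # simpler regex from the question

# Only the first line comes from the thread; the rest are hypothetical.
my @lines = (
    '   sbcd  = 1.3456',
    'foo.bar = 2',
    'my-key = 3',
    'a*b = 4',
    'two words = 5',
);

# Hypothetical helper: describe whether a line matches a regex.
sub verdict {
    my ($re, $line) = @_;
    return $line =~ $re ? 'match' : 'no match';
}

for my $line (@lines) {
    printf "%-22s original: %-9s proposed: %s\n",
        "'$line'", verdict($original, $line), verdict($proposed, $line);
}
```

Both regexes accept the line from the question, and both reject the names containing '*' or a space. They disagree on 'foo.bar' and 'my-key': the negated class allows any non-space character other than '*' and '=', while \w stops at '.' and '-'. Whether the original author needed such names is not stated in the thread.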


        L*

        There are no rules, there are no thumbs..
        Reinvent the wheel, then learn The Wheel; may be one day you reinvent one of THE WHEELS.
Re: Understanding Regex
by Anonymous Monk on Nov 18, 2015 at 08:02 UTC