Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks!
I was reading a Perl posting and found this regular expression interesting. How could it be explained?
Here it is:
if ($data =~ /^\s*$/) { ..do this...} else{ ..that...}

Thanks!

Replies are listed 'Best First'.
Re: Explaining a Regular Expression
by toolic (Bishop) on Jul 06, 2010 at 15:04 UTC
    See also YAPE::Regex::Explain:
    use strict;
    use warnings;
    use YAPE::Regex::Explain;
    print YAPE::Regex::Explain->new('^\s*$')->explain();

    __END__

    The regular expression:

    (?-imsx:^\s*$)

    matches as follows:

    NODE                     EXPLANATION
    ----------------------------------------------------------------------
    (?-imsx:                 group, but do not capture (case-sensitive)
                             (with ^ and $ matching normally) (with . not
                             matching \n) (matching whitespace and #
                             normally):
    ----------------------------------------------------------------------
      ^                        the beginning of the string
    ----------------------------------------------------------------------
      \s*                      whitespace (\n, \r, \t, \f, and " ") (0 or
                               more times (matching the most amount
                               possible))
    ----------------------------------------------------------------------
      $                        before an optional \n, and the end of the
                               string
    ----------------------------------------------------------------------
    )                        end of grouping
    ----------------------------------------------------------------------
Re: Explaining a Regular Expression
by kennethk (Abbot) on Jul 06, 2010 at 15:02 UTC
    In general, the go-to sources for information on regular expressions are perlre (manual) and perlretut (tutorial). In this particular case, I explained this exact regular expression on Friday in the 5th bullet of Re^3: Regular Expression Help!:

    You could use anchors (^ and $) and * to do what you intend as well with $x=~/^\s*$/. This translates as requiring that between the start (^) and end ($) there is nothing but 0 or more (*) whitespace characters (\s).
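    To see why the anchors matter, here is a minimal sketch. The sample strings and variable names are made up for illustration and are not from the linked node. Without ^ and $, \s* is allowed to match zero characters, so the unanchored pattern succeeds on every string; only the anchored form says something about the whole string.

```perl
use strict;
use warnings;

# Sample strings are made up for illustration.
my @samples = ("", "   ", "\t\n", "blort", "  x  ");

my (@unanchored, @anchored);
for my $x (@samples) {
    # \s* alone may match zero characters, so this succeeds for every string.
    push @unanchored, ($x =~ /\s*/ ? 1 : 0);
    # With both anchors, the whole string must be empty or all whitespace.
    push @anchored, ($x =~ /^\s*$/ ? 1 : 0);
}

print "unanchored: @unanchored\n";   # unanchored: 1 1 1 1 1
print "anchored:   @anchored\n";     # anchored:   1 1 1 0 0
```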
Re: Explaining a Regular Expression
by Corion (Patriarch) on Jul 06, 2010 at 14:57 UTC
Re: Explaining a Regular Expression
by zek152 (Pilgrim) on Jul 06, 2010 at 15:05 UTC

    Put most simply: if the string stored in $data consists of nothing but 0 or more whitespace characters between the start of the string and the end of the string, then do this; otherwise do that. The pattern matches empty strings, blank lines, and lines that contain only whitespace.

    As mentioned, you should read perlre (link found above).
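    One common use of this test is skipping blank lines. The following is only a sketch: the @lines array is hypothetical input that in real code might come from a file handle.

```perl
use strict;
use warnings;

# Hypothetical input lines, including an empty line and a whitespace-only line.
my @lines = ("first\n", "\n", "   \t\n", "second\n", "");

# Keep only the lines that do NOT match the "blank" pattern.
my @kept = grep { $_ !~ /^\s*$/ } @lines;

print scalar(@kept), " non-blank lines\n";   # 2 non-blank lines
print for @kept;
```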

Re: Explaining a Regular Expression
by marinersk (Priest) on Jul 06, 2010 at 21:32 UTC

    The answers above are very good, but what I suspect you're driving at is a strategy for reading regular expressions.

    So I'll first take a stab at summarizing, then show you how I would decipher this.

    SUMMARY: Is it blank (as seen by a human)? i.e., either has nothing in it, or has nothing but whitespace in it.

    STRATEGY:

    1) Look at the first and last characters.
    1a) ^ on left means "anchor left", meaning the match must be at the start of the string.
    1b) $ on the right means "anchor right", meaning the match must be at the end of the string.

    EXAMPLE:

    my $data = 'This is a test of the anchors.';
    if ($data =~ /test/) IS TRUE because "test" does occur somewhere in the string.
    if ($data =~ /^test/) IS FALSE because it doesn't start with "test".
    if ($data =~ /test$/) IS FALSE because it doesn't end with "test".
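    The three checks above can be run as written in the sketch below. The last two patterns (/^This/ and /anchors\.$/) are extra cases added here for contrast, and the variable names are made up.

```perl
use strict;
use warnings;

my $data = 'This is a test of the anchors.';

# The three checks from the example.
my $anywhere = ($data =~ /test/)  ? 'TRUE' : 'FALSE';   # "test" occurs somewhere
my $at_start = ($data =~ /^test/) ? 'TRUE' : 'FALSE';   # string starts with "This"
my $at_end   = ($data =~ /test$/) ? 'TRUE' : 'FALSE';   # string ends with "anchors."

# Added for contrast: patterns that do satisfy each anchor.
my $starts_this  = ($data =~ /^This/)       ? 'TRUE' : 'FALSE';
my $ends_anchors = ($data =~ /anchors\.$/)  ? 'TRUE' : 'FALSE';

printf "%-14s %s\n", '/test/',        $anywhere;       # TRUE
printf "%-14s %s\n", '/^test/',       $at_start;       # FALSE
printf "%-14s %s\n", '/test$/',       $at_end;         # FALSE
printf "%-14s %s\n", '/^This/',       $starts_this;    # TRUE
printf "%-14s %s\n", '/anchors\.$/',  $ends_anchors;   # TRUE
```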

    2) What's left: \s*
    2a) \s means whitespace
    2b) * means 0 or more

    Hence, 0 or more whitespace characters.

    3) Put it back together.

    0 or more whitespace characters, which must start at the beginning and finish at the end. Any non-whitespace character anywhere in the string triggers the else clause.

    Thus, in human (and therefore terribly imprecise) terms, "Is it blank?"
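    Putting the pieces back together, the "Is it blank?" question can be wrapped in a small sub. This is only a sketch: is_blank is a hypothetical helper name and the test strings are made up.

```perl
use strict;
use warnings;

# is_blank is a hypothetical helper name, not something from the thread.
# Returns 1 if the string is empty or contains only whitespace, else 0.
sub is_blank {
    my ($data) = @_;
    return $data =~ /^\s*$/ ? 1 : 0;
}

for my $s ("", " ", "\t \n", "\n\n", "a", " a ") {
    # Make tabs and newlines visible in the printed label.
    (my $shown = $s) =~ s/\n/\\n/g;
    $shown =~ s/\t/\\t/g;
    printf "%-10s %s\n", "'$shown'", is_blank($s) ? "blank" : "not blank";
}
```

    The first four strings are reported as blank and the last two as not blank, because a single non-whitespace character is enough to make the match fail.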

    Have fun. You do get better at reading regular expressions once you've written a few. However, after all these years, I still have to shake my brain a few times to read the really convoluted ones. :: grin ::

    Edit: s/strategty/strategy/

Re: Explaining a Regular Expression
by rovf (Priest) on Jul 06, 2010 at 15:07 UTC

    It's an expression which always matches, i.e. the else branch won't be executed. UPDATE: What I wrote was nonsense. I should have said: it is an expression which matches iff the line contains only whitespace.

    -- 
    Ronald Fischer <ynnor@mm.st>
      $ perl -le '$_ = "blort"; if (/^\s*$/) { print "always matches" } else { print "Orly?" }'
      Orly?

      The cake is a lie.
      The cake is a lie.
      The cake is a lie.

      This expression does not always match. The presence of ^ and $ imposes conditions that cause it to fail on some strings. The code below shows a simple case which does not match:

      $data = "This doesn't match\n";
      if ($data =~ /^\s*$/) {
          print "$data Matches\n";
      } else {
          print "$data Does not match\n";
      }

      #OUTPUT
      #This doesn't match
      # Does not match
        Sorry, this was probably the most stupid answer I have ever given in this forum. I should have looked closer - will update my node.

        -- 
        Ronald Fischer <ynnor@mm.st>