Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hmm.
if ($line =~ /(.*?)\s*?(.*?$)/) { $grp = $1; $dsc = $2; }
The text captured in $1 may be followed by nothing at all, or by one or more whitespace characters and then a description.

What's wrong with this? Here's some sample data:

alt.philosophy.jarf The Jarf philosphy/metaphysics/religion/culture.
alt.philosophy.objectivism A product of the Ayn Rand corporation.
alt.philosophy.zen Meditating on how the alt.* namespace works.
microsoft.public.win95.setup
microsoft.public.win95.shellui
microsoft.public.win95.win95applets
The alt.* entries are matched; however, the microsoft.* entries are not.

Edit: chipmunk 2002-01-09

Re: regex problem
by japhy (Canon) on Jan 10, 2002 at 00:26 UTC
    It matches for me, but not as you want it to. Your regex is too relaxed. The (.*?) tries to match zero characters first (and succeeds). The \s*? tries to match zero whitespace characters (and succeeds). The (.*?$) tries matching zero characters (and fails, because $ cannot match there), and eventually ends up matching the whole line, so $1 is empty and $2 holds everything. Boo.
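
To see that behaviour directly, here is a minimal sketch that runs the original pattern on one of the sample lines. The helper name lazy_captures is made up for this illustration and is not from the original post.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the two captures produced by the original
# pattern, or an empty list if the pattern does not match.
sub lazy_captures {
    my ($line) = @_;
    if ($line =~ /(.*?)\s*?(.*?$)/) {
        my ($grp, $dsc) = ($1, $2);   # copy the captures before returning
        return ($grp, $dsc);
    }
    return;
}

my ($grp, $dsc) = lazy_captures('alt.philosophy.zen Meditating on how the alt.* namespace works.');
print "grp=[$grp]\n";   # grp is the empty string
print "dsc=[$dsc]\n";   # dsc is the entire line
```

Both lazy quantifiers at the front are satisfied by matching nothing, so the final group is the only one forced to consume text.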

    You want something like /(.*?)\s+(.*)/, or perhaps /(\S+)\s+(.*)/. Better yet, use split():

      my ($ng, $desc) = split ' ', $record, 2;
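
One thing worth checking with the two regexes above: \s+ requires at least one whitespace character, so a line holding only a group name (like the microsoft.* samples) will not match either of them. The sketch below compares them against a variant using \s* instead. That third pattern, the %pattern table, and the try_pattern helper are assumptions added for illustration and do not appear in the thread.

```perl
use strict;
use warnings;

# The two patterns from the reply, plus a hypothetical \s* variant.
my %pattern = (
    'lazy, \s+' => qr/(.*?)\s+(.*)/,
    '\S+, \s+'  => qr/(\S+)\s+(.*)/,
    '\S+, \s*'  => qr/(\S+)\s*(.*)/,   # assumption: not from the thread
);

# Returns (group, description) on a match, or an empty list otherwise.
sub try_pattern {
    my ($re, $line) = @_;
    if ($line =~ $re) {
        my ($grp, $dsc) = ($1, $2);
        return ($grp, $dsc);
    }
    return;
}

for my $line ('alt.philosophy.zen Meditating on how the alt.* namespace works.',
              'microsoft.public.win95.setup') {
    print "$line\n";
    for my $name (sort keys %pattern) {
        my @got = try_pattern($pattern{$name}, $line);
        my $result = @got ? "grp=[$got[0]] dsc=[$got[1]]" : 'no match';
        print "  ${name}: $result\n";
    }
}
```

The split() form avoids the question entirely, since it returns just one field when there is nothing after the group name.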

    _____________________________________________________
    Jeff[japhy]Pinyan: Perl, regex, and perl hacker.
    s++=END;++y(;-P)}y js++=;shajsj<++y(p-q)}?print:??;

      my ($ng, $desc) = split ' ', $record, 2;

      did the trick with:

       $desc = "no description available" unless ($desc);

      Thanks for your help!!
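
Put together, the approach from this thread might look like the sketch below. The parse_record name and the hard-coded @records list are assumptions for illustration; the real script presumably reads its lines from a file.

```perl
use strict;
use warnings;

# Split one record into (newsgroup, description), substituting a default
# when the description is missing or empty.
sub parse_record {
    my ($record) = @_;
    my ($ng, $desc) = split ' ', $record, 2;
    $desc = "no description available" unless ($desc);
    return ($ng, $desc);
}

# Sample lines taken from the question.
my @records = (
    'alt.philosophy.objectivism A product of the Ayn Rand corporation.',
    'alt.philosophy.zen Meditating on how the alt.* namespace works.',
    'microsoft.public.win95.setup',
);

for my $record (@records) {
    my ($ng, $desc) = parse_record($record);
    print "$ng => $desc\n";
}
```

Note that the unless ($desc) test also replaces a description that is literally "0"; checking defined and length instead would avoid that if it ever matters.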