I had the pleasure of being at TPC5 in San Diego, and I thought Larry gave a great State of the Onion address. I know people who have some concerns about the intended syntax changes in Perl6, but Larry sold me on every one of them.

Except maybe one: he seems intent on making the /x modifier on by default in a regular expression. I have to admit that this will encourage the use of cleaner regex code, but I'm a little concerned about how this will affect newcomers to the language.

It's not unreasonable for beginners to expect a literal space character to match the whitespace in their strings. And even though they probably should be using \s+ instead of space characters, the truth is that this makes the language a little bit harder to learn. I find this amazing, since many of the other Perl6 syntax changes seem to have the goal (or at least a side-effect) of making Perl easier to learn.

Does anyone else have an opinion on this, before it becomes set in stone?

buckaduck

Re: /x on regexes in Perl6?
by japhy (Canon) on Jul 31, 2001 at 01:44 UTC
    Gee, I wish I had seen you there...

    Well, I suppose I'm a good person to have answer this. By making /x the default, people will be inclined to space out (and comment!) their regexes. This is definitely a help.

    _____________________________________________________
    Jeff japhy Pinyan: Perl, regex, and perl hacker.
    s++=END;++y(;-P)}y js++=;shajsj<++y(p-q)}?print:??;

      So matching a literal space will require something like /\ /? (Supposing I don't want to match a tab or a newline character, and don't want to match a character class.)

      That's rare enough, I can't see where it would be a burden on me. Might be tough to read, though, which makes the character class a better solution. Or a quick comment, which is probably the best solution.

      if ($string =~ /\ # literal space here
                     /) {
          # do this or that
      }
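For anyone who wants to try this in Perl 5, where /x still has to be requested explicitly, here is a minimal sketch of three ways to keep a literal space in an extended pattern. The variable names and the sample string are made up for illustration, and how Perl6 will finally spell this is not settled in this thread.

```perl
use strict;
use warnings;

my $string = "foo bar";   # hypothetical sample input

# Under /x a bare space in the pattern is ignored, so each pattern
# below spells out the literal space explicitly.
my @patterns = (
    qr/foo \  bar/x,      # backslash-escaped space
    qr/foo [ ] bar/x,     # space inside a character class
    qr/foo \x20 bar/x,    # hex escape for the space character
);

# grep in scalar context returns the number of patterns that matched.
my $hits = grep { $string =~ $_ } @patterns;
print "$hits of 3 patterns matched a space\n";

# A tab is not a literal space, so none of the patterns match it.
my $tab_hits = grep { "foo\tbar" =~ $_ } @patterns;
print "$tab_hits of 3 patterns matched a tab\n";
```

The character class is arguably the easiest of the three to spot when reading, which fits the point made above.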
Re: /x on regexes in Perl6?
by spudzeppelin (Pilgrim) on Jul 31, 2001 at 02:58 UTC

    I agree with japhy -- I wish you would have introduced yourself to some of the others of us who were there as well (not that japhy and I met each other either, but that was more logistics than intent).

    Now, about the whitespace: it may be a reasonable assumption that someone new to the language would want to see his/her whitespace interpreted literally, but doing so is almost always going to be suboptimal; I would speculate that 90% of the time or more, the new user is actually using literal whitespace when he/she really wants either \s (to match spaces AND tabs) or \b (to match a word boundary, whether it be beginning/end of line, whitespace, or punctuation). So, by forcing the initiate to be explicit about what kind of whitespace he/she is looking for, perl 6 is cutting down the debugging load and trying to impose good programming style.
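The difference between the three choices can be seen in a small Perl 5 sketch. The sample strings and variable names are assumptions chosen for illustration only.

```perl
use strict;
use warnings;

# Hypothetical sample strings: a space, a tab, and punctuation between the words.
my @inputs = ("Hello World", "Hello\tWorld", "Hello, World");
my %seen;

for my $input (@inputs) {
    my $literal  = $input =~ /Hello World/      ? 1 : 0;  # exactly one space character
    my $any_ws   = $input =~ /Hello\s+World/    ? 1 : 0;  # one or more whitespace characters
    my $boundary = $input =~ /Hello\b.*\bWorld/ ? 1 : 0;  # word boundaries, anything in between
    $seen{$input} = "$literal $any_ws $boundary";

    (my $shown = $input) =~ s/\t/\\t/;   # make the tab visible when printing
    print "$shown: literal=$literal whitespace=$any_ws boundary=$boundary\n";
}
```

Only the first string satisfies the literal-space pattern, the tab-separated one needs \s+, and the comma-separated one only matches the \b version. Note that \s also matches newlines and a few other whitespace characters, not just spaces and tabs.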

    Spud Zeppelin * spud@spudzeppelin.com

Re: /x on regexes in Perl6?
by VSarkiss (Monsignor) on Jul 31, 2001 at 07:25 UTC

    I like the idea of making /x the default, and I think it may actually make learning regexps easier for a newcomer.

    I found non-literal white space strange in regexps also, but that's mainly from having used sed, grep, etc., for umpty-ump years. If you're just learning the language, I think it's a lot easier to understand something like:

    A regular expression is formed from one or more elements that are processed in sequence, much like statements and blocks are executed. You may line up these elements any way you like, since white space is not significant between them; you may use comments freely, just like the rest of your code.
    Then it doesn't seem weird to use \s+ to match whitespace. It makes it clear that regular expressions are much closer to code than to literal strings -- the only other place where white space is interpreted literally.
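A minimal Perl 5 sketch of that "elements in sequence" style, with /x requested explicitly. The input line and the variable names are invented for the example.

```perl
use strict;
use warnings;

my $line = "2001-07-31   State of the Onion";   # hypothetical input

# Each element sits on its own line with a comment, like statements in code.
# Whitespace inside the pattern is layout only; real whitespace is \s+.
my ($year, $month, $day, $title) = $line =~ /
    ^ (\d{4}) - (\d{2}) - (\d{2})   # date, captured piece by piece
    \s+                             # whitespace has to be asked for
    (.+) \z                         # the rest of the line is the title
/x;

print "$year-$month-$day: $title\n";
```

One thing to watch in Perl 5: a comment inside the pattern must not contain the delimiter (here a slash), because the end of the pattern is found before the comments are stripped.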

    When I first learned Perl REs, I breathed a sigh of relief when I read that all \ specials are alphabetic. I think being told that I could line them up any way I wanted would have been the icing on the cake.

Re: /x on regexes in Perl6?
by iakobski (Pilgrim) on Jul 31, 2001 at 13:53 UTC
    When I first came across regexes it was when I was learning Perl, and I was actually quite surprised that a space matched a literal space. Especially when there was \s to use for what I wanted. I also found "\s+" much easier to read than " +".

    -- iakobski

Re: /x on regexes in Perl6?
by buckaduck (Chaplain) on Aug 08, 2001 at 01:56 UTC
    I understand the points that were made here. Heck, I made most of those points myself in my question.

    But deep down, I have a problem with the fact that the following regex won't match if the /x modifier is on by default:

    $str = 'Hello World!';
    # This will NOT match in Perl6?
    print 'Matched' if $str =~ /Hello World/;
    I admit that new users already have to become aware of many special regex metacharacters. I just think that using the space character as a metacharacter is perhaps non-intuitive.

    I agree that the pattern /Hello\s+World/ is usually more "correct". But the pattern /Hello World/ is often perfectly usable, and it's also much easier to understand. In my humble opinion, anything which makes a simple "Hello World" example fail gives a programming language a steeper learning curve.
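The concern can be reproduced in Perl 5 by adding /x by hand. This is only a stand-in for the proposed Perl6 default, on the assumption that it would treat pattern whitespace the way /x does today. The variable names are illustrative.

```perl
use strict;
use warnings;

my $str = 'Hello World!';

my $plain    = $str =~ /Hello World/    ? 1 : 0;  # Perl 5 default: the space is literal
my $extended = $str =~ /Hello World/x   ? 1 : 0;  # with x the pattern is effectively HelloWorld
my $explicit = $str =~ /Hello\s+World/x ? 1 : 0;  # whitespace requested explicitly

print "plain=$plain extended=$extended explicit=$explicit\n";
```

The middle pattern fails silently, with no warning under strict and warnings, which is exactly the kind of surprise a beginner would have to debug.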

    buckaduck