in reply to Reg Expression on file name

I guess the error was "it doesn't work" ;-) 1

The '*' in regular expressions is a quantifier (0 or more) for whatever stood before it. It's not like '*' in shell globbing, where it stands for any number of characters.

next unless $file =~ m/^ AB \s DAT .* \.doc $/x;
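
To make the difference concrete, here is a small sketch. The file names are made up for illustration, and the first pattern is only my guess at what a glob-style attempt might look like:

```perl
use strict;
use warnings;

# Hypothetical file names, invented for this illustration.
my @files = ('AB DAT report.doc', 'AB DAT.doc', 'AB DA.doc',
             'XAB DAT.doc', 'AB DAT notes.txt');

# In a regex, '*' repeats the item before it: this pattern means
# "AB DA", then zero or more 'T', then ".doc". It does not mean
# "anything in between" as it would in a shell glob.
my @star_only = grep { /^AB DAT*\.doc$/ } @files;

# '.*' is the regex spelling of the shell's '*': any run of characters.
my @dot_star = grep { m/^ AB \s DAT .* \.doc $/x } @files;

print "T* matched: ", join(', ', @star_only), "\n";
print ".* matched: ", join(', ', @dot_star), "\n";
```

The bare '*' version accepts 'AB DA.doc' and rejects 'AB DAT report.doc', which is the opposite of what a glob user would expect.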

A regexp is overhead here, because you only check for constant strings: 'AB DAT' (that space is probably not going to be a newline or tab someday) and '.doc', so it's more effective to say:

next unless substr($file,0,6) eq 'AB DAT' and substr($file,-4) eq '.doc' ;
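
As a rough check that the two tests agree, here is a sketch that runs both over a few invented file names. The sub names by_substr and by_regex are mine, not from the original question:

```perl
use strict;
use warnings;

# Fixed prefix and suffix compared with substr.
sub by_substr {
    my ($file) = @_;
    # A negative offset counts from the end of the string.
    return (substr($file, 0, 6) eq 'AB DAT' && substr($file, -4) eq '.doc') ? 1 : 0;
}

# The same check written as the /x regex from above.
sub by_regex {
    my ($file) = @_;
    return ($file =~ m/^ AB \s DAT .* \.doc $/x) ? 1 : 0;
}

# Hypothetical file names, invented for this illustration.
for my $file ('AB DAT report.doc', 'AB DAT.doc', 'AB DATA.docx', 'my AB DAT.doc') {
    printf "%-20s substr=%d regex=%d\n", $file, by_substr($file), by_regex($file);
}
```

The two only differ on edge cases: \s would also accept a tab or newline between AB and DAT, and $ tolerates a trailing newline, while the substr comparison is strictly literal.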

1 Whenever you have a problem, describe what went wrong. In this case it was clear, but in future, describe the unwanted behaviour.

--
http://fruiture.de

Re: Re: Reg Expression on file name
by sauoq (Abbot) on Jan 06, 2003 at 19:53 UTC
    so it's more effective to say:
    next unless substr($file,0,6) eq 'AB DAT' and substr($file,-4) eq '.doc' ;

    A. That's not more effective. It's just as effective. It may be ever so slightly more efficient but the micro-optimization probably isn't really all that important.

    B. It's verbose and harder to read at a glance than a simple regular expression is. This is probably more important than whatever small optimization you might get by using substr.
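
    For anyone who wants to measure rather than guess, a minimal sketch with the core Benchmark module could look like this. The sample file name is invented and the numbers will vary by machine and perl version, so I'm not claiming any particular result:

    ```perl
    use strict;
    use warnings;
    use Benchmark qw(cmpthese);

    # Hypothetical sample name, invented for this illustration.
    my $file = 'AB DAT quarterly report.doc';

    sub by_regex {
        return ($file =~ /^AB DAT.*\.doc$/) ? 1 : 0;
    }

    sub by_substr {
        return (substr($file, 0, 6) eq 'AB DAT' && substr($file, -4) eq '.doc') ? 1 : 0;
    }

    # A negative count asks Benchmark to run each sub for at least
    # that many CPU seconds, then print a comparison table.
    cmpthese(-1, { regex => \&by_regex, substr => \&by_substr });
    ```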

    If the filename were in $_ I'd suggest next unless /^AB DAT/ and /\.doc$/; which would be a good compromise.

    I just noticed your first suggestion:

    next unless $file =~ m/^ AB \s DAT .* \.doc $/x;
    I think this case in particular would really be a good time to use a literal space instead of \s and avoid /x. In general, I think /x makes short regular expressions harder to read. Do you often use /x on short expressions and if so, what's your reasoning?
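
    A short sketch of why \s was needed there in the first place (the file name is made up): under /x, unescaped whitespace in the pattern is ignored, so a plain space no longer matches anything.

    ```perl
    use strict;
    use warnings;

    my $file = 'AB DAT report.doc';   # hypothetical name for illustration

    # Without /x, a space in the pattern matches a literal space.
    my $plain = $file =~ /^AB DAT.*\.doc$/ ? 1 : 0;

    # With /x, the space between AB and DAT has to be written as \s
    # (or as an escaped space, or as [ ]).
    my $x_with_s = $file =~ m/^ AB \s DAT .* \.doc $/x ? 1 : 0;

    # Here the pattern is really /^ABDAT.*\.doc$/, which does not match.
    my $x_bare = $file =~ m/^ AB DAT .* \.doc $/x ? 1 : 0;

    print "plain=$plain x_with_s=$x_with_s x_bare=$x_bare\n";
    # prints: plain=1 x_with_s=1 x_bare=0
    ```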

    -sauoq
    "My two cents aren't worth a dime.";
    

      True, a single expression is more readable, and the best compromise is probably the one with two expressions. To get something both efficient and readable, a prepost($name,$prefix,$postfix) function using substr inside would be the solution if this problem occurs more than once. This is hypothetical, though: it's perhaps only 30 lines of code where neither speed nor readability matters :)
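
      A minimal sketch of what such a prepost() could look like. Only the name and the argument order come from the sentence above; the body and the length check are assumptions about how one might write it:

      ```perl
      use strict;
      use warnings;

      # Hypothetical helper: returns 1 if $name starts with $prefix and
      # ends with $postfix, 0 otherwise.
      sub prepost {
          my ($name, $prefix, $postfix) = @_;

          # A name shorter than prefix plus postfix cannot contain both
          # without overlap (the regex /^prefix.*postfix$/ behaves the same).
          return 0 if length($name) < length($prefix) + length($postfix);

          return 0 unless substr($name, 0, length($prefix)) eq $prefix;

          # Compute the offset from the front so an empty postfix works too.
          return 0 unless substr($name, length($name) - length($postfix)) eq $postfix;

          return 1;
      }

      # Usage, with invented file names:
      for my $file ('AB DAT report.doc', 'AB DAT.txt') {
          next unless prepost($file, 'AB DAT', '.doc');
          print "$file\n";
      }
      ```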

      About /x: I'm starting to use it nearly everywhere, and I don't think it makes anything harder to read. Perl 6 regular expressions (which will then be somewhat comparable to what Parse::RecDescent offers now) are /x'ed by default, so getting used to this is probably helpful. IMHO it makes any expression more readable, at least once you're used to it. You see this differently, probably because you're used to the compact expressions that are common in Perl 5. Try forcing yourself to lay everything out visually via /x and you'll change your mind in a week.

      --
      http://fruiture.de
Re: Re: Reg Expression on file name
by Anonymous Monk on Jan 06, 2003 at 19:49 UTC
    Thanks for all responses!