Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

The line:

$var =~ /n\\(.?)\\/

Designed to match a string like this:

"{some text} n\{some text}\{some more text}"

Failed to find any matches, however this did:

$var =~ /n\\([^\\])/

Why? What have I missed?

Re: Regexpresions
by japhy (Canon) on Aug 20, 2001 at 16:23 UTC
    I think your .? was meant to be .*? instead.

    _____________________________________________________
    Jeff[japhy]Pinyan: Perl, regex, and perl hacker.
    s++=END;++y(;-P)}y js++=;shajsj<++y(p-q)}?print:??;

Re: Regexpresions
by Monky Python (Scribe) on Aug 20, 2001 at 16:23 UTC
    $var =~ /n\\(.?)\\/
    matches "n\", optionally followed by any one character, followed by a backslash.

    $var =~ /n\\([^\\])/
    matches "n\" followed by exactly one character that is not a backslash,
    so it does not do the job either: for a string like "n\123\uu" it captures only "1".

    IMHO the following should work better:

    $var =~ /n\\([^\\]+)\\/

    MP
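
To see the three patterns discussed so far side by side, here is a minimal sketch. It assumes the sample string from the question holds literal backslashes (hence the single quotes); the variable names $opt, $one and $many are made up for illustration.

```perl
use strict;
use warnings;

# Assumption: the data contains literal backslashes. In single quotes,
# \{ stays a backslash followed by an opening brace.
my $var = '{some text} n\{some text}\{some more text}';

# Pattern from the question: at most one character between the backslashes.
my ($opt) = $var =~ /n\\(.?)\\/;

# Second pattern: exactly one non-backslash character after n\.
my ($one) = $var =~ /n\\([^\\])/;

# Suggested fix: one or more non-backslash characters, then a backslash.
my ($many) = $var =~ /n\\([^\\]+)\\/;

# A failed match in list context returns an empty list, so $opt stays undef.
printf "(.?)      -> %s\n", defined $opt ? "<$opt>" : 'no match';
printf "([^\\\\])   -> <%s>\n", $one;
printf "([^\\\\]+)  -> <%s>\n", $many;
```

On this string the first pattern does not match at all, the second captures only the opening brace, and the third captures {some text}.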

Re: Regexpresions
by count0 (Friar) on Aug 20, 2001 at 16:31 UTC
    I won't claim to be an expert on regexps, but here's my 2 cents. First, let's go through the ones you posted.
    $var =~ /n\\(.?)\\/
    will match "{anything}n\{one or zero characters}\{anything}", and $1 will contain the "{one or zero characters}".
    The dot matches one character, the ? modifies that to match zero or one times. Because it's enclosed in parens, it'll store it.

    The second regexp you showed will match "{anything}n\{one character other than \}", and $1 will contain that single character.

    Now, to get the match you're looking for, something like:
    $var =~ /n\\([^\\]+)\\/;
    which will match "{some text}n\{some more text}\{and even more}" and store "{some more text}" in $1.
    This regexp looks for 'n\' followed by one or more characters that aren't a '\', followed by a '\'.

    Hope this helps. =]
Re: Regexpresions
by ozone (Friar) on Aug 20, 2001 at 16:32 UTC

    the first regex is going to look for a single character between '\' and '\', which is not quite what you want... this is because of the '?' quantifier after the '.'. '?' means one or more.

    the second one should only produce a single character - the first char of 'some text'... the '[^\\]' has an implicit quantifier of match one char only or no match at all

    you will need something like this to capture the whole string (Note the '+'):
    $var =~ /n\\([^\\]+)\\/;

      You probably were aware of this and just mistyped, but...

      The ? when used as it is in .? means zero or one, as stated by another poster above. .? does not mean one or more, that would be .+.

      A quick list:

      ? -- optional, zero or one
      + -- required and repeatable, one or more
      * -- optional and repeatable, zero or more

      Any of these can be modified by placing a '?' after it, as in .+?. In that position the '?' no longer means "optional"; it changes the greediness of the quantifier. Without the '?' it will attempt to match as many as possible (greedy); with the '?' it will attempt to match as few as required (non-greedy).
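
The quantifier list above can be tried out with a short sketch. The sample string 'aaa' and the variable names are hypothetical, chosen only to make the difference between the quantifiers visible.

```perl
use strict;
use warnings;

my $text = 'aaa';    # hypothetical sample string

my ($optional) = $text =~ /(a?)/;     # zero or one
my ($plus)     = $text =~ /(a+)/;     # one or more, greedy
my ($star)     = $text =~ /(a*)/;     # zero or more, greedy
my ($lazy)     = $text =~ /(a+?)/;    # one or more, as few as possible

printf "a?  -> <%s>\n", $optional;    # <a>
printf "a+  -> <%s>\n", $plus;        # <aaa>
printf "a*  -> <%s>\n", $star;        # <aaa>
printf "a+? -> <%s>\n", $lazy;        # <a>
```

The greedy forms take all three characters, while a? and a+? stop after the first one.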

      There is a great book, Mastering Regular Expressions on this topic that is a very worthwhile read.

        D'Oh! Must be all these late nights... :-)

        Ah, thanks. The book I had suggested that .? would work the way .+? does. Plus, in the second regular expression I meant to type:

        /\\([^\\]+)/

        The thing I was puzzled about is perhaps better shown by the fact that this doesn't seem to work:

        /\\([^\\]+)\\/

        Maybe it was just that it was 4 in the morning at the time.
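
For what it is worth, a quick check suggests that the last pattern does match the sample string when the backslashes really are in the data. The sketch below is only that, a sketch: the thread never says how $var was built, so the double-quoted variant is one possible explanation and not a diagnosis. The names $single and $double are made up for illustration.

```perl
use strict;
use warnings;

# Assumption: literal backslashes in the data (single-quoted string).
my $single = '{some text} n\{some text}\{some more text}';
my ($captured) = $single =~ /\\([^\\]+)\\/;
printf "single-quoted: <%s>\n", $captured;

# Assumption: the same text typed in double quotes. There \{ collapses
# to a plain {, so the string holds no backslash for the pattern to find.
my $double = "{some text} n\{some text}\{some more text}";
my $matched = $double =~ /\\([^\\]+)\\/ ? 'match' : 'no match';
print "double-quoted: $matched\n";
```

If the string came from a file or from user input, the backslashes are kept as typed and quoting is not the issue.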

Re: Regexpresions
by Cine (Friar) on Aug 20, 2001 at 18:14 UTC
    $var =~ /n\\(.?)\\/
    This will match an 'n' followed by a '\' followed by 0 or 1 instance of any char followed by a '\'.

    Thus in the case you provide it will not match anything.
    There are two possibilities for what you want in your $1:
    either what is between two '\' or what is between the first and last '\'
    Which is either the regex $var =~ /n\\(.*?)\\/ or $var =~ /n\\(.*)\\/

    The difference may be subtle, but look at the strings:
    $var='{some text} n\{some text}\{some more text}\\'; and
    $var='{some text} n\{some text}\{some more text}'; #Your old version
    $1 in the first $var will become '{some text}' with the first regex and '{some text}\{some more text}' with the second.
    $1 in the second $var will in either case be '{some text}'
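
The comparison above can be reproduced with a minimal sketch. It assumes both strings are written in single quotes, where a trailing literal backslash has to be doubled; the names $with_trailing, $without_trailing and @results are made up for illustration.

```perl
use strict;
use warnings;

# In single quotes \{ stays as typed; the final \\ is one literal backslash.
my $with_trailing    = '{some text} n\{some text}\{some more text}\\';
my $without_trailing = '{some text} n\{some text}\{some more text}';

my @results;
for my $var ($with_trailing, $without_trailing) {
    my ($lazy)   = $var =~ /n\\(.*?)\\/;    # stops at the next backslash
    my ($greedy) = $var =~ /n\\(.*)\\/;     # runs to the last backslash
    push @results, [ $lazy, $greedy ];
    printf "lazy:   <%s>\n", $lazy;
    printf "greedy: <%s>\n", $greedy;
}
```

Only the string with the trailing backslash shows a difference between the two patterns, because only there is the last backslash not also the next one.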

    T I M T O W T D I