in reply to maximum number of lines for negative lookahead assertion (?!)

Actually, can someone (OP?) comment on the regex? I'm a little shaky on how it matches and on what the inner "while" does. Thanks.

Re^2: maximum number of lines for negative lookahead assertion (?!)
by ikegami (Patriarch) on May 03, 2005 at 22:14 UTC

    It simply matches a line starting with '>', followed by one or more lines that do not start with '>'. If the input had several lines starting with '>' (it has only one in this program), the body of the inner while would run once for each of them.

    Keep in mind '^' means start of line, not start of input, when /m is used.

    It sounds like someone is trying to parse FASTA files.
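
    As a rough illustration of the matching described above, here is a minimal sketch. The FASTA-like data in $line is made up for the example, and @records is a name introduced only here; the original program is not shown in this thread.

```perl
use strict;
use warnings;

# Hypothetical FASTA-like input; the record names and sequences are made up.
my $line = ">seq1\nACGT\nTTGA\n>seq2\nGGCC\n";

my @records;

# /m lets ^ match at the start of every line, not just the start of the string.
# /g in scalar context resumes each match where the previous one ended,
# so the loop body runs once per '>' record.
while ($line =~ /^(>.*)\n(^(?!>).*\n)+/gm) {
    push @records, $&;    # $& is the whole text matched on this pass
}

print scalar(@records), " record(s) matched\n";
print "---\n$_" for @records;
```

    With two '>' lines in the input, the loop body runs twice, once per record.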

      What does the variable $line catch? There are two sets of parentheses besides the non-capturing one. From the output it looks like the number of lines; is that what the /g does? Then why is the while needed?

        Keep in mind this is a simplification of his actual code, with extra code added for debugging. He followed proper PM etiquette by posting the minimum code that reproduces the problem, whether or not that code makes sense on its own.

        For instance, the /g is useless in this example, since $line will always be matched in its entirety. In the real program, he would have something different in the body of the inner while.

        The captures are also nonsense. The regexp should read
        /^>(.*)\n((?:^(?!>).*\n)+)/gm
        instead of
        /^(>.*)\n(^(?!>).*\n)+/gm
        if he wishes to capture anything.
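
        To see why the grouping matters, here is a small sketch comparing the two patterns on a made-up one-record input. The variable names and the data are assumptions for illustration only. A capture group that sits under a + keeps only the text from its last repetition.

```perl
use strict;
use warnings;

# Hypothetical one-record input; the header and sequence lines are made up.
my $line = ">seq1 demo\nACGT\nTTGA\n";

# Original grouping: the repeated group is overwritten on every repetition,
# so the second capture ends up holding only the last body line.
my ($head_a, $body_a) = $line =~ /^(>.*)\n(^(?!>).*\n)+/m;

# Suggested grouping: the repetition sits inside the capture,
# so the second capture holds every body line. The '>' is also
# moved outside the first capture.
my ($head_b, $body_b) = $line =~ /^>(.*)\n((?:^(?!>).*\n)+)/m;

print "original:  head=[$head_a] body=[$body_a]\n";
print "suggested: head=[$head_b] body=[$body_b]\n";
```

        The original pattern leaves only "TTGA\n" in the second capture, while the suggested one yields "ACGT\nTTGA\n" and a header without the leading '>'.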

        $line is not catching anything; it is the string the match is being done on. The captures are not being used in this case; one would get the same results if the capture parentheses were replaced with (?:...)s.
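
        A quick way to check that claim, sketched on made-up data (the input and the array names are hypothetical): collect the overall matches once with capturing groups and once with (?:...) groups, then compare.

```perl
use strict;
use warnings;

# Hypothetical input with two records; the data is made up.
my $line = ">a\nAAA\n>b\nCCC\nGGG\n";

my (@with_captures, @without_captures);

# Same pattern twice: first with capturing groups, then with (?:...) groups.
# $& holds the overall match and does not depend on the captures.
push @with_captures,    $& while $line =~ /^(>.*)\n(^(?!>).*\n)+/gm;
push @without_captures, $& while $line =~ /^(?:>.*)\n(?:^(?!>).*\n)+/gm;

print "same matches\n"
    if join("|", @with_captures) eq join("|", @without_captures);
```

        Both loops visit the same two records, so the captures make no difference unless $1 and $2 are actually read.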

        the lowliest monk