in reply to Re: maximum number of lines for negative lookahead assertion (?!)
in thread maximum number of lines for negative lookahead assertion (?!)

It simply matches a line starting with '>', followed by one or more lines that do not start with '>'. If the input had multiple lines starting with '>' (it has only one in this program), the body of the inner while would run once for each of them.

Keep in mind that when /m is used, '^' matches at the start of every line, not just at the start of the input.
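
To make the effect of /m concrete, here is a minimal sketch. The input string is made up for illustration and is not the original poster's data; it just counts how many times '^>' matches with and without the modifier.

```perl
use strict;
use warnings;

# Hypothetical input: two lines start with '>', two do not.
my $text = ">a\nx\n>b\ny\n";

# Without /m, ^ anchors only at the very start of the string,
# so only the first '>' can match.
my $without_m = () = $text =~ /^>/g;

# With /m, ^ also anchors right after each embedded newline,
# so both '>' lines match.
my $with_m = () = $text =~ /^>/mg;

print "without /m: $without_m\n";   # prints "without /m: 1"
print "with /m: $with_m\n";         # prints "with /m: 2"
```

The `() = ...` idiom forces list context so that the assignment yields the number of matches.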

It sounds like someone is trying to parse FASTA files.


Re^3: maximum number of lines for negative lookahead assertion (?!)
by johnnywang (Priest) on May 04, 2005 at 01:48 UTC
    What does the variable $line catch? There are two sets of parentheses besides the non-capturing one. From the output it looks like the number of lines; is that what the /g does? Then why is the while needed?

      Keep in mind this is a simplification of his actual code, with extra code added for debugging. He followed proper PM etiquette by posting the minimum code that causes the problem, whether or not that code makes sense on its own.

      For example, the /g is useless in this example, since $line will always be matched in its entirety. In the real program, he would have something different in the body of the inner while.

      The captures are also nonsense. The regexp should read
      /^>(.*)\n((?:^(?!>).*\n)+)/gm
      instead of
      /^(>.*)\n(^(?!>).*\n)+/gm
      if he wishes to capture anything.
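
A minimal sketch of the difference between the two patterns, assuming a made-up FASTA-like input (the data and the variable names below are hypothetical, not taken from the original program). With the corrected pattern, $1 holds the header without the '>' and $2 holds all the body lines. With the original pattern, the quantified group keeps only the text from its last repetition.

```perl
use strict;
use warnings;

# Hypothetical FASTA-like input, made up for illustration.
my $data = ">seq1 first\nACGT\nTTGA\n>seq2 second\nGGCC\n";

# Corrected pattern: the outer group captures the whole run of body lines.
my @records;
while ($data =~ /^>(.*)\n((?:^(?!>).*\n)+)/gm) {
    my ($header, $body) = ($1, $2);
    $body =~ tr/\n//d;               # join the body lines into one string
    push @records, [$header, $body];
}
printf "%s => %s\n", @$_ for @records;

# Original pattern: the second group is repeated by '+', so after the
# match it contains only the last body line of the record.
my ($old_header, $old_body) = $data =~ /^(>.*)\n(^(?!>).*\n)+/m;
print "original pattern: header=[$old_header] body=[$old_body]";
```

For this input the loop collects two records, "seq1 first" with body "ACGTTTGA" and "seq2 second" with body "GGCC", while the original pattern leaves only "TTGA\n" in its second capture for the first record.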

      $line is not catching anything; it is the string the match is being done on. The captures are not being used in this case; one would get the same results if the capture parentheses were replaced with (?:...)s.

      the lowliest monk

        You're right. I should have read it more carefully. I did stare at it for a long time, but somehow in my mind $line was something else. Oh well.