use re 'debug';
my $sequence_to_parse = ">test\nATG\nGGG";
while ( $sequence_to_parse =~ /^>.*\n(^(?!>).*$)+/gm ) {
    print "$&\n";
}
__END__
Compiling REx "^>.*\n(^(?!>).*$)+"
Final program:
   1: MBOL (2)
   2: EXACT <>> (4)
   4: STAR (6)
   5:   REG_ANY (0)
   6: EXACT <\n> (8)
   8: CURLYX[0] {1,32767} (25)
  10:   OPEN1 (12)
  12:     MBOL (13)
  13:     UNLESSM[0] (19)
  15:       EXACT <>> (17)
  17:       SUCCEED (0)
  18:       TAIL (19)
  19:     STAR (21)
  20:       REG_ANY (0)
  21:     MEOL (22)
  22:   CLOSE1 (24)
  24: WHILEM[1/1] (0)
  25: NOTHING (26)
  26: END (0)
anchored ">" at 0 floating "%n" at 1..2147483647 (checking floating) anchored(MBOL) minlen 2
Guessing start of match in sv for REx "^>.*\n(^(?!>).*$)+" against ">test%nATG%nGGG"
Found floating substr "%n" at offset 5...
Found anchored substr ">" at offset 0...
Position at offset 0 does not contradict /^/m...
Guessed: match at offset 0
Matching REx "^>.*\n(^(?!>).*$)+" against ">test%nATG%nGGG"
   0 <> <>test%nATG>         |  1:MBOL(2)
   0 <> <>test%nATG>         |  2:EXACT <>>(4)
   1 <>>                     |  4:STAR(6)
                                  REG_ANY can match 4 times out of 2147483647...
   5 <>test> <%nATG%nGGG>    |  6:  EXACT <\n>(8)
   6                         |  8:  CURLYX[0] {1,32767}(25)
   6                         | 24:    WHILEM[1/1](0)
                                      whilem: matched 0 out of 1..32767
   6                         | 10:      OPEN1(12)
   6                         | 12:      MBOL(13)
   6                         | 13:      UNLESSM[0](19)
   6                         | 15:        EXACT <>>(17)
                                          failed...
   6                         | 19:      STAR(21)
                                        REG_ANY can match 3 times out of 2147483647...
   9 <%nGGG>                 | 21:        MEOL(22)
   9 <%nGGG>                 | 22:        CLOSE1(24)
   9 <%nGGG>                 | 24:        WHILEM[1/1](0)
                                          whilem: matched 1 out of 1..32767
   9 <%nGGG>                 | 10:          OPEN1(12)
   9 <%nGGG>                 | 12:          MBOL(13)
                                            failed...
                                          whilem: failed, trying continuation...
   9 <%nGGG>                 | 25:          NOTHING(26)
   9 <%nGGG>                 | 26:          END(0)
Match successful!
>test
ATG
Guessing start of match in sv for REx "^>.*\n(^(?!>).*$)+" against "%nGGG"
Did not find floating substr "%n"...
Match rejected by optimizer
Freeing REx: "^>.*\n(^(?!>).*$)+"

####

use YAPE::Regex::Explain;
print YAPE::Regex::Explain->new( qr/^>.*\n(^(?!>).*$)+/m )->explain;
__END__
The regular expression:

(?m-isx:^>.*\n(^(?!>).*$)+)

matches as follows:

NODE                     EXPLANATION
----------------------------------------------------------------------
(?m-isx:                 group, but do not capture (with ^ and $
                         matching start and end of line) (case-
                         sensitive) (with . not matching \n)
                         (matching whitespace and # normally):
----------------------------------------------------------------------
  ^                        the beginning of a "line"
----------------------------------------------------------------------
  >                        '>'
----------------------------------------------------------------------
  .*                       any character except \n (0 or more times
                           (matching the most amount possible))
----------------------------------------------------------------------
  \n                       '\n' (newline)
----------------------------------------------------------------------
  (                        group and capture to \1 (1 or more times
                           (matching the most amount possible)):
----------------------------------------------------------------------
    ^                        the beginning of a "line"
----------------------------------------------------------------------
    (?!                      look ahead to see if there is not:
----------------------------------------------------------------------
      >                        '>'
----------------------------------------------------------------------
    )                        end of look-ahead
----------------------------------------------------------------------
    .*                       any character except \n (0 or more times
                             (matching the most amount possible))
----------------------------------------------------------------------
    $                        before an optional \n, and the end of a
                             "line"
----------------------------------------------------------------------
  )+                       end of \1 (NOTE: because you are using a
                           quantifier on this capture, only the LAST
                           repetition of the captured pattern will be
                           stored in \1)
----------------------------------------------------------------------
)                        end of grouping
----------------------------------------------------------------------

####

my $sequence_to_parse = ">test\nATG\nGGG";
while ( $sequence_to_parse =~ m/(^>.+)|(^.+)/gm ) {
    if ( defined $1 ) {
        print "got first line \$1 ($1)\n";
    }
    elsif ( defined $2 ) {
        print "got other line \$2 ($2)\n";
    }
    else {
        print "UH OH \n";
    }
}
__END__
got first line $1 (>test)
got other line $2 (ATG)
got other line $2 (GGG)