in reply to Extract Multiple Lines from the End of a multi-line string

How about
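sub Last_N_Lines {
    my ($String, $N) = @_;
    # N-1 complete lines, then a final line with or without a trailing newline.
    my $re = '^' . ( '.*\\n' x ($N-1) ) . '(?:.*\\n|.+)';
    # /m lets the leading ^ match at the start of any line, not only at the start of the string.
    return ( $String =~ /($re)\z/m )[0];
}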

or

sub Last_N_Lines {
    my ($String, $N) = @_;
    --$N;
    return ( $String =~ /^( (?:.*\n){$N} (?:.*\n|.+) )\z/mx )[0];
}

Update: Added alternative.
Update: Added leading anchor to reduce backtracking.
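
To make the behaviour concrete, here is a small sketch that calls the second version on a few sample strings. The sample strings and the variable names ($text, $last_two, $no_nl, $too_short) are made up for illustration.

```perl
use strict;
use warnings;

# Same approach as the second version: N-1 complete lines,
# then a final line that may or may not end in a newline.
sub Last_N_Lines {
    my ($String, $N) = @_;
    --$N;
    return ( $String =~ /^( (?:.*\n){$N} (?:.*\n|.+) )\z/mx )[0];
}

my $text = "one\ntwo\nthree\nfour\n";

my $last_two  = Last_N_Lines($text, 2);              # "three\nfour\n"
my $no_nl     = Last_N_Lines("one\ntwo\nthree", 2);  # "two\nthree"
my $too_short = Last_N_Lines("one\n", 3);            # undef: fewer than 3 lines

print $last_two;
```

Note that when the string has fewer than N lines the match fails, so the caller gets undef in scalar context rather than the whole string.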

Re^2: Extract Multiple Lines from the End of a multi-line string
by NateTut (Deacon) on Oct 16, 2008 at 21:14 UTC
    It works great but I'm not sure I understand completely.
    '^' . ( '.*\\n' x ($N-1) ) .
    This matches all of the lines before the last one.
    '(?:.+|.*\\n)'
    I'm a little fuzzier on this bit. ?: makes a non-capturing group, I think, and the .*\\n must be the last line, especially when used with the \z, but what is the .+| for? And what about the [0] on the end of the return?

      I defined a line as either "zero or more non-newline characters followed by a newline (/.*\n/)" or "one or more non-newline characters (/.+/)". That way,

      • "foo\nbar\n" is considered to have two lines ("foo\n" and "bar\n")
      • "foo\nbar" is considered to have two lines ("foo\n" and "bar")

      That's the same behaviour as <FH>.

      If I had used /.*\n?/ instead of /.*\n|.+/, "foo\nbar\n" would have been considered to have three lines ("foo\n", "bar\n" and ""). Always be wary of patterns that can match zero characters.
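
A quick way to see the difference is to collect every match of each pattern with /g in list context. This is only a sketch; the variable names ($string, @lines, @loose) are made up for the example.

```perl
use strict;
use warnings;

my $string = "foo\nbar\n";

# A line is "text plus newline" or "non-empty text with no newline".
my @lines = $string =~ /.*\n|.+/g;   # ("foo\n", "bar\n")

# This pattern can match zero characters, so it matches once more
# at the very end of the string, yielding an extra empty "line".
my @loose = $string =~ /.*\n?/g;     # ("foo\n", "bar\n", "")

printf "%d lines vs %d matches\n", scalar @lines, scalar @loose;
```

This should print "2 lines vs 3 matches".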

      Oops, missed the second question.

      and also the [0] on the end of the return?

      In list context, the match operator returns what it captured. I used a list slice to force list context.

      $x = 'abc' =~ /a(.)c/;          # 1 (or false on fail)
      @x = 'abc' =~ /a(.)c/;          # b (or () on fail)
      $x = ( 'abc' =~ /a(.)c/ )[0];   # b (or undef on fail)
      I could also have used
      $x = 'abc' =~ /a(.)c/ && $1; # b (or false on fail)
        Thanks for the explanation. I was trying to do something like (.*\n){5} and was getting nowhere.