in reply to Re^2: Extract Multiple Lines from the End of a multi-line string
in thread Extract Multiple Lines from the End of a multi-line string

I tend to go for ultra portability.. and qr/(?:\r\n|\n\r|\n|\r)/s will likely handle any end-of-line scenario in the modern world. Because the two-byte sequences (like \r\n) are tested before the single-byte sequences (like \n), a \r\n pair is matched as one line ending rather than as two.
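To illustrate why the order of the alternatives matters, here is a minimal sketch. The second pattern ($eol_bad) and the sample text are made up for the comparison and do not come from the thread.

```perl
use strict;
use warnings;

# Two-byte sequences first: "\r\n" is consumed as a single line ending.
my $eol_good = qr/(?:\r\n|\n\r|\n|\r)/;

# Hypothetical counter-example: single bytes first, so the "\r\n"
# alternative is never reached and each CRLF counts as two line endings.
my $eol_bad = qr/(?:\n|\r|\r\n)/;

# Made-up sample text with Windows-style line endings.
my $text = "one\r\ntwo\r\nthree";

my @good = split $eol_good, $text;
my @bad  = split $eol_bad,  $text;

printf "two-byte first: %d fields\n", scalar @good;
printf "one-byte first: %d fields\n", scalar @bad;
```

With the single-byte alternatives first, split produces an empty field between the \r and the \n of every pair, which is the misbehaviour the ordering avoids.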

Re^4: Extract Multiple Lines from the End of a multi-line string
by ikegami (Patriarch) on Oct 19, 2008 at 17:01 UTC

    I tend to go for ultra portability..

    I've already said it doesn't help portability. Once the platform's IO layer has done its job, Perl sees the same line ending everywhere:
    On unix, the end of line is \n
    On Windows, the end of line is \n
    On old Macs, the end of line is \n
    On new Macs, the end of line is \n

    qr/(?:\r\n|\n\r|\n|\r)/s will likely handle any end-of-line scenario in the modern world.

    I didn't deny that. I said it should be centralized. Compare

    my $re_single = qr/
        (?:\A|[\r\n])    # either the start of text or newline
        (
            [^\r\n]*                        # any non-newlines
            (?:\z|(?:\r\n|\n\r|\r|\n)\z)    # last newline or eof
        )
    /xs;

    my $re_multiple = qr/
        (?:\A|[\r\n])    # either the start of text or newline
        (
            (?:
                [^\r\n]*
                (?:\r\n|\n\r|\r|\n)
            ){0,$nless}
        )
        # have to enclose above so it all goes into $1
        (
            [^\r\n]*
            (?:\z|(?:\r\n|\n\r|\r|\n)\z)    # last nl or eof
        )
    /xs;

    to

    my $re_single = qr/ ^ ( .* \n? \z ) /mx;

    my $re_multiple = qr/
        ^
        ( (?: .* \n ){0,$nless} )
        ( .* \n? \z )
    /mx;

    $String =~ s/\r\n|\n\r|\r|\n/\n/g;

    In short, it's not Last_N_Lines's job to decode IO.
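One way to picture that separation of duties is the sketch below. The helper names normalize_eol and last_line, and the sample data, are hypothetical and only serve the illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: convert any of the four line endings to "\n".
# The two-byte alternatives come first so "\r\n" becomes one "\n", not two.
sub normalize_eol {
    my ($text) = @_;
    $text =~ s/\r\n|\n\r|\r|\n/\n/g;
    return $text;
}

# Hypothetical consumer: once the text is normalized, a plain /m regex
# that only knows about "\n" is enough to grab the last line.
sub last_line {
    my ($text) = @_;
    return $text =~ / ^ ( .* \n? ) \z /mx ? $1 : "";
}

# Made-up input mixing CRLF, bare CR and LF.
my $raw   = "alpha\r\nbeta\rgamma\n";
my $clean = normalize_eol($raw);

print last_line($clean);    # prints "gamma\n"
```

The line-ending knowledge lives in one place, so every later regex can stay short.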


    By the way, there are more improvements you can make.

    • $re_single is just a special case of $re_multiple. You can use $re_multiple even when only one line is needed.
    • You don't need two captures. Combine them into one.
    • This is the perfect place for the ternary operator.

    Compare

    my $re_single = qr/
        (?:\A|[\r\n])    # either the start of text or newline
        (
            [^\r\n]*                        # any non-newlines
            (?:\z|(?:\r\n|\n\r|\r|\n)\z)    # last newline or eof
        )
    /xs;

    sub Last_N_Lines {
        my ( $str, $n ) = @_;
        my $nless = ( $n - 1 );
        my $re_multiple = qr/
            (?:\A|[\r\n])    # either the start of text or newline
            (
                (?:
                    [^\r\n]*
                    (?:\r\n|\n\r|\r|\n)
                ){0,$nless}
            )
            # have to enclose above so it all goes into $1
            (
                [^\r\n]*
                (?:\z|(?:\r\n|\n\r|\r|\n)\z)    # last nl or eof
            )
        /xs;

        if ( $n > 1 ) {
            if ( $str =~ m/$re_multiple/ ) {
                return( "$1$2" );
            }
            else {
                return( "" );
            }
        }

        if ( $str =~ m/$re_single/ ) {
            return( $1 );
        }
        else {
            return( "" );
        }
    }

    to

    sub Last_N_Lines {
        my ( $str, $n ) = @_;
        my $nless = ( $n - 1 );
        return $str =~ / ^ ( (?: .* \n ){0,$nless} .* \n? ) \z /mx ? $1 : "";
    }

    $String =~ s/\r\n|\n\r|\r|\n/\n/g;
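For anyone who wants to try the shorter version, here is a small runnable sketch. The sub is reproduced so the script stands on its own; the sample string is made up, and the sketch assumes $n is at least 1.

```perl
use strict;
use warnings;

# The simplified sub from the post, reproduced so this runs on its own.
sub Last_N_Lines {
    my ( $str, $n ) = @_;
    my $nless = ( $n - 1 );
    return $str =~ / ^ ( (?: .* \n ){0,$nless} .* \n? ) \z /mx ? $1 : "";
}

# Made-up sample input with mixed line endings.
my $String = "one\r\ntwo\rthree\nfour\n";

# Normalize first, at the edge of the program...
$String =~ s/\r\n|\n\r|\r|\n/\n/g;

# ...so that the sub only ever sees "\n".
print Last_N_Lines( $String, 2 );    # prints "three\nfour\n"
```

Asking for more lines than the string holds simply returns the whole string, since the {0,$nless} quantifier accepts fewer repetitions.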