Re^2: More efficient way to exclude footers

Hello, babysFirstPerl.
There is another way: using regular expressions. If I am correct, this program goes over the input only once, so it shouldn't be slow:

use strict;
use warnings;

my $header_lines = $ARGV[0] // 0;
my $footer_lines = $ARGV[1] // 0;

my $whole_input;
# slurp whole file into one scalar variable
{local $/ ; $whole_input = <DATA>};
# (this can exhaust memory if the input is very large)

# define what a line is, as a regular expression:
# zero or more non-newline characters, followed by one newline
my $line_regex = qr/[^\n]*\n/;

# treat whole input as string and substitute lines with empty strings:
$whole_input =~ s/\A (?:$line_regex){$header_lines}   //x; 
                  # delete some lines from the beginning
$whole_input =~ s/   (?:$line_regex){$footer_lines} \z//x;
                  # delete some lines from the ending

print $whole_input; # now it is not whole, and you can parse

__DATA__
Header 1
Header 2
Text 1
Text 2
Text 3
Text 4
Text 5
Footer 1
Footer 2
Footer 3

But if the last line of the file does not end with a newline, the second regex does not match and nothing is deleted.
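To make that caveat concrete, here is a minimal sketch that wraps the two substitutions in a sub and runs them on a string with and without a final newline. The name `strip_lines` and the sample strings are hypothetical, chosen only for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: the same two substitutions as in the program above.
sub strip_lines {
    my ($text, $header_lines, $footer_lines) = @_;
    my $line_regex = qr/[^\n]*\n/;    # a line must end with a newline
    $text =~ s/\A(?:$line_regex){$header_lines}//;
    $text =~ s/(?:$line_regex){$footer_lines}\z//;
    return $text;
}

my $with_newline    = "H1\nT1\nT2\nF1\n";
my $without_newline = "H1\nT1\nT2\nF1";

# Final newline present: header and footer are both removed.
print strip_lines($with_newline, 1, 1);
print "---\n";
# Final newline missing: the footer regex cannot match, so F1 stays.
print strip_lines($without_newline, 1, 1), "\n";
```

The second call still removes the header, because the header substitution only needs the first line to end with a newline.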

Re^3: More efficient way to exclude footers by AnomalousMonk (Archbishop) on Aug 19, 2015 at 17:51 UTC
"But if the last line of the file does not end with a newline, the second regex does not match and nothing is deleted."
That can easily be fixed by changing the regex object definition `my $line_regex = qr/[^\n]*\n/;` to `my $line_regex = qr/[^\n]*\n?/;` (note that the final `\n` has a `?` quantifier added). (Tested.)
But you need to go one step further in the example: show extraction of each remaining line for further processing.
Update: And see also File::Slurp.
Give a man a fish: `<%-{-{-{-<`
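A small sketch of that fix, assuming the pattern keeps the `*` from the original program and only gains the `?` after `\n`. The sub name `strip_lines_optional_newline` is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: same substitutions, but the newline that ends
# a line is optional, so an unterminated last line still counts.
sub strip_lines_optional_newline {
    my ($text, $header_lines, $footer_lines) = @_;
    my $line_regex = qr/[^\n]*\n?/;
    $text =~ s/\A(?:$line_regex){$header_lines}//;
    $text =~ s/(?:$line_regex){$footer_lines}\z//;
    return $text;
}

# Both inputs lose one header line and one footer line,
# whether or not the input ends with a newline.
print strip_lines_optional_newline("H1\nT1\nT2\nF1",   1, 1);
print strip_lines_optional_newline("H1\nT1\nT2\nF1\n", 1, 1);
```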
Re^4: More efficient way to exclude footers by rsFalse (Chaplain) on Aug 19, 2015 at 18:09 UTC
`my $line_regex = qr/[^\n]*\n?/;`
Thanks :) . Hm. And if we have some nonsense input, N and M, with the file having fewer than N + M lines, then this regex deletes all lines, whereas the earlier regex (without the question mark) fails to match when asked to delete too many lines. In that case we can set a lower limit on the greedy quantifier (add "0,"): `s/\A (?:$line_regex){0,$header_lines} //x;`
>>But you need to go one step further in the example: show extraction of each remaining line for further processing.
`chomp $whole_input; parse( $_ ) for split /\n/, $whole_input;`
But now it takes time for split :/
Upd.: the lines after this split are without newlines. If it is important for parsing to have the newlines, the first split parameter could be `/^/m`.
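The two ideas in this reply can be sketched together: a bounded quantifier so that asking for too many lines still matches, and the two ways of splitting what remains. The names `strip_at_most`, `@bare` and `@kept` are hypothetical, and the quantifier is written without a space after the comma, on the assumption that the code should also run on Perl versions that do not accept blanks inside `{m,n}`.

```perl
use strict;
use warnings;

# Hypothetical helper: remove at most N header lines and M footer lines.
sub strip_at_most {
    my ($text, $header_lines, $footer_lines) = @_;
    my $line_regex = qr/[^\n]*\n/;
    $text =~ s/\A(?:$line_regex){0,$header_lines}//;
    $text =~ s/(?:$line_regex){0,$footer_lines}\z//;
    return $text;
}

my $body = strip_at_most("H1\nT1\nT2\nF1\n", 1, 1);

# Splitting on the newline itself drops the newlines ...
my @bare = split /\n/, $body;
# ... while splitting at each line start keeps them.
my @kept = split /^/m, $body;

print "bare: [$_]\n" for @bare;
print "kept: [$_]" for @kept;
print "\n";
```

With the upper bound only, a request for more lines than the input holds removes whatever lines are there instead of failing to match.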