The program opens several files and is supposed to grab the text between header/footer tags (they're not HTML), including the header and footer themselves. I thought the range operator would be perfect, but that's when I got hit by the "integrity" factor: apparently it is not uncommon for a footer to be left off when it is the last section in a file. So much for elegance.
My question is, is there a clever/elegant solution to this problem (i.e., slight modification to the range statement) or will it require a more brute force approach? I'm not asking that anyone rewrite the entire program to make it work, I can do that myself. It seems to me that this must be a common problem and that the use of the range operator as I've done is too fragile for any but the most controlled circumstances.
file2.txt:

    FILE2
    AAA
    text1
    text2
    text3
    EOAAA
    BBB
    text4
    EOBBB
    CCC
    text5
    text6
    text7
    text8
    EOCCC

file1.txt:

    FILE1
    SEGMENT1
    text1
    text2
    text3
    EOS1
    SEGMENT2
    text4
    EOS2
    SEGMENT3
    text5
    text6
    text7
    text8

file3.txt:

    FILE3
    P201
    text1
    text2
    text3
    EOP201
    P333
    text4
    EOP333
    P588
    text5
    text6
    text7
    text8
    EOP588

Program:

    use strict;

    my @jobs = (
        'file2.txt|AAA|EOAAA',
        'file1.txt|SEGMENT3|EOS3',
        'file3.txt|P333|EOP333'
    );

    for (@jobs) {
        my ($file, $beg, $end) = split /\|/;
        my $first_line = 1;
        my @lines = ();
        print "Opening file: '$file'\n",
              " beg: '$beg' end: '$end'\n";
        open (INFILE, "<$file") or die "Could not open '$file': $!\n";
        while (<INFILE>) {
            if (/$beg/ .. /$end/) {
                chomp;
                print " first: '$_'\n\n" if $first_line;
                $first_line = 0;
                push (@lines, $_);
            }
        }
        print "$_\n" for @lines;
        print "-" x 30, "\n\n";
        undef @lines;
    }

Output:

    Opening file: 'file2.txt'
     beg: 'AAA' end: 'EOAAA'
     first: 'AAA'

    AAA
    text1
    text2
    text3
    EOAAA
    ------------------------------

    Opening file: 'file1.txt'
     beg: 'SEGMENT3' end: 'EOS3'
     first: 'SEGMENT3'

    SEGMENT3
    text5
    text6
    text7
    text8
    ------------------------------

    Opening file: 'file3.txt'
     beg: 'P333' end: 'EOP333'
     first: 'FILE3'

    FILE3
    P201
    text1
    text2
    text3
    EOP201
    P333
    text4
    EOP333
    ------------------------------

I tried what seemed like reasonable approaches, but none of them worked, such as the following and variations thereof:

    if ((/$beg/ .. /$end/) or (/$beg/ .. eof())) {

Thanks much,
--Jim
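
One possible direction, offered as a minimal sketch rather than a definitive fix: the range operator keeps its on/off state between loop iterations, so when a footer is missing it is still "on" when the next file is opened (which would explain the `first: 'FILE3'` line above). Folding an end-of-file test into the right-hand operand makes the range switch off on the last line of every file. Note that `eof()` with empty parentheses refers to the files named on the command line, not to the handle being read, so the sketch passes the handle explicitly. The helper name `extract_section` and the in-memory `%data` strings standing in for the real files are assumptions made for illustration only.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the original program): collects every
# line that falls inside a $beg .. $end range, treating end-of-file as an
# implicit footer when the real one is missing.
sub extract_section {
    my ($fh, $beg, $end) = @_;
    my @lines;
    while (my $line = <$fh>) {
        chomp $line;
        # eof($fh) is true once the last line has been read, so the
        # range is always switched off before the next file is read.
        if ($line =~ /$beg/ .. ($line =~ /$end/ || eof($fh))) {
            push @lines, $line;
        }
    }
    return @lines;
}

# In-memory stand-ins for file1.txt and file3.txt from the post.
my %data = (
    'file1.txt' => join('', map { "$_\n" } qw(
        FILE1 SEGMENT1 text1 text2 text3 EOS1
        SEGMENT2 text4 EOS2
        SEGMENT3 text5 text6 text7 text8)),
    'file3.txt' => join('', map { "$_\n" } qw(
        FILE3 P201 text1 text2 text3 EOP201
        P333 text4 EOP333
        P588 text5 text6 text7 text8 EOP588)),
);

for my $job ('file1.txt|SEGMENT3|EOS3', 'file3.txt|P333|EOP333') {
    my ($file, $beg, $end) = split /\|/, $job;
    open my $fh, '<', \$data{$file}
        or die "Could not open '$file': $!\n";
    my @lines = extract_section($fh, $beg, $end);
    close $fh;
    print "$file: @lines\n";
}
```

With the sample data this should leave the `file3.txt` result starting at `P333` instead of `FILE3`. It does not address other integrity problems, such as a missing header or a footer that belongs to a different section.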
In reply to Between-text range operator problem by jlongino