1: If you control the source file, make its records fixed-width (they need not even be newline-terminated). Finding a particular 'line' (record) is then a simple bit of math and a seek().
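A minimal sketch of the fixed-width idea, in Python rather than Perl. The record width of 16 bytes, the space padding, and the names write_records and read_record are all assumptions made up for illustration; nothing here comes from the original thread.

```python
RECORD_SIZE = 16  # assumed record width in bytes (hypothetical value)

def write_records(path, items, record_size=RECORD_SIZE):
    """Write each item as one space-padded record, with no newlines."""
    with open(path, "wb") as f:
        for item in items:
            # Truncate or pad so every record is exactly record_size bytes.
            f.write(item.encode("ascii")[:record_size].ljust(record_size))

def read_record(path, n, record_size=RECORD_SIZE):
    """Return record n (0-based) as bytes, or None if it is past the end."""
    with open(path, "rb") as f:
        f.seek(n * record_size)      # the "simple bit of math"
        data = f.read(record_size)
    return data if len(data) == record_size else None
```

Reading record n costs one seek and one read no matter how large the file is, which is the whole appeal of fixed-width records.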
2: If the file must have variable-length lines, and assuming you do the mid-file extraction frequently and the file persists for a relatively long time, build an index:
Scan the file once for newlines.
Record the position of each newline in a second (index) file.
To look at a particular line:
  Open the index file.
  Scan to the entry for the line number you're looking for.
  seek() to that position in the file you want to look at.
If the requested line is past the end of the index:
  seek() to the last known newline in the source file.
  Index from there to the end, appending to your index file.
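The steps above could be sketched roughly like this, in Python rather than Perl. This is only one way to do it, under a few assumptions: the index is a text file with one decimal offset per line, each entry is the byte just after a newline (that is, where the next line starts) rather than the newline itself, and the function names extend_index, find_offset and get_line are invented for the example.

```python
import os

def extend_index(src_path, idx_path):
    """Index whatever part of src_path is not covered yet, appending to idx_path."""
    last = None
    if os.path.exists(idx_path):
        with open(idx_path) as idx:
            for entry in idx:            # find the last offset already recorded
                last = int(entry)
    with open(src_path, "rb") as src, open(idx_path, "a") as idx:
        if last is None:
            idx.write("0\n")             # line 0 always starts at byte 0
            last = 0
        src.seek(last)                   # resume at the last known line start
        while True:
            line = src.readline()
            if not line.endswith(b"\n"):
                break                    # end of file, or an unfinished last line
            idx.write(f"{src.tell()}\n") # the next line starts right after this newline

def find_offset(idx_path, n):
    """Scan the text index for entry n (0-based); return its offset or None."""
    if not os.path.exists(idx_path):
        return None
    with open(idx_path) as idx:
        for k, entry in enumerate(idx):
            if k == n:
                return int(entry)
    return None

def get_line(src_path, idx_path, n):
    """Return line n (0-based) as bytes, or None if the file has no such line."""
    offset = find_offset(idx_path, n)
    if offset is None:                   # request is past the end of the index
        extend_index(src_path, idx_path)
        offset = find_offset(idx_path, n)
        if offset is None:
            return None
    with open(src_path, "rb") as src:
        src.seek(offset)
        line = src.readline()
    return line or None
```

The first call pays for a full scan of the source file; later calls only scan the much smaller index, and the index is extended only when someone asks for a line it does not cover yet.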
Another optimization is to store the index in binary format (i.e. as fixed-size integers) so that you can find the index entry for line #27231 simply by seeking to (size_of_rec * 27231) in the index file.
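A rough sketch of the binary index, again in Python. The 8-byte little-endian entry layout, the choice to store the start offset of each line, and the names build_binary_index and get_line_binary are assumptions for illustration only.

```python
import struct

# Assumed index layout: one unsigned 64-bit little-endian offset per line.
ENTRY = struct.Struct("<Q")

def build_binary_index(src_path, idx_path):
    """Write the start offset of every line in src_path as a fixed-size entry."""
    with open(src_path, "rb") as src, open(idx_path, "wb") as idx:
        offset = 0
        for line in src:
            idx.write(ENTRY.pack(offset))
            offset += len(line)          # the next line starts after this one

def get_line_binary(src_path, idx_path, n):
    """Return line n (0-based) as bytes, or None if n is not in the index."""
    with open(idx_path, "rb") as idx:
        idx.seek(n * ENTRY.size)         # size_of_rec * line number
        raw = idx.read(ENTRY.size)
    if len(raw) < ENTRY.size:
        return None
    (offset,) = ENTRY.unpack(raw)
    with open(src_path, "rb") as src:
        src.seek(offset)
        return src.readline()
```

With fixed-size entries the index itself becomes a fixed-width file, so the trick from point 1 applies to it: no scanning of the index at all, just two seeks and two reads per lookup.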
The algorithm in 2 can be implemented in Perl, but it would also be simple to do in C or C++, which would make it run pretty fast too.
Steve-
In reply to Re: part two
by smoo
in thread Simulating UNIX's "tail" in core Perl
by gryphon