The more intuitive behavior, to me, would be this: when a filter is turned off, the lines it was hiding become visible again as the user scrolls to where they would appear. The only change to the GUI is that the formerly filtered lines now display, with the center line kept in the center. The way to do this is to keep two buffers. This is going to be complicated, so bear with me.
Let's say the GUI displays 21 lines - the line requested and 10 lines on either side. Your user has filters A and B turned on, filter C turned off, and asks to see line 115. So, you have to display lines 105-125 - counting only lines that pass the active filters. So, you pull in a line. $real_count increments. You try each filter in turn, even the ones that are turned off, and mark which filters would hide this line. You also store the tell() for this line. (This will be used later.) If no active filter would hide the line, you increment $display_count. Keep doing this until you reach display-able line 105. From there until display-able line 125, you keep doing the same thing, but you also add each actual line to the buffer of lines.
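That bookkeeping pass can be sketched in Perl along these lines. The filter names, predicates, and record layout here are illustrative assumptions, not from the original - the point is just the per-line record of tell(), hiding filters, and the two counters:

```perl
use strict;
use warnings;

# Hypothetical filter set: each filter has a name, a predicate, and an
# active flag. Even inactive filters are tested, so toggling is cheap later.
my @filters = (
    { name => 'A', active => 1, test => sub { $_[0] =~ /noise/ } },
    { name => 'B', active => 1, test => sub { length($_[0]) > 200 } },
    { name => 'C', active => 0, test => sub { $_[0] =~ /^#/ } },
);

sub scan_lines {
    my ($fh) = @_;
    my @line_info;
    my ($real_count, $display_count) = (0, 0);
    while (1) {
        my $pos  = tell($fh);          # offset where this line starts
        my $line = <$fh>;
        last unless defined $line;
        $real_count++;
        # which filters would hide this line, active or not
        my @hidden_by = map  { $_->{name} }
                        grep { $_->{test}->($line) } @filters;
        # visible iff no *active* filter matches
        my $visible = !grep { $_->{active} && $_->{test}->($line) } @filters;
        $display_count++ if $visible;
        push @line_info, {
            line      => $real_count,
            tell      => $pos,
            hidden_by => \@hidden_by,
            display   => $visible ? $display_count : undef,
        };
    }
    return \@line_info;
}
```

On a huge file you would, of course, stop the loop once $display_count passes the bottom of the window instead of scanning to EOF as this sketch does.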
Now, you're going to have two arrays in memory. The first is a list of per-line records - the actual line number, which filters would hide it, and the tell() for the beginning of the line - running from actual line #1 through display-able line #125. The second is a buffer of the raw lines from the file, from the actual line corresponding to display-able line #105 through display-able line #125.
To display, you go backwards through the first array, pulling out the last 21 actual line #'s that pass the active filters. You then index into the second array - subtracting the actual line number of the first buffered line from each of those line numbers - and get the raw data line. Put those in the display buffer.
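A sketch of that selection step, assuming the record layout from the scan above - each metadata entry carries the actual line number and the names of the filters that would hide it, and the line buffer's first element corresponds to actual line $buf_start (all names hypothetical):

```perl
use strict;
use warnings;

# Walk the metadata backwards, keep the last $want lines that no active
# filter hides, and map each actual line number to its slot in the buffer.
sub pick_window {
    my ($line_info, $line_buf, $buf_start, $want, $active) = @_;
    my @window;
    for my $info (reverse @$line_info) {
        next if grep { $active->{$_} } @{ $info->{hidden_by} };
        unshift @window, $line_buf->[ $info->{line} - $buf_start ];
        last if @window == $want;
    }
    return \@window;
}
```

Note that the set of active filters is just an argument here, which is what makes toggling filters cheap.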
Now, when a filter is turned on or off, you just repeat the "To display" part - you already have all the data you need.
If scrolling happens, you have to determine whether the user is scrolling upwards or downwards. Upwards is cheap: you already have the metadata for every earlier line, so you just seek to the stored tell() for the line above the top of the buffer and read the data in from the file. You don't need to determine whether the lines are buffered or not, because you already know. Downwards is the expensive direction: you have to build up the per-line information for each new line as you read it.
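The upward direction might look like this - seek to a stored tell() and read raw lines, with no filtering pass needed (a sketch; the record layout is assumed from the scan above):

```perl
use strict;
use warnings;

# Read $count raw lines starting at actual line $from_line, using the
# byte offsets recorded during the original scan.
sub read_lines_at {
    my ($fh, $line_info, $from_line, $count) = @_;
    seek($fh, $line_info->[ $from_line - 1 ]{tell}, 0)
        or die "seek failed: $!";
    my @lines;
    for (1 .. $count) {
        my $line = <$fh>;
        last unless defined $line;
        push @lines, $line;
    }
    return \@lines;
}
```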
Now, you can use some sort of tied-hash-to-DBM solution to keep yourself from running out of RAM, at a small-to-medium cost in performance. Also, if the files and filters are relatively static, you can pre-process this information, thereby incurring only the cost of the tied hash at display time.
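As a sketch of the tied-hash idea, using SDBM_File (which ships with Perl) - the key/value layout here is an assumption, one packed record per actual line number:

```perl
use strict;
use warnings;
use Fcntl;
use SDBM_File;

# Tie the per-line index to an on-disk DBM file instead of holding it in RAM.
tie my %line_index, 'SDBM_File', 'line_index', O_RDWR | O_CREAT, 0644
    or die "tie failed: $!";

# One record per actual line: the tell() offset, then the names of the
# filters that would hide the line (4321, 'A', 'C' are made-up values).
$line_index{115} = join ',', 4321, 'A', 'C';

# Reads go through the tie transparently, hitting disk instead of RAM.
my ($pos, @hidden_by) = split /,/, $line_index{115};
untie %line_index;
```

DB_File or a similar backend would work the same way, trading RAM for disk I/O as described above.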
As for adding filters at runtime . . . users are just going to have to know that adding a new filter is expensive - there's just no way around it. But this algorithm should provide a lot of performance boosts.
Being right does not endow the right to be rude; politeness costs nothing.
Being unknowing is not the same as being stupid.
Expressing a contrary opinion, whether to the individual or the group, is more often a sign of deeper thought than of cantankerous belligerence.
Do not mistake your goals as the only goals; your opinion as the only opinion; your confidence as correctness. Saying you know better is not the same as explaining you know better.
In reply to Re: Further on buffering huge text files
by dragonchild
in thread Further on buffering huge text files
by spurperl