in reply to Possible to have regexes act on file directly (not in memory)
You basically need two patterns, one matching the start tag and one matching the end tag (or more patterns if there can be several kinds of start and end tags). Then you implement a small state machine or mini-parser in your code: it looks for the start tag; once it has found one, it captures everything read from the file until it reaches the end tag, and then starts all over again if that is what you need. To handle the chunk boundaries, you just need a sliding window (as already discussed) that is as large as the longest possible start or end tag, so that a tag split across two chunks is still seen whole.
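To make the idea concrete, here is a minimal sketch of such a state machine, written in Python for illustration only. The function name `extract_between`, its parameters, and the `<b>`/`</b>` tags are all hypothetical. It assumes that no tag is longer than `max_tag_len` characters, that the patterns never match an empty string, and that a truncated tag never matches its pattern (true for literal tags). An unterminated block at end of file is simply dropped.

```python
import io
import re

def extract_between(stream, start_re, end_re, max_tag_len, chunk_size=65536):
    """Yield the text between each start/end tag pair, reading stream in chunks."""
    keep = max_tag_len - 1   # longest tail that could be the beginning of a split tag
    buf = ""
    captured = None          # None: looking for a start tag; list: capturing a block
    while True:
        chunk = stream.read(chunk_size)
        eof = not chunk
        buf += chunk
        while True:
            if captured is None:
                m = start_re.search(buf)
                if m is None:
                    # No start tag yet: forget everything except the window tail.
                    buf = buf[max(len(buf) - keep, 0):]
                    break
                buf = buf[m.end():]
                captured = []
            else:
                m = end_re.search(buf)
                if m is None:
                    # Store what is certainly content; hold back the window tail.
                    cut = max(len(buf) - keep, 0)
                    captured.append(buf[:cut])
                    buf = buf[cut:]
                    break
                captured.append(buf[:m.start()])
                yield "".join(captured)
                buf = buf[m.end():]
                captured = None
        if eof:
            return

# Hypothetical usage: a tiny chunk size forces tags to straddle chunk boundaries.
data = io.StringIO("xx<b>hello</b>yy<b>world</b>zz")
blocks = list(extract_between(data, re.compile(r"<b>"), re.compile(r"</b>"),
                              max_tag_len=4, chunk_size=4))
print(blocks)  # ['hello', 'world']
```

Only the window tail and the current block are held in memory; if the blocks themselves can be huge, the capturing branch could write each piece to an output file instead of collecting it in a list.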
Edit: @ Nocturnus: because I spent some time reading the various comments, your last message just above was not on the page I was reading when I wrote this message (in other words, I loaded the page before you posted this last message). Therefore, my message is not an answer to your very last message, but rather to the previous ones.
Replies are listed 'Best First'.
Re^2: Possible to have regexes act on file directly (not in memory)
by Nocturnus (Scribe) on May 05, 2014 at 06:28 UTC
by LanX (Saint) on May 05, 2014 at 14:28 UTC
by Nocturnus (Scribe) on May 07, 2014 at 07:39 UTC