I said nearly sorted. Sorted versus unsorted is not binary. This is not a randomised file, it's a logfile: all entries created by one specific transaction are sorted among themselves.
That means for us that the number of not-yet-completed series is relatively low.
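To make that concrete, here is a minimal sketch in Python of keeping only the open series in a hash and dropping each one as soon as it completes. The log format is an assumption for illustration only (each line is `<txid> <payload>`, and a payload of `END` closes the transaction), and the name `group_transactions` is hypothetical; none of this comes from the original post.

```python
def group_transactions(lines):
    """Group log lines by transaction id, holding only open transactions.

    Assumed (hypothetical) format: "<txid> <payload>", where the payload
    "END" marks the last entry of a transaction.
    Returns (completed, peak_open, still_open).
    """
    open_tx = {}      # txid -> entries seen so far, only for unfinished transactions
    completed = []    # (txid, entries) in order of completion
    peak_open = 0     # largest number of transactions held at the same time

    for line in lines:
        txid, _, payload = line.partition(" ")
        open_tx.setdefault(txid, []).append(payload)
        peak_open = max(peak_open, len(open_tx))
        if payload == "END":
            # The transaction is complete, so its memory can be released now.
            completed.append((txid, open_tx.pop(txid)))

    return completed, peak_open, open_tx


if __name__ == "__main__":
    log = ["a START", "b START", "a step", "a END", "b END"]
    done, peak, leftover = group_transactions(log)
    print(done, peak, leftover)
```

Memory use follows `peak_open`, not the file size. On a nearly sorted log it stays small; if every transaction starts before any of them ends, it grows to the total number of transactions.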
| [reply] |
Yes, it's probably overkill for most situations, which is why I started off saying that a hash is best for a limited number of elements. As specified in the original post, however, the matching lines could have "any" number of lines in between them. I didn't see anything there ruling out events that span hundreds of thousands of lines, so I didn't assume that an event would complete in a timely manner. In the worst case this means you must remember an arbitrarily large number of past events until almost the entire file has been read (sorted, nearly sorted, or not).
| [reply] |