This seems like a basic map-reduce paradigm. It would be nice to have a method that would "just work(tm)" for presorted data like this.
Something like an iterator over the group of file handles that stores a one-line buffer for each handle. Hadoop::Streaming::Reducer::Input does something like this for a single filehandle, but it's terribly clumsy code (sorry 'bout that).
Hadoop::Streaming::Reducer::Input source.
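To make the idea concrete, here is a rough sketch of that kind of iterator. It is written in Python rather than Perl purely for illustration, and it is not how Hadoop::Streaming::Reducer::Input works. The names `BufferedLines` and `merge_sorted` are made up for this example, and it assumes every input is already sorted in plain string order.

```python
import heapq
import io


class BufferedLines:
    """Wrap one file handle and keep a one-line lookahead buffer (hypothetical helper)."""

    def __init__(self, handle):
        self._handle = handle
        self._buffer = None
        self._fill()

    def _fill(self):
        # readline() returns "" only at end of file; a blank line is "\n".
        line = self._handle.readline()
        self._buffer = line.rstrip("\n") if line else None

    def peek(self):
        """Return the buffered line without consuming it, or None at EOF."""
        return self._buffer

    def next_line(self):
        """Return the buffered line and refill the buffer from the handle."""
        line = self._buffer
        self._fill()
        return line


def merge_sorted(handles):
    """Yield lines from all presorted handles in overall sorted order."""
    sources = [BufferedLines(h) for h in handles]
    # Heap entries are (buffered line, source index); the index breaks ties.
    heap = [(s.peek(), i) for i, s in enumerate(sources) if s.peek() is not None]
    heapq.heapify(heap)
    while heap:
        _, i = heapq.heappop(heap)
        yield sources[i].next_line()
        if sources[i].peek() is not None:
            heapq.heappush(heap, (sources[i].peek(), i))


if __name__ == "__main__":
    # In-memory stand-ins for two presorted files.
    first = io.StringIO("apple\ncherry\n")
    second = io.StringIO("banana\ndate\n")
    for line in merge_sorted([first, second]):
        print(line)
```

Only one line per handle is held in memory at a time, so this should scale to large files as long as the number of open handles stays reasonable. Python's standard library also ships `heapq.merge`, which does much the same thing for any sorted iterables.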
In reply to Re^2: Seeking in a file by spazm, in thread Seeking in a file by Ineffectual