Is seeking backwards in a piped stream going to work (reliably)?
This seems like a basic map-reduce paradigm. It would be nice to have a method that would "just work(tm)" for pre-sorted data like this.
Something like an iterator over the group of filehandles that stores a one-line buffer for each. Hadoop::Streaming::Reducer::Input does something like this for a single filehandle, but it's terribly clumsy code (sorry 'bout that).
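
Here is a rough sketch of what I mean, not taken from Hadoop::Streaming. The names `peekable` and `merged_lines` are made up for illustration. Each filehandle gets a one-line lookahead buffer, so nothing ever has to seek backwards, and it should work on pipes as well as files. It assumes every input is already sorted in plain string (`lt`) order.

```perl
use strict;
use warnings;

# Wrap a filehandle so the next line can be inspected without consuming it.
# Only forward reads are used, never seek().
sub peekable {
    my ($fh) = @_;
    my $buf = <$fh>;    # one-line lookahead buffer (undef at EOF)
    return {
        peek => sub { $buf },
        next => sub { my $line = $buf; $buf = <$fh>; return $line },
    };
}

# Merge several pre-sorted filehandles into one sorted stream of lines.
# Returns a closure that yields the next line, or undef once all are exhausted.
sub merged_lines {
    my @src = map { peekable($_) } @_;
    return sub {
        my $best;
        for my $s (@src) {
            my $line = $s->{peek}->();
            next unless defined $line;    # this source is exhausted
            $best = $s if !defined $best || $line lt $best->{peek}->();
        }
        return defined $best ? $best->{next}->() : undef;
    };
}

# Usage with in-memory filehandles standing in for files or pipes.
my ($d1, $d2) = ("apple\ncherry\n", "banana\ndate\n");
open my $fh1, '<', \$d1 or die "open: $!";
open my $fh2, '<', \$d2 or die "open: $!";
my $next = merged_lines($fh1, $fh2);
while (defined(my $line = $next->())) {
    print $line;    # apple, banana, cherry, date (one per line)
}
```

The linear scan over the sources is fine for a handful of filehandles; with many inputs a heap keyed on the buffered line would probably be the better choice.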