I think this approach is OK for small data sets, but it becomes inefficient when the input runs to millions of lines: your algorithm requires the entire file to be read into memory (what if there are 200 million lines?)
To be more efficient, you should look for an algorithm with a smaller, predictable memory footprint; reading the entire data file into memory is not an ideal option.
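For example, processing the file one line at a time keeps memory usage roughly constant no matter how large the input grows. A minimal sketch in Perl, where the file name and the per-line counting are just placeholders for whatever work the real algorithm does:

    #!/usr/bin/perl
    use strict;
    use warnings;

    # 'data.txt' is a hypothetical input file; substitute the real source.
    open my $fh, '<', 'data.txt' or die "Cannot open data.txt: $!";

    my $count = 0;
    while (my $line = <$fh>) {        # one line in memory at a time
        chomp $line;
        $count++ if $line =~ /\S/;    # placeholder work: count non-blank lines
    }
    close $fh;

    print "Processed $count non-blank lines\n";

Memory use here is bounded by the longest single line, not by the total file size.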
I was coding to the original spec, which clearly stated that he had N
arrays, not a text file or a database or some other datastore from which to
draw the data in less memory-consumptive chunks! I only used reading in
from DATA to create the arrays and satisfy the precondition of the problem
statement.
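To illustrate what I mean by that precondition setup (the DATA contents and
the split pattern below are invented for the example; the actual data from
the original program isn't shown here):

    #!/usr/bin/perl
    use strict;
    use warnings;

    # Build the N arrays the problem statement assumes, using __DATA__
    # purely as a stand-in for "arrays already in memory".
    my @arrays;
    while (my $line = <DATA>) {
        chomp $line;
        push @arrays, [ split ' ', $line ];    # one array per input line
    }

    printf "Built %d arrays\n", scalar @arrays;

    __DATA__
    1 2 3
    4 5 6
    7 8 9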