in reply to Merging Many Files Side by Side

You could reduce memory consumption (and perhaps the slowdown due to memory paging) if you opened the files one by one. By opening all the files at once, you are also holding 200 I/O buffers, one per file, all at once.

  1. Create an array of arrays.
  2. Read the first file one line at a time, and place all of column 2's values in $aa[0].
  3. Read the second file one line at a time, and place all of column 2's values in $aa[1].
  4. And so on...
  5. Loop through the array of arrays and print each merged row to the output file (see the sketch below).
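
Here is a minimal sketch of that approach, assuming whitespace-separated columns; the input and output file names are hypothetical:

  use strict;
  use warnings;

  my @files = ('a.txt', 'b.txt', 'c.txt');   # hypothetical input names
  my @aa;                                    # one sub-array per input file

  for my $i (0 .. $#files) {
      open my $fh, '<', $files[$i] or die "Cannot open $files[$i]: $!";
      while (my $line = <$fh>) {
          # Grab the second whitespace-separated field.
          push @{ $aa[$i] }, (split ' ', $line)[1];
      }
      close $fh;
  }

  # One pass over the rows: print the columns side by side.
  open my $out, '>', 'merged.txt' or die "Cannot open merged.txt: $!";
  for my $row (0 .. $#{ $aa[0] }) {
      print {$out} join("\t", map { $_->[$row] } @aa), "\n";
  }
  close $out;

Only one file handle is open at a time, so you never pay for 200 buffers at once.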

With files that large, you might also want to consider tying the arrays holding the columns to random-access files. I once got a ten-fold speed-up (in a C++ program) just by using temp files instead of RAM to store data while processing. See Tie::File.
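
Tying an array to a disk file takes only a couple of lines; a minimal sketch (the scratch-file name is hypothetical):

  use strict;
  use warnings;
  use Tie::File;

  # Tie an array to a scratch file: elements live on disk instead of
  # RAM, but you use it like a normal array.
  tie my @col, 'Tie::File', 'col0.tmp' or die "Cannot tie col0.tmp: $!";

  push @col, '42.7';      # appends a line to col0.tmp
  print $col[0], "\n";    # reads it back from disk
  untie @col;             # flush and release the file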

If your columns are fixed width, you might be able to avoid two passes, one to fill the array and one to print it, by keeping a variable that stores the current line length. Then instead of an array of arrays, you could just:

  1. Read the first file
    1. Write the first file by placing each column 2 value on a separate line.
    2. Increase the stored line length by one column width.
  2. Read the next file
    1. Seek to the end of the first line and write the new column value there (seek-and-write overwrites in place, so the output records need to be padded out to their final width).
    2. Seek past the end of the column you just wrote.
    3. Seek to the end of the next line and write its column value.
    4. After all lines have been read, increase the stored line length by one column width.
  3. Repeat until all files are processed (a sketch follows this list).
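
A rough sketch of that seek-based merge, assuming whitespace-separated input columns, equal line counts in every file, and hypothetical file names and sizes. It pre-pads the output with space-filled records, since seek-and-write overwrites rather than inserts:

  use strict;
  use warnings;

  # Hypothetical parameters: adjust to your data.
  my @files     = ('a.txt', 'b.txt', 'c.txt');
  my $col_width = 10;                        # fixed width per column
  my $n_lines   = 1000;                      # lines in each input file
  my $rec_len   = $col_width * @files + 1;   # record length, +1 for "\n"

  # Pre-pad the output with fixed-width records of spaces so that
  # later writes land inside an existing record.
  open my $out, '+>', 'merged.txt' or die "Cannot open merged.txt: $!";
  print {$out} (' ' x ($rec_len - 1)), "\n" for 1 .. $n_lines;

  for my $i (0 .. $#files) {
      open my $in, '<', $files[$i] or die "Cannot open $files[$i]: $!";
      my $row = 0;
      while (my $line = <$in>) {
          my $col2 = (split ' ', $line)[1];  # must fit in $col_width
          # Jump to this record's slot for column $i and overwrite it.
          seek $out, $row * $rec_len + $i * $col_width, 0
              or die "Cannot seek: $!";
          printf {$out} '%-*s', $col_width, $col2;
          $row++;
      }
      close $in;
  }
  close $out;

This never holds more than one line in memory, at the cost of one seek per value.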

Best, beth