in reply to Merge huge files (individually sorted) by order

Just missed you in the CB. File::MergeSort should do the trick for you.
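For a concrete starting point, here is a minimal sketch built on the module's documented interface; the file names and the key-extraction sub below are placeholders, not anything from your data:

  use strict;
  use warnings;
  use File::MergeSort;

  # Hypothetical inputs: each file must already be sorted on the merge key.
  my @files = ( 'huge_sorted_a.txt', 'huge_sorted_b.txt' );

  # Extract the merge key from a line; here, the first tab-separated field.
  my $extract_key = sub { ( split /\t/, $_[0] )[0] };

  my $merger = File::MergeSort->new( \@files, $extract_key );

  # Write the merged stream to a single output file in one call...
  $merger->dump('merged_output.txt');

  # ...or pull records one at a time instead:
  # while ( my $line = $merger->next_line() ) {
  #     # process $line here
  # }

Since the inputs are already sorted, the merge only needs to hold one pending line per file in memory, which is what makes it workable for huge files.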


Re^2: Merge huge files (individually sorted) by order
by tanger007 (Initiate) on Jul 19, 2013 at 00:16 UTC
    Works so well I feel even more stupid :) A follow-up question: if you have a big file (>10 GB) in which one column has, say, 100 unique values, how do you break it into 100 smaller files, one for each unique value of that column? Thanks so much.

      tanger007:

      Try something like putting a file handle for each column value in a hash, and then looking up the file handle on demand:

      my %OFH;
      my $OFH;
      while (<$IFH>) {
          my @fields = split /\t/, $_;

          # Look up the cached handle for this key value.
          $OFH = $OFH{$fields[$key_column]};
          if (! defined $OFH) {
              # We don't have this value yet, so open another file
              open $OFH, '>', 'key_value.' . $fields[$key_column];
              $OFH{$fields[$key_column]} = $OFH;
          }
          print $OFH join("\t", @fields);
      }

      Note: It's rough, untested and needs some error handling and such. But the basic concept should work fine for you.
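      To run it as given you'd still need an input handle and a key column. A fleshed-out version with the error handling filled in might look like this; the input file name and column index are invented for illustration:

      use strict;
      use warnings;

      my $key_column = 0;    # hypothetical: index of the column to split on

      open my $IFH, '<', 'big_input.txt'    # hypothetical input file
          or die "Cannot open big_input.txt: $!";

      my %OFH;
      while ( my $line = <$IFH> ) {
          chomp $line;
          my @fields = split /\t/, $line;
          my $key    = $fields[$key_column];

          # First time this value appears: open its output file and cache the handle.
          if ( !exists $OFH{$key} ) {
              open $OFH{$key}, '>', "key_value.$key"
                  or die "Cannot open key_value.$key: $!";
          }
          print { $OFH{$key} } $line, "\n";
      }

      close $_ for values %OFH;
      close $IFH;

      With only ~100 distinct values, keeping all the output handles open at once stays comfortably under typical open-file limits.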

      ...roboticus

      When your only tool is a hammer, all problems look like your thumb.