In reply to: splitting files by number of words
If you need the words to remain in the same order, and the file is too big for memory, then you'll need two passes, à la moritz's suggestion.
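For the order-preserving case, a two-pass version could look something like the sketch below. This is an untested illustration, not moritz's actual code: the first pass counts the words, and the second pass writes them out in order, moving to the next output file after every ceil(total / NFILES) words. The `output.N` filenames and the `\W+` word definition mirror the one-pass script further down.

```perl
#!/usr/bin/perl
use strict;
use warnings;

use constant NFILES => 10;

my $infile = shift or die "usage: $0 infile\n";

# Pass 1: count the words (grep drops the empty leading
# field split produces when a line starts with a non-word char).
my $total = 0;
open my $in, '<', $infile or die "$infile: $!";
while ( <$in> ) {
    $total += grep { length } split /\W+/;
}
close $in;
die "no words in $infile\n" unless $total;

# Words per output file, rounded up (ceiling division).
my $chunk = int( ( $total + NFILES - 1 ) / NFILES );

# Pass 2: write the words in order, switching files every $chunk words.
open $in, '<', $infile or die "$infile: $!";
my( $seen, $idx, $out ) = ( 0, -1 );
while ( <$in> ) {
    for my $word ( grep { length } split /\W+/ ) {
        if ( $seen++ % $chunk == 0 ) {   # time for the next output file
            close $out if defined $out;
            ++$idx;
            open $out, '>', "output.$idx" or die "output.$idx: $!";
        }
        print {$out} "$word\n";
    }
}
close $out;
close $in;
```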
However, if it doesn't matter which file each word ends up in, then you can open 10 output files and process the input in a single pass, writing the words to the output files in round-robin order:
    #! perl -slw
    use strict;

    use constant NFILES => 10;

    # Open the NFILES output files.
    my @fhs;
    open $fhs[ $_ ], '>', 'output.' . $_ or die $! for 0 .. NFILES - 1;

    # Distribute the words round-robin across the output files.
    my $iFhs = 0;
    while( <> ) {
        for my $word ( split /\W+/ ) {
            print { $fhs[ $iFhs ] } $word;
            ++$iFhs;
            $iFhs %= NFILES;
        }
    }

    close $_ for @fhs;
You might need to change the regex to match your definition of words. (Note also that split /\W+/ produces an empty leading field when a line starts with a non-word character, so you may want to filter those out.)