In reply to: splitting files by number of words
If you need the words to remain in the same order, and the file is too big for memory, then you'll need two passes, à la moritz's suggestion.
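For the order-preserving case, a two-pass version could look something like the sketch below. This is an untested illustration, not moritz's actual code: the first pass counts the words, and the second pass writes them out in order, moving to the next output file after every ceil(total / NFILES) words. The `output.N` filenames and the `\W+` word definition mirror the one-pass script further down.

```perl
#!/usr/bin/perl
use strict;
use warnings;

use constant NFILES => 10;

my $infile = shift or die "usage: $0 infile\n";

# Pass 1: count the words (grep drops the empty leading
# field split produces when a line starts with a non-word char).
my $total = 0;
open my $in, '<', $infile or die "$infile: $!";
while ( <$in> ) {
    $total += grep { length } split /\W+/;
}
close $in;
die "no words in $infile\n" unless $total;

# Words per output file, rounded up (ceiling division).
my $chunk = int( ( $total + NFILES - 1 ) / NFILES );

# Pass 2: write the words in order, switching files every $chunk words.
open $in, '<', $infile or die "$infile: $!";
my( $seen, $idx, $out ) = ( 0, -1 );
while ( <$in> ) {
    for my $word ( grep { length } split /\W+/ ) {
        if ( $seen++ % $chunk == 0 ) {   # time for the next output file
            close $out if defined $out;
            ++$idx;
            open $out, '>', "output.$idx" or die "output.$idx: $!";
        }
        print {$out} "$word\n";
    }
}
close $out;
close $in;
```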
However, if it doesn't matter which file each word ends up in, then you can open 10 output files and process the input in a single pass, writing the words to the output files in round-robin order:
    #! perl -slw
    use strict;

    use constant NFILES => 10;

    # Open the NFILES output files.
    my @fhs;
    open $fhs[ $_ ], '>', 'output.' . $_ or die $! for 0 .. NFILES - 1;

    # Distribute the words round-robin across the output files.
    my $iFhs = 0;
    while( <> ) {
        for my $word ( split /\W+/ ) {
            print { $fhs[ $iFhs ] } $word;
            ++$iFhs;
            $iFhs %= NFILES;
        }
    }

    close $_ for @fhs;
You might need to change the regex to match your definition of words. (Note also that split /\W+/ produces an empty leading field when a line starts with a non-word character, so you may want to filter those out.)