in reply to Split up file depending unique values 1st column
If there are fewer than a couple dozen distinct values in the first column, then you can just open that many output files and print each line to the matching file as you read through the input.
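A minimal sketch of that first approach, untested against real data, could look like the following. It keeps one open filehandle per distinct first-column value in a hash. The helper name `split_by_first_column`, the `%out` hash, and the choice to name each output file after the first-column value are illustrative assumptions, and the input is assumed to be comma-separated.

```perl
use strict;
use warnings;

# Hypothetical helper: copy each line read from $in into a file named after
# that line's first comma-separated column, keeping every output file open.
sub split_by_first_column {
    my ($in) = @_;
    my %out;    # first-column value => output filehandle

    while ( my $line = <$in> ) {
        if ( $line =~ /^(.+?),/ ) {
            my $name = $1;
            # Open the output file the first time this value shows up.
            unless ( $out{$name} ) {
                open( $out{$name}, '>', $name )
                    or die "can't output to $name: $!";
            }
            print { $out{$name} } $line;
        }
        else {
            warn "unusable data at line $.: $line";
        }
    }

    for my $name ( keys %out ) {
        close $out{$name} or die "can't close $name: $!";
    }
    my @names = sort keys %out;
    return @names;
}

# When given a file name on the command line, split that file.
if (@ARGV) {
    open( my $in, '<', $ARGV[0] ) or die "can't read $ARGV[0]: $!";
    split_by_first_column($in);
    close $in;
}
```

Every output file stays open until the input is exhausted, so this only suits a small number of distinct values; the per-process limit on open files is the constraint.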
If there are lots of different values (and so lots of output files), then it would be better to sort the input file first, so that lines with the same value in the first column are clumped together. Then you can open and close output files as you go through the input, and you only need one output file open at any given time.
Since the latter approach works equally well for all cases, that's the one I'd rather go with. It assumes that you have a decent utility to sort the input before feeding it to your Perl script (e.g. Unix/GNU "sort"):
(Not tested. Updated to remove the unnecessary $col variable, and to use ".+" instead of ".*" in the regex match that captures the first-column/file-name string, since an empty string is no use there.)

    use strict;
    use warnings;

    # Sort first so that lines sharing a first-column value are adjacent.
    open( INPUT, "sort $ARGV[0] |" ) or die "can't sort $ARGV[0]: $!";

    my $outname = "";
    while (<INPUT>) {
        if ( /^(.+?),/ ) {    # first column = text before the first comma
            my $newname = $1;
            if ( $newname ne $outname ) {
                # New value: close the previous output file, open the next.
                close OUT if ( $outname ne "" );
                open( OUT, ">", $newname ) or die "can't output to $newname: $!";
                $outname = $newname;
            }
            print OUT;
        }
        else {
            warn "Sorted input from $ARGV[0] had unusable data at line $.: $_";
        }
    }
    close OUT if ( $outname ne "" );
Re^2: Split up file depending unique values 1st column
by GertMT (Hermit) on Dec 17, 2006 at 13:07 UTC