in reply to Re^2: Joining separate data files to make one.
in thread Joining separate data files to make one.
There are two typos ( [ should be { ).
Sorry. It was typed directly into the edit box and so was never tested. I apologise for that. I wanted to describe a viable alternative approach to the problem--and I find describing with code far more efficient and clear than using words. I was aware that it wasn't a complete working solution as posted.
The only drawback with your script is that if one of the files ends before a later file, the "n/a" is not appended to the hash for the file before.
I would handle that in the output loop. If, when you come to write a record, it is "too short", pad it with the appropriate number of 'n/a's. Of course, as coded with concatenated strings, determining how much to add is a pain.
You could split on "\t" to get the field count, then pad and rejoin, but that would be a bit silly. Better to build up the records as (a hash of) arrays, pushing the fields as you go, and then just join them at the end, after padding if necessary.
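To make the contrast concrete, here is a minimal sketch of padding a single record both ways. The names `pad_string`, `pad_array` and `$TOTAL`, the field count of 5, and the sample values are all made up for illustration; they are not taken from the real data files.

```perl
use strict;
use warnings;

# Hypothetical total: 2 key fields (date, time) + 3 data fields.
my $TOTAL = 5;

# String record: split to count the fields, pad, then rejoin.
sub pad_string {
    my( $rec, $total ) = @_;
    my @f = split /\t/, $rec, -1;
    push @f, ('n/a') x ( $total - @f );
    return join "\t", @f;
}

# Array record: the field count is already known, so just push.
sub pad_array {
    my( $rec, $total ) = @_;
    push @$rec, ('n/a') x ( $total - @$rec );
    return $rec;
}

my $padded_str = pad_string( "2010-10-08\t11:44\t9.81", $TOTAL );
my $padded_arr = pad_array( [ '2010-10-08', '11:44', '9.81' ], $TOTAL );

print $padded_str, "\n";
print join( "\t", @$padded_arr ), "\n";
```

Both calls produce the same padded record. The array version skips the split/rejoin round trip, which is why keeping the records as arrays until output time is less bother.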
Something like:
    my %data;

    open FILE, '<', 'gravity' or die;
    while( <FILE> ) {
        my @fields = split ' ', $_;
        ## Key on "date time"; quoted so it matches the keys used below
        $data{ "@fields[ 0, 1 ]" } = \@fields;
    }
    close FILE;

    open FILE, '<', 'magnetics' or die;
    while( <FILE> ) {
        my @fields = split ' ', $_;
        ## Pad the hash if we didn't see this date/time in the gravity file
        $data{ "@fields[ 0, 1 ]" } //= [ @fields[ 0,1 ], ('n/a') x 3 ];
        push @{ $data{ "@fields[ 0, 1 ]" } }, @fields[ 2 .. $#fields ];
    }
    close FILE;

    open FILE, '<', 'bathymetry' or die;
    while( <FILE> ) {
        my @fields = split ' ', $_;
        ## Pad the hash if we've never seen it before
        ## (??? == No of fields added by the magnetics)
        $data{ "@fields[ 0, 1 ]" } //= [ @fields[ 0,1 ], ('n/a') x ( 3 + ??? ) ];
        ## We saw it in gravity, but not magnetics.
        push @{ $data{ "@fields[ 0, 1 ]" } }, ('n/a') x ???
            if @{ $data{ "@fields[ 0, 1 ]" } } < 3 + ???;
        push @{ $data{ "@fields[ 0, 1 ]" } }, @fields[ 2 .. $#fields ];
    }
    close FILE;

    for my $key ( sort keys %data ) {
        my $nFields = @{ $data{ $key } };
        ## Pad: ??? === total number of fields
        push @{ $data{ $key } }, ('n/a') x ( ??? - $nFields );
        ## split ' ' discarded the newline, so put one back
        print join( "\t", @{ $data{ $key } } ), "\n";
    }
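For anyone who wants something they can run straight away, here is a self-contained sketch of the same idea with the three files replaced by in-memory lines. Everything concrete in it is an assumption for illustration: the column counts (3, 1 and 2), the sample rows, and the names `@sources` and `$width`. It also assumes every line in a source carries that source's full set of data columns and that a date/time appears at most once per source. Rather than hard-coding the pad counts, it keeps a running total of the fields a complete record should have so far.

```perl
use strict;
use warnings;

# Each source: [ number of data columns, array ref of input lines ].
# The widths and the sample rows are made up for illustration.
my @sources = (
    [ 3, [ "d1 t1 g1 g2 g3", "d1 t2 g4 g5 g6" ] ],   # stands in for gravity
    [ 1, [ "d1 t1 m1",       "d1 t3 m3"       ] ],   # stands in for magnetics
    [ 2, [ "d1 t2 b1 b2",    "d1 t3 b3 b4"    ] ],   # stands in for bathymetry
);

my %data;
my $width = 2;    # fields expected so far: the date and time key columns

for my $src ( @sources ) {
    my( $cols, $lines ) = @$src;
    for my $line ( @$lines ) {
        my @fields = split ' ', $line;
        my $key    = "@fields[ 0, 1 ]";

        # First sighting of this date/time: start with the key columns.
        my $rec = $data{ $key } //= [ @fields[ 0, 1 ] ];

        # Pad for any earlier sources that lacked this date/time.
        push @$rec, ('n/a') x ( $width - @$rec );
        push @$rec, @fields[ 2 .. $#fields ];
    }
    $width += $cols;    # a complete record is now this many fields wide
}

my @out;
for my $key ( sort keys %data ) {
    my $rec = $data{ $key };
    # Pad records that were missing from the later sources.
    push @$rec, ('n/a') x ( $width - @$rec );
    push @out, join "\t", @$rec;
}
print "$_\n" for @out;
```

With the sample rows above, the "d1 t1" record gets two trailing 'n/a's (no bathymetry), "d1 t2" gets one in the middle (no magnetics), and "d1 t3" gets three after the key (no gravity). Swapping the in-memory lines for real file handles should be straightforward, but the column counts would have to come from the actual files.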