Merge text data by time-stamp

Quinnz has asked for the wisdom of the Perl Monks concerning the following question:

Hello, everyone,
I am a complete beginner with Perl, but I know Perl can easily
solve my problem. Basically, I'd like to merge two ASCII
text data files into one by time-stamp.

file1.txt (second column is time-stamp):
Feb-21,19:08:05.2,$GP,48.96,90.92,45.69
Feb-21,19:08:06.4,$GP,48.92,90.92,45.70
Feb-21,19:08:07.6,$GP,48.93,90.99,45.66
Feb-21,19:08:08.1,$GP,48.92,90.95,45.66
Feb-21,19:08:09.0,$GP,48.85,90.92,45.62
Feb-21,19:08:10.8,$GP,48.92,90.94,45.63

file2.txt (second column is the time-stamp, but it does not always
correspond to one in file1.txt):
Feb-21,19:08:05.2,$BM,3.89
Feb-21,19:21:20.5,$BM,5.05
Feb-21,19:08:07.6,$BM,6.20
Feb-21,19:21:20.8,$BM,7.36
Feb-21,19:08:09.0,$BM,8.52
Feb-21,19:08:10.8,$BM,9.68

Output.txt will look like this:
Feb-21,19:08:05.2,$GP,48.96,90.92,45.69,19:08:05.2,$BM,3.89
Feb-21,19:08:06.4,$GP,48.92,90.92,45.70,
Feb-21,19:08:07.6,$GP,48.93,90.99,45.66,19:08:07.6,$BM,6.20
Feb-21,19:08:08.1,$GP,48.92,90.95,45.66,
Feb-21,19:08:09.0,$GP,48.85,90.92,45.62,19:08:09.0,$BM,8.52
Feb-21,19:08:10.8,$GP,48.92,90.94,45.63,19:08:10.8,$BM,9.68

Both files are huge comma-delimited tables.
Thank you very much for your help.

Quinn

Re: Merge text data by time-stamp by bart (Canon) on Feb 24, 2010 at 22:46 UTC
So here's what I think you could best do: read both files line by line, and for each line:

Split it into date+timestamp on the one hand, and data on the other:

    my($datetime, $data) = /^(\w+-\d+,[\d:.]+),(.*)/;

Push the `$data` onto an arrayref that's the value in a global hash for that `$datetime`:

    push @{ $data{$datetime} }, $data;

Note that autovivification makes it unnecessary to create the arrayref before you do this.

OK, so you now have the data in the hash in a form directly usable for your output. One major problem is that hash keys are unordered, so you'll have to sort them. If all dates are the same, a simple alphanumerical sort will sort them by timestamp:

    foreach my $key (sort keys %data) { ...

You can simply join the values in the anonymous array with a comma:

    print $key, ',', join(',', @{ $data{$key} }), "\n";

And that should be close! The whole program:

    #! perl -w
    @ARGV = ('file1.txt', 'file2.txt');
    my %data;
    while(<>) {
        # $1 is "date,timestamp", $2 is the rest of the line
        my($datetime, $data) = /^(\w+-\d+,[\d:.]+),(.*)/;
        push @{ $data{$datetime} }, $data;
    }
    foreach my $key (sort keys %data) {
        print $key, ',', join(',', @{ $data{$key} }), "\n";
    }

With your sample data, this produces:

    Feb-21,19:08:05.2,$GP,48.96,90.92,45.69,$BM,3.89
    Feb-21,19:08:06.4,$GP,48.92,90.92,45.70
    Feb-21,19:08:07.6,$GP,48.93,90.99,45.66,$BM,6.20
    Feb-21,19:08:08.1,$GP,48.92,90.95,45.66
    Feb-21,19:08:09.0,$GP,48.85,90.92,45.62,$BM,8.52
    Feb-21,19:08:10.8,$GP,48.92,90.94,45.63,$BM,9.68
    Feb-21,19:21:20.5,$BM,5.05
    Feb-21,19:21:20.8,$BM,7.36
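
The output above is not quite the Output.txt asked for: rows that exist only in file2.txt are included, and the file2.txt time-stamp is not repeated after the file1.txt data. One possible way to get the requested layout is sketched below. It assumes that file2.txt fits in memory and that each date+timestamp appears at most once in it (a later duplicate would overwrite an earlier one). The helper names `index_by_timestamp` and `merge_line` are hypothetical, chosen for this sketch only; they are not part of the reply above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: index file2-style lines by "date,time".
# The stored value is "time,rest", which is what gets appended on output.
sub index_by_timestamp {
    my (@lines) = @_;
    my %index;
    for my $line (@lines) {
        my ($date, $time, $rest) = split /,/, $line, 3;
        next unless defined $rest;             # skip blank or malformed lines
        $index{"$date,$time"} = "$time,$rest"; # a duplicate key: last one wins
    }
    return \%index;
}

# Hypothetical helper: append the matching file2 data to one file1 line.
# Unmatched lines get only a trailing comma, as in the requested output.
sub merge_line {
    my ($line, $index) = @_;
    my ($date, $time) = split /,/, $line, 3;
    my $match = defined $time ? $index->{"$date,$time"} : undef;
    return $line . ',' . (defined $match ? $match : '');
}

# Runs only when two file names are given on the command line.
if (@ARGV == 2) {
    my ($left_file, $right_file) = @ARGV;

    open my $right, '<', $right_file or die "Cannot open $right_file: $!";
    chomp(my @right_lines = <$right>);
    close $right;
    my $index = index_by_timestamp(@right_lines);

    # Stream the first file so only the second one is held in memory.
    open my $left, '<', $left_file or die "Cannot open $left_file: $!";
    while (my $line = <$left>) {
        chomp $line;
        print merge_line($line, $index), "\n";
    }
    close $left;
}
```

It could be run as `perl merge.pl file1.txt file2.txt > Output.txt` (the script name is arbitrary). Because file1.txt is streamed in its original order, no sort is needed, and file2.txt rows without a partner in file1.txt are simply never printed.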
Re^2: Merge text data by time-stamp by Quinnz (Initiate) on Feb 24, 2010 at 23:02 UTC
Thank you very much for your time. It helps me a lot.