in reply to matching datetimestamps and concatenating data where timestamps match from multiple large datafiles
Keeping all this data in memory will not work unless you have a computer with a huge memory to avoid the repeatedly swapping in and out of your data and even then you still have to write a program to efficiently search through all those arrays.
I would do it as follows:
Alternatively but only if all the files have timestamps in strict time-sequential order, you can open a filehandle to each of the files and on a line-by-line basis, loop through FILEA, extract the timestamp, transform the timestamp into a standard format and then iterate through all other files checking their timestamps (after transforming those also into the same standard format) and output the record when the timestamps match until you hit a timestamp past the timestamp of the main loop. Then you do the same for the next FILE until all FILEs have passed the timestamp of the main file and you go to the next record in the main file.
CountZero
A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little or too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James
My blog: Imperial Deltronics
|
|---|