in reply to Tabulating Data Across Multiple Large Files

Yes to your last question.

As for your task: I'm not sure what to tell you, given that you show us neither sample code nor data. Is file 0 the only file you consult to decide what data to look for? I.e., is it your "control"/config file? If so, it shouldn't need anything but a doubly nested while loop (in a naive approach anyway).

Getting fancier, you would probably load the data from the control file into a hash and use it as a reference, while linearly processing the rest of your files in a single loop.
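
To make the hash idea concrete, here is a rough sketch in Python, where a set plays the role of the hash. Everything specific in it is an assumption, since no data has been shown yet: the files are taken to be comma-separated, the lookup key is taken to be the first field of each row, and the names `load_control_keys` and `matching_rows` are made up for illustration.

```python
import csv

def load_control_keys(control_lines, key_fields=1):
    """Collect the leading field(s) of every control-file row as lookup keys.

    key_fields is an assumption: how many leading columns identify a row.
    """
    keys = set()
    for row in csv.reader(control_lines):
        if row:
            keys.add(tuple(row[:key_fields]))
    return keys

def matching_rows(control_keys, other_files, key_fields=1):
    """Walk the remaining files once, yielding rows whose key is in the control set."""
    for lines in other_files:
        for row in csv.reader(lines):
            if row and tuple(row[:key_fields]) in control_keys:
                yield row
```

Each element of `other_files` can be an open file handle, so no file other than the control file has to fit in memory at once.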

Makeshifts last the longest.

Re: Re: Tabulating Data Across Multiple Large Files
by reds (Novice) on Mar 22, 2003 at 21:56 UTC
    The data looks like this:
    1,Case,Iter,Fusion,Type,Tanks,AFVs,ADAs,IFVs,UAVS,Unknown,Total, Latency, Decoys,FalseNeg, FalsePos
    32,A2,1,UE_Battle_Bde,TRUTH,0,0,0,0,0,0,0,0,0
    32,A2,1,UE_Battle_Bde,PERCEIVED,0,0,0,0,0,0,0,0,0,0,0
    32,A2,1,UE_Battle_Bde,FREQUENCIES,0,0,0,0,0
    32,A2,1,UA1,TRUTH,0,0,0,0,0,0,0,0,0
    32,A2,1,UA1,PERCEIVED,0,0,0,0,0,0,0,0,0,0,0
    32,A2,1,UA1,FREQUENCIES,0,0,0,0,0
    35,A2,1,UE_Battle_Bde,TRUTH,0,0,0,0,0,0,0,0,0
    35,A2,1,UE_Battle_Bde,PERCEIVED,0,0,0,0,0,0,0,0,0,0,0
    35,A2,1,UE_Battle_Bde,FREQUENCIES,0,0,0,0,0
    

    Within each time step (32 and 35 are shown here), I am matching rows that have the same values in the Case, Fusion and Type columns. Then, across all rows that share the same Time, Case, Fusion and Type, I am averaging each of the remaining columns (all 0s here).
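
The grouping and averaging described above could be sketched roughly like this in Python. The column positions are assumptions read off the sample (Time in column 0, Case in 1, Fusion in 3, Type in 4, values from column 5 on), and `average_by_key` is a made-up name. Rows whose value columns are not numeric, such as the header, are skipped.

```python
import csv

KEY_COLS = (0, 1, 3, 4)   # Time, Case, Fusion, Type (positions assumed from the sample)
FIRST_VALUE_COL = 5       # everything from here on is treated as a number to average

def average_by_key(lines):
    """Return {(time, case, fusion, type): [column averages]} for the given rows."""
    sums, counts = {}, {}
    for row in csv.reader(lines):
        if len(row) <= FIRST_VALUE_COL:
            continue  # too short to be a data row
        try:
            values = [float(v) for v in row[FIRST_VALUE_COL:]]
        except ValueError:
            continue  # non-numeric values: header or malformed row
        key = tuple(row[i].strip() for i in KEY_COLS)
        if key not in sums:
            sums[key] = values
            counts[key] = 1
        else:
            # zip() stops at the shorter row if column counts ever differ
            sums[key] = [a + b for a, b in zip(sums[key], values)]
            counts[key] += 1
    return {k: [s / counts[k] for s in sums[k]] for k in sums}
```

To cover several files, the lines can be chained together first, for example with `itertools.chain(*open_files)`. Only running sums and counts are kept per key, not the rows themselves.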

      Are the files guaranteed to be sorted by time step? That would make things pretty easy. Also, my question about what role file 0 plays still stands.

      Makeshifts last the longest.

        As to the file 0 question: No. File 0 probably has most of what I'm looking to match in files 1, 2 ... N - but that's not guaranteed. For example, file 0 may not have a time step at 37, but file 34 will.

        As to the sort question: Yes. Each file is sorted by time step.
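
Given those two answers (every file sorted by time step, no file guaranteed to contain every step), one possible approach is to merge all files by time step and handle one step at a time. The Python sketch below is only an illustration under assumptions: the time step is taken to be an integer in column 0, data rows are recognized by a numeric value in column 5, and `data_rows` and `by_time_step` are made-up names.

```python
import csv
import heapq
from itertools import groupby

def data_rows(lines):
    """Yield only rows with an integer time step and a numeric first value column."""
    for row in csv.reader(lines):
        try:
            int(row[0])
            float(row[5])
        except (IndexError, ValueError):
            continue  # header, wrapped header remainder, or blank line
        yield row

def by_time_step(files):
    """Merge already-sorted files and yield (time_step, rows) one step at a time."""
    merged = heapq.merge(*(data_rows(f) for f in files), key=lambda r: int(r[0]))
    for step, rows in groupby(merged, key=lambda r: int(r[0])):
        yield step, list(rows)
```

Because the merge is lazy, only the rows of the current time step are held in memory, and a step such as 37 that is missing from file 0 still shows up as long as any one file has it.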