in reply to Key-based diffs

Can you give us some sample source files to test the algorithm on? It may take some time to go through this otherwise.

Re^2: Key-based diffs
by haroldo (Acolyte) on Sep 20, 2004 at 06:26 UTC
    Sure. Consider:
    ./00000001/lines.txt
    1017,0,984,20030115,"18:07:54",20030301,191009,5975,3,0.00,0.00,27
    1018,0,985,20030115,"18:09:19",20030301,191010,5311,3,0.00,0.00,27
    1019,0,986,20030115,"18:10:49",20030301,191011,6370,3,0.00,0.00,27
    1020,0,987,20030115,"18:13:32",20030301,191012,1022,3,0.00,0.00,30
    1022,0,989,20030115,"18:18:28",20030301,191014,6618,3,0.00,0.00,27
    1023,0,990,20030115,"18:19:59",20030301,191015,5081,3,0.00,0.00,27
    1024,0,991,20030115,"18:21:08",20030301,191016,5763,3,0.00,0.00,27
    1025,0,992,20030115,"18:21:39",20030301,191017,5650,3,0.00,0.00,30
    ./00000002/lines.txt
    1019,0,986,20030115,"18:10:49",20030301,191011,6370,3,0.00,0.00,27
    1020,0,987,20030115,"18:13:32",20030301,191012,1022,3,0.00,0.00,30
    1021,0,988,20030115,"18:18:09",20030301,191013,7191,3,0.00,0.00,30
    1022,0,989,20030115,"18:18:28",20030301,191014,6618,3,0.00,0.00,27
    1023,0,990,20030115,"18:19:59",20030301,191015,5081,3,0.00,0.00,27
    1024,0,991,20030115,"18:21:08",20030301,191016,5763,3,0.00,0.00,27
    1026,0,993,20030115,"18:21:47",20030301,191018,1125,3,0.00,0.00,27
    The invoking line would look like this:
    $ ./missing.pl lines 1 2
    (I've already noticed bad output from the first "missing.pl" script, gosh!)
      If it's supposed to output two files, what are the other two for? Also, you could make the code significantly simpler by loading both files at the start and removing the bad lines - do you expect to be handling huge numbers of records, or is this a possibility? I can't really get more specific without writing my own version of this script, and that's going to take a bit of time. Maybe after I get some sleep.
        My input is really large; it won't fit in memory. So I have to do the validation inline, as part of the output analysis. Every programmer has the right to sleep. Thank you.
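For input that won't fit in memory, one possible approach is a streaming merge over the two files, holding only one record from each side at a time. The following is a minimal sketch in Python (not the "missing.pl" script itself), and it rests on two assumptions taken from the sample data rather than from any stated spec: the key is the first comma-separated field, and both files are already sorted ascending by that key. The names `record_key` and `key_diff` are hypothetical.

```python
from typing import Iterable, Iterator, Tuple


def record_key(line: str) -> int:
    """Return the numeric key: the first comma-separated field (an assumption)."""
    return int(line.split(",", 1)[0])


def key_diff(old: Iterable[str], new: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield ("-", line) for keys only in old and ("+", line) for keys only in new.

    Both inputs must already be sorted ascending by key. Only one line
    from each input is held in memory at a time.
    """
    # Lazily strip newlines and skip blank lines; nothing is read ahead.
    old_it = (ln.rstrip("\n") for ln in old if ln.strip())
    new_it = (ln.rstrip("\n") for ln in new if ln.strip())
    a = next(old_it, None)
    b = next(new_it, None)
    while a is not None or b is not None:
        if b is None or (a is not None and record_key(a) < record_key(b)):
            # Key exists only on the old side.
            yield ("-", a)
            a = next(old_it, None)
        elif a is None or record_key(b) < record_key(a):
            # Key exists only on the new side.
            yield ("+", b)
            b = next(new_it, None)
        else:
            # Same key on both sides: nothing is missing, advance both.
            a = next(old_it, None)
            b = next(new_it, None)
```

Since `key_diff` accepts any iterable of lines, two open file handles can be passed in directly and the files are read line by line. Inline validation could probably go into the two generator expressions at the top, as an extra condition that drops bad lines before they reach the comparison.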