
Cool! But you're comparing apples with oranges.

Besides which, your solution contains bugs.

Update: Dataset removed, as it seems that PM cannot handle it, even though it appeared fine in preview.

If you want the dataset, I can email it to you. With it, your code produces this incorrect output (note that the second-to-last range is invalid: it ends before it starts):

    C:\test>junk88 935755.dat
    [
      ["010005", "010022", 41],
      ["010023", "012359", 0],
      ["020000", "022359", 0],
      ["030000", "032359", 0],
      ["040000", "042359", 0],
      ["050000", "052359", 0],
      ["060000", "062359", 0],
      ["070000", "072359", 0],
      ["080000", "082359", 0],
      ["090000", "092359", 0],
      [100000, 102359, 0],
      [110000, 112359, 0],
      [120000, 122359, 0],
      [130000, 132359, 0],
      [140000, 142359, 0],
      [150000, 152359, 0],
      [160000, 162359, 0],
      [170000, 172359, 0],
      [180000, 182359, 0],
      [190000, 192359, 0],
      [200000, 202359, 0],
      [210000, 212359, 0],
      [220000, 222359, 0],
      [230000, 232359, 0],
      [240000, 242359, 0],
      [250000, 252359, 0],
      [260000, 262359, 0],
      [270000, 272359, 0],
      [280000, 282359, 0],
      [290000, 292359, 0],
      [300000, 302359, 0],
      [310000, 312356, 0],
      [312357, 312356, 52],
      [312357, 312359, 27],
    ]

Instead of this output:

    C:\test>935755 935755.dat
    [
      ["010005", "010022", 41],
      ["010023", "012359", 0],
      ["020000", "022359", 0],
      ["030000", "032359", 0],
      ["040000", "042359", 0],
      ["050000", "052359", 0],
      ["060000", "062359", 0],
      ["070000", "072359", 0],
      ["080000", "082359", 0],
      ["090000", "092359", 0],
      [100000, 102359, 0],
      [110000, 112359, 0],
      [120000, 121942, 0],
      [121943, 122359, 0],
      [130000, 132359, 0],
      [140000, 142359, 0],
      [150000, 152359, 0],
      [160000, 162359, 0],
      [170000, 172359, 0],
      [180000, 182359, 0],
      [190000, 192359, 0],
      [200000, 202359, 0],
      [210000, 211659, 0],
      [211700, 211741, 0],
      [211742, 212359, 0],
      [220000, 222359, 0],
      [230000, 232359, 0],
      [240000, 242359, 0],
      [250000, 252359, 0],
      [260000, 262359, 0],
      [270000, 272359, 0],
      [280000, 282355, 0],
      [282356, 282359, 0],
      [290000, 292359, 0],
      [300000, 300936, 0],
      [300937, 302359, 0],
      [310000, 311315, 0],
      [311316, 312042, 0],
      [312043, 312159, 0],
      [312200, 312356, 0],
      [312357, 312359, 27],
    ]

Re^7: Date Array Convolution
by choroba (Cardinal) on Nov 08, 2011 at 00:12 UTC
    Thanks for the dataset. I have updated my code. The only difference between our solutions now is that my solution merges neighbouring intervals, e.g.
    [010010, 010020, 2], [010021, 010030, 2] becomes [010010, 010030, 2]
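A minimal sketch of that kind of merge, for illustration only (this is not the code from either solution). `dhm2int` and `merge_adjacent` are hypothetical names, and the sketch assumes the intervals are already sorted by start, do not overlap, and use six-digit "ddhhmm" stamps:

```perl
use strict;
use warnings;

# Hypothetical helper: convert a "ddhhmm" stamp to a minute offset
# counted from day 01, 00:00.
sub dhm2int {
    my ( $d, $h, $m ) = $_[0] =~ /^(\d\d)(\d\d)(\d\d)$/
        or die "bad stamp: $_[0]";
    return ( ( $d - 1 ) * 24 + $h ) * 60 + $m;
}

# Merge neighbouring [ start, end, value ] intervals that carry the
# same value and touch (next start is one minute after previous end).
# ASSUMPTION: input is sorted by start and non-overlapping.
sub merge_adjacent {
    my @out;
    for my $iv ( @_ ) {
        my $prev = $out[-1];
        if (   $prev
            && $prev->[2] == $iv->[2]
            && dhm2int( $iv->[0] ) == dhm2int( $prev->[1] ) + 1 )
        {
            $prev->[1] = $iv->[1];    # extend the previous interval
        }
        else {
            push @out, [ @$iv ];      # copy so the input is untouched
        }
    }
    return @out;
}

my @merged = merge_adjacent(
    [ '010010', '010020', 2 ],
    [ '010021', '010030', 2 ],
);
printf "[%s, %s, %d]\n", @$_ for @merged;    # [010010, 010030, 2]
```

Comparing minute offsets rather than the raw stamps means that 012359 followed by 020000 also counts as adjacent.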
      The only difference between our solutions now is that my solution merges neighbouring intervals

      Understood. It was a conscious decision not to concatenate adjacent intervals, even if their values are the same, when they derive from different input intervals.

      As I mentioned in my post, I build two parallel arrays. One contains the lowest value used for each minute. The other contains the index of the source interval that contributed that value. In the event that two different source intervals with the same value abut, I can have data like this:

          value:  22222111111111111 44444
          source: 33333555599999999 77777

      I can either walk the source array and pick out 4 intervals -- this is what I posted above -- or I can walk the value array and pick out 3 intervals.

      The change is trivial. This

          my @res;
          my $i = 0;
          while( $i < $#id ) {
              ++$i until defined $id[ $i ];
              my $id = $id[ $i ];
              my $start = $i;
              ++$i while defined( $id[ $i ] ) and $id[ $i ] == $id;
              my $end = $i - 1;
              push @res, [ int2dhm( $start ), int2dhm( $end ), $tally[ $start ] ];
          }

      Becomes this:

          my @res;
          my $i = 0;
          while( $i < $#tally ) {
              ++$i until defined $tally[ $i ];
              my $val = $tally[ $i ];
              my $start = $i;
              ++$i while defined( $tally[ $i ] ) and $tally[ $i ] == $val;
              my $end = $i - 1;
              push @res, [ int2dhm( $start ), int2dhm( $end ), $val ];
          }
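To make the 4-versus-3 count concrete, here is a self-contained sketch that runs one run-length walk over the sample value/source rows shown earlier. It is a restatement with explicit bounds checks, not the loops as posted; `runs` is a hypothetical name, and a space in the sample strings is assumed to mean an undefined minute:

```perl
use strict;
use warnings;

# Generic run-length walk over an array with undef gaps; returns
# [ start_index, end_index, key ] triples, one per run of equal keys.
sub runs {
    my @a = @_;
    my @res;
    my $i = 0;
    while ( $i < @a ) {
        if ( !defined $a[$i] ) { ++$i; next }    # skip gaps
        my ( $key, $start ) = ( $a[$i], $i );
        ++$i while $i < @a && defined $a[$i] && $a[$i] == $key;
        push @res, [ $start, $i - 1, $key ];
    }
    return @res;
}

# Sample rows from the diagram; a space marks an undefined minute.
my @tally = map { $_ eq ' ' ? undef : $_ } split //, '22222111111111111 44444';
my @id    = map { $_ eq ' ' ? undef : $_ } split //, '33333555599999999 77777';

my @by_source = runs( @id );       # split wherever the source changes
my @by_value  = runs( @tally );    # split only where the value changes

printf "by source: %d intervals, by value: %d intervals\n",
    scalar @by_source, scalar @by_value;
# by source: 4 intervals, by value: 3 intervals
```

The two results differ only where sources 5 and 9 abut with the same value 1: the source walk keeps them apart, the value walk joins them.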

      It wasn't clear to me from the OP which was expected. I was expecting feedback from the OP, but it never came.


      With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      "Science is about questioning the status quo. Questioning authority".
      In the absence of evidence, opinion is indistinguishable from prejudice.
        When not concatenating intervals, the output is not unique. Your implementation depends on the order of the input intervals:

            $ echo $'100150 100210 2\n100100 100300 2\n100200 100400 2' | perl 935755-buk.perl
            [
              [100100, 100149, 2],
              [100150, 100210, 2],
              [100211, 100300, 2],
              [100301, 100400, 2],
            ]
            $ echo $'100100 100300 2\n100200 100400 2\n100150 100210 2' | perl 935755-buk.perl
            [[100100, 100300, 2], [100301, 100400, 2]]

        But yes, in fact it depends on the OP's purpose, which is unknown.
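One way the order dependence can arise, sketched under a loudly stated assumption: a later interval replaces an earlier one at a given minute only when its value is strictly lower, so ties go to whichever interval was read first. That tie-break is a guess that happens to reproduce both runs above; it is not taken from the posted script. `dhm2int` and `by_source` are hypothetical names, and `int2dhm` here is a stand-in for the helper of that name used in the posted loops:

```perl
use strict;
use warnings;

# Hypothetical helpers for "ddhhmm" <-> minute offset from day 01, 00:00.
sub dhm2int {
    my ( $d, $h, $m ) = unpack 'A2 A2 A2', $_[0];
    return ( ( $d - 1 ) * 24 + $h ) * 60 + $m;
}
sub int2dhm {
    my $n = shift;
    return sprintf '%02d%02d%02d',
        int( $n / 1440 ) + 1, int( $n / 60 ) % 24, $n % 60;
}

# Keep the lowest value per minute plus the index of the interval that
# supplied it, then emit one output interval per run of equal source.
# ASSUMPTION: only a strictly lower value replaces an existing one,
# so on ties the first interval seen keeps the minute.
sub by_source {
    my ( @tally, @id );
    for my $n ( 0 .. $#_ ) {
        my ( $s, $e, $v ) = @{ $_[$n] };
        for my $i ( dhm2int( $s ) .. dhm2int( $e ) ) {
            next if defined $tally[$i] && $tally[$i] <= $v;
            ( $tally[$i], $id[$i] ) = ( $v, $n );
        }
    }
    my @res;
    my $i = 0;
    while ( $i < @id ) {
        if ( !defined $id[$i] ) { ++$i; next }
        my ( $src, $start ) = ( $id[$i], $i );
        ++$i while $i < @id && defined $id[$i] && $id[$i] == $src;
        push @res, [ int2dhm( $start ), int2dhm( $i - 1 ), $tally[$start] ];
    }
    return @res;
}

# Same three intervals, two input orders.
my @a = by_source( [ 100150, 100210, 2 ], [ 100100, 100300, 2 ], [ 100200, 100400, 2 ] );
my @b = by_source( [ 100100, 100300, 2 ], [ 100200, 100400, 2 ], [ 100150, 100210, 2 ] );
print scalar( @a ), " vs ", scalar( @b ), " intervals\n";    # 4 vs 2 intervals
```

Walking the value array instead of the source array would give a single interval, 100100 to 100400, for both orders.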