in reply to Re^7: Date Array Convolution
in thread Date Array Convolution

The only difference between our solutions now is that my solution merges neighbouring intervals

Understood. It was a concious decision not to concatenate adjacent intervals even if there values are the same, if they derived from different input intervals.

As I mentioned in my post, I build two parallel arrays. One contains the lowest values used for each minute. The other the index of the source interval that contributed that value. In the event that two different source intervals with the same value abut, then I can have data like this:

value: 22222111111111111 44444 source: 33333555599999999 77777

I can either walk the source array and pick out 4 intervals -- this is what I posted above -- or I can walk the value array and pick out 3 intervals.

The change is trivial. This

my @res; my $i = 0; while( $i < $#id ) { ++$i until defined $id[ $i ]; my $id = $id[ $i ]; my $start = $i; ++$i while defined( $id[ $i ] ) and $id[ $i ] == $id; my $end = $i - 1; push @res, [ int2dhm( $start ), int2dhm( $end ), $tally[ $start ] +]; }

Becomes this:

my @res; my $i = 0; while( $i < $#tally ) { ++$i until defined $tally[ $i ]; my $val = $tally[ $i ]; my $start = $i; ++$i while defined( $tally[ $i ] ) and $tally[ $i ] == $val; my $end = $i - 1; push @res, [ int2dhm( $start ), int2dhm( $end ), $val ]; }

It wasn't clear to me from the OP which was expected. I was expecting feedback fro the OP but it never came.


With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

Replies are listed 'Best First'.
Re^9: Date Array Convolution
by choroba (Cardinal) on Nov 08, 2011 at 06:43 UTC
    When not concatenating intervals, the output is not uniqe. Your implementation is dependent on the order of the input intervals:
    $ echo $'100150 100210 2\n100100 100300 2\n100200 100400 2' | perl 935 +755-buk.perl [ [100100, 100149, 2], [100150, 100210, 2], [100211, 100300, 2], [100301, 100400, 2], ] $ echo $'100100 100300 2\n100200 100400 2\n100150 100210 2' | perl 935 +755-buk.perl [[100100, 100300, 2], [100301, 100400, 2]]
    But yes, it fact it depends on the purpose of the OP that is unknown.

      Okay. Now we have the OPs clarification, the output from my code consolidating equal valued intervals is:

      C:\test>935755 935755.dat [ ["010005", "010022", 41], ["010023", "012359", 0], ["020000", "022359", 0], ["030000", "032359", 0], ["040000", "042359", 0], ["050000", "052359", 0], ["060000", "062359", 0], ["070000", "072359", 0], ["080000", "082359", 0], ["090000", "092359", 0], [100000, 102359, 0], [110000, 112359, 0], [120000, 122359, 0], [130000, 132359, 0], [140000, 142359, 0], [150000, 152359, 0], [160000, 162359, 0], [170000, 172359, 0], [180000, 182359, 0], [190000, 192359, 0], [200000, 202359, 0], [210000, 212359, 0], [220000, 222359, 0], [230000, 232359, 0], [240000, 242359, 0], [250000, 252359, 0], [260000, 262359, 0], [270000, 272359, 0], [280000, 282359, 0], [290000, 292359, 0], [300000, 302359, 0], [310000, 312356, 0], [312357, 312359, 27], ]

      Do you concur?

      The (now simplified) code for this is:

      #! perl -slw use strict; use Data::Dump qw[ pp ]; sub dhm2int { my( $d, $h, $m ) = unpack '(A2)*', $_[0]; return ( ( $d - 1 ) * 24 + $h ) * 60 + $m; } sub int2dhm { sprintf "%02d%02d%02d", int($_[0]/1440)+1, int($_[0]/60)%24, $_[0] +%60; } #my @data = ( # ['010000','010010',2],['010200','010210',5],['012359','020001',3] +, # ['010005','010015',1],['010207','010211',4] #); my @data = map[ split ], <>; my @expd = map { my $in = $_; my $s = dhm2int( $in->[ 0 ] ); my $e = dhm2int( $in->[ 1 ] ); my @out; while( int( $s / 1440 ) != int( $e / 1440 ) ) { my $newe = ( int( $s / 1440 ) +1 ) * 1440 -1; push @out, [ $s, $newe, $in->[2] ]; $s = $newe +1; } ( @out, [ $s, $e, $in->[2] ] ); } @data; my( @tally ); for my $e ( 0 .. $#expd ) { my $r = $expd[ $e ]; for my $i ( $r->[ 0 ] .. $r->[ 1 ] ) { if( !defined( $tally[ $i ] ) or $tally[ $i ] > $r->[ 2 ] ) { $tally[ $i ] = $r->[ 2 ]; } } } my @res; my $i = 0; while( $i < $#tally ) { ++$i until defined $tally[ $i ]; my $val = $tally[ $i ]; my $start = $i++; ++$i while defined( $tally[ $i ] ) and $tally[ $i ] == $val and $i + % 1440; my $end = $i - 1; push @res, [ int2dhm( $start ), int2dhm( $end ), $val ]; } pp \@res;

      With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      "Science is about questioning the status quo. Questioning authority".
      In the absence of evidence, opinion is indistinguishable from prejudice.
        Do you concur?
        $ time perl 935755-buk.perl 935755.dat > 935755-buk.o real 3m21.523s user 3m19.391s sys 0m0.365s $ time perl 935755.perl 935755.dat > 935755.o real 1m24.273s user 1m23.753s sys 0m0.051s $ diff 935755-buk.o 935755.o 35c35 < ) --- > ) \ No newline at end of file
        Aye.
      Ah! I'm sorry if I left something out. It sounds like a question of whether or not [100200,100210,1],[100211,100220,1] should become [100200,100220,1], and the answer is yes, from a minimization perspective.

      I've been working on my code and will upload hopefully by tomorrow! Thanks again guys.