Perl Drag Racing - sum/delta list

John M. Dlugosz has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
Re: Perl Drag Racing - sum/delta list by Aristotle (Chancellor) on Aug 15, 2002 at 00:08 UTC
We start out: `sub decode2 { my $total = shift; my $delta; my $diff; foreach (@_) { $delta += $_; $diff += $delta; } my $original= ($total-$diff)/(@_+1); my $value= $original; my @results= map { $value += $_ } @_; return ($original, @results); }` [download] We realize that we can simplify the first loop if we simply subtract the first element from `$original` later: `sub decode2 { my $total = shift; my $diff; $diff += $_ for @_; my $original= ($total-$diff)/(@_+1) - $_[0]; my $value= $original; my @results= map { $value += $_ } @_; return ($original, @results); }` [download] Now we realize we can rearrange the `return` if we include the `$original` in the `map`: `sub decode2 { my $total = shift; my $diff; $diff += $_ for @_; my $original= ($total-$diff)/(@_+1) - $_[0]; my $next; my @results= map { $next += $_ } $original, @_; return @results; }` [download] This also lets us get rid of the intermediary `@results` array altogether. Next we realize that we needn't `shift` off the total, because rather than summing everything into `$diff` first and subtracting that from `$total` later, we can simply turn the addition loop into a subtraction loop. For that, we have to flip the sign on the total. Note that not shifting off the total affects many locations in the code. `sub decode2 { my $dc; $_[0] = -$_[0]; $dc -= $_ for @_; # subtract the sum of all elements from the firs +t my $original= $dc/@_ - $_[1]; my $next; return map { $next += $_ } $original, splice @_, 1; }` [download] `$dc` is an inaccurate name of course.. the DC component for the values is actually `$dc/@_` - but oh well. Some logical cleanup: rather than declaring a new variable and splicing from `@_`, we can just modify the first value inplace. And thus, a massive cleanup later, we have `sub decode2 { my ($dc, $next); $_[0] = -$_[0]; $dc -= $_ for @_; # subtract the sum of all elements from the firs +t $_[0] = $dc/@_ - $_[1]; return map { $next += $_ } @_; }` [download] `____________` Makeshifts last the longest.	[reply] [d/l] [select]
Re: Re: Perl Drag Racing - sum/delta list by BrowserUk (Patriarch) on Aug 15, 2002 at 06:33 UTC
Aristotle++ It is so instructive to see a worked example of golfing, especially on something small enough to be able to follow and big enough to allow for some real transformations. However, I still don't follow the logic of totalling stuff up and then having to break it down in order to retrieve the first value before applying the deltas, instead of just retaining the first value itself? Did I miss something? What's this about a "crooked mitre"? I'm good at woodwork!	[reply]
Re^3: Perl Drag Racing - sum/delta list by Aristotle (Chancellor) on Aug 15, 2002 at 16:37 UTC
Sorry, I don't understand what you're referring to. Could you quote a specific part of the post/code? Makeshifts last the longest.	[reply]
Re: Re^3: Perl Drag Racing - sum/delta list by BrowserUk (Patriarch) on Aug 15, 2002 at 17:19 UTC
Re^5: Perl Drag Racing - sum/delta list by Aristotle (Chancellor) on Aug 16, 2002 at 00:55 UTC
Some notes below your chosen depth have not been shown here
Re: Re: Perl Drag Racing - sum/delta list by John M. Dlugosz (Monsignor) on Aug 16, 2002 at 15:31 UTC
Hmm, cool, but it doesn't check. I put in (8636 460 17372 242 ...) and got out (28501.5412844037 20325.5412844037 37237.5412844037 20107.5412844037 ...) Here is my complete program. Since it failed on a longer test vector, it might be an overflow or something? Read more... (3 kB)	[reply] [d/l]
Results? by John M. Dlugosz (Monsignor) on Aug 16, 2002 at 16:01 UTC
On your first change, you appear to be using $diff with the same meaning. But, subtracting out $original later doesn't do the same thing as calculating the running $delta. I suppose it's a quirk that the example gave the same results!! If the total is a+b+c+d, then the given data is (total, b-a, c-b, d-c). This means that the missing "original" a is total- (a+(b-a) + (a+(b-a))+(c-b) + ((a+(b-a))+(c-b))+(d-c))) divided by the number of elements. Hmm, cancelling out, I see (b + c + d), but you are computing ( (b-a) + (c-b) + (d-c) ), the sum of the listed deltas. That gives you (d-a) which is the unknown first value (a) subtracted from the last value, which we don't know either. It is a total coincedecne that the sum of the deltas happens to be the same as the difference between the first and last result elements.	[reply]
Re^3: Perl Drag Racing - sum/delta list by Aristotle (Chancellor) on Aug 17, 2002 at 17:24 UTC
You are right, I didn't do the math on paper originally, and due to false laziness, lack of an existing `encode()` or motivation to write one made me not verify my code with different data, even though I knew that might come back and bite me in the tender parts. I fixed the code; this time, no comments, since I don't know how to explain it briefly. I guess it warrants a longer comment in front of the subroutine containing the math. `sub decode3 { my ($diff, $delta, $next); $diff = $_[0]; $delta = -$_[0]; $diff -= $delta += $_ for @_; $_[0] = $diff/@_; return map { $next += $_ } @_; }` [download] Your tests verify this one, and the benchmark on my 5.6.1 says it is consistently about 42% faster than the `for` version and 35% faster than the `map` one. The main win was removing the intermediary `@results`. If I put that step back in, it only beats the vanilla versions by 12% and 8%, respectively, which means you can speed up your own versions by 25% if you omit the temporary array. The rest of the speed difference is mainly because I avoid a couple off-by-one corrections by not shifting off the total. It is debatable whether the loss of clarity (or the requirement for comments..) is worth the speed gains. Some profiling would be in order to determine the impact of `decode()`'s speed on overall program perforamce. Makeshifts last the longest.	[reply] [d/l]
Re^4: Perl Drag Racing - sum/delta list by John M. Dlugosz (Monsignor) on Aug 19, 2002 at 19:02 UTC
Re^5: Perl Drag Racing - sum/delta list by Aristotle (Chancellor) on Aug 19, 2002 at 23:36 UTC
Re: Perl Drag Racing - sum/delta list by BrowserUk (Patriarch) on Aug 14, 2002 at 22:53 UTC
I'm probably missing something but it seems to me you are making life hard by calculating that running total or whatever it is? Why not just store the original first value and the deltas? `#! perl -w use strict; sub decode { my $val; return map { $val += $_ } @_; } my @input = qw( 1000 1 -21 44 -1 ); my @output = decode @input; print "@output", $/; __END__ C:\test>190239 1000 1001 980 1024 1023` [download] What's this about a "crooked mitre"? I'm good at woodwork!	[reply] [d/l]
Re: Re: Perl Drag Racing - sum/delta list by John M. Dlugosz (Monsignor) on Aug 16, 2002 at 15:04 UTC
Why? It's for a compression archive format. I'm renovating my design and have the following (re "solid packing of small files"): The total size of a packet is already known. (thus the total size). If this is a concatenation of subpackets, we need to know where to separate them back out. Instead of putting the length of the subpacket before each, it offers better compression to put this "index" data elsewhere and not "contaminate" the data with values that are not in its normal range. Also, there is an elegance factor of uniformity about putting this index in its own field, not mixed throughout. So, how to store the index? The delta-from-previous is smaller than a simple list of lengths, because it can use a shorter encoding form if the sizes are similar. But since the total size is already known, we only need the deltas of all except one, and can leave out one value. —John	[reply]
Re: Perl Drag Racing - sum/delta list by Abigail-II (Bishop) on Aug 15, 2002 at 10:02 UTC
While in some cases there's an advantage of storing the increments instead of the values (it can save space - a technique also used in sampling (but there's not much to gain in Perl, except when your numbers would have been BigInts and your increments aren't)), I don't see the advantage of your complicated first value. Why not add an imaginary 0 before the orginal list and store just the differences? (That means, the first value is the first value). The you only need one pass in both the decoding and encoding. Abigail	[reply]
Re: Re: Perl Drag Racing - sum/delta list by John M. Dlugosz (Monsignor) on Aug 16, 2002 at 18:02 UTC
See my reply to BrowserUk, who asked a similar question. Basically, the whole idea is to save space; the more the better (by definition). The numbers, even though "normal" sized, are like BigInts variable-sized in their encoded form, and 0 and other small values have the minimum 1-byte length. Since I already know the total size, I can remove one value from the delta list and still figure it out again. So the answer to why is "because I can". The requirement is for minimum space and demonstratable (not optimal) logic to deal with it. Thanks, —John	[reply]