Re: Perl Drag Racing - sum/delta list
by Aristotle (Chancellor) on Aug 15, 2002 at 00:08 UTC
|
sub decode2 {
my $total = shift;
my $delta;
my $diff;
foreach (@_) {
$delta += $_;
$diff += $delta;
}
my $original= ($total-$diff)/(@_+1);
my $value= $original;
my @results= map { $value += $_ } @_;
return ($original, @results);
}
We realize that we can simplify the first loop if we simply subtract the first element from $original later:
sub decode2 {
my $total = shift;
my $diff;
$diff += $_ for @_;
my $original= ($total-$diff)/(@_+1) - $_[0];
my $value= $original;
my @results= map { $value += $_ } @_;
return ($original, @results);
}
Now we realize we can rearrange the return if we include the $original in the map:
sub decode2 {
my $total = shift;
my $diff;
$diff += $_ for @_;
my $original= ($total-$diff)/(@_+1) - $_[0];
my $next;
my @results= map { $next += $_ } $original, @_;
return @results;
}
This also lets us get rid of the intermediary @results array altogether.
Next we realize that we needn't shift off the total, because rather than summing everything into $diff first and subtracting that from $total later, we can simply turn the addition loop into a subtraction loop. For that, we have to flip the sign on the total. Note that not shifting off the total affects many locations in the code.
sub decode2 {
my $dc;
$_[0] = -$_[0];
$dc -= $_ for @_; # subtract the sum of all elements from the firs
+t
my $original= $dc/@_ - $_[1];
my $next;
return map { $next += $_ } $original, splice @_, 1;
}
$dc is an inaccurate name of course.. the DC component for the values is actually $dc/@_ - but oh well.
Some logical cleanup: rather than declaring a new variable and splicing from @_, we can just modify the first value inplace. And thus, a massive cleanup later, we have
sub decode2 {
my ($dc, $next);
$_[0] = -$_[0];
$dc -= $_ for @_; # subtract the sum of all elements from the firs
+t
$_[0] = $dc/@_ - $_[1];
return map { $next += $_ } @_;
}
____________ Makeshifts last the longest. | [reply] [d/l] [select] |
|
|
Aristotle++
It is so instructive to see a worked example of golfing, especially on something small enough to be able to follow and big enough to allow for some real transformations.
However, I still don't follow the logic of totalling stuff up and then having to break it down in order to retrieve the first value before applying the deltas, instead of just retaining the first value itself?
Did I miss something?
What's this about a "crooked mitre"? I'm good at woodwork!
| [reply] |
|
|
| [reply] |
|
|
|
|
|
|
|
| [reply] [d/l] |
|
|
On your first change, you appear to be using $diff with the same meaning. But, subtracting out $original later doesn't do the same thing as calculating the running $delta. I suppose it's a quirk that the example gave the same results!!
If the total is a+b+c+d, then the given data is (total, b-a, c-b, d-c). This means that the missing "original" a is total- (a+(b-a) + (a+(b-a))+(c-b) + ((a+(b-a))+(c-b))+(d-c))) divided by the number of elements.
Hmm, cancelling out, I see (b + c + d), but you are computing ( (b-a) + (c-b) + (d-c) ), the sum of the listed deltas. That gives you (d-a) which is the unknown first value (a) subtracted from the last value, which we don't know either.
It is a total coincedecne that the sum of the deltas happens to be the same as the difference between the first and last result elements.
| [reply] |
|
|
You are right, I didn't do the math on paper originally, and due to false laziness, lack of an existing encode() or motivation to write one made me not verify my code with different data, even though I knew that might come back and bite me in the tender parts.
I fixed the code; this time, no comments, since I don't know how to explain it briefly. I guess it warrants a longer comment in front of the subroutine containing the math.
sub decode3 {
my ($diff, $delta, $next);
$diff = $_[0];
$delta = -$_[0];
$diff -= $delta += $_ for @_;
$_[0] = $diff/@_;
return map { $next += $_ } @_;
}
Your tests verify this one, and the benchmark on my 5.6.1 says it is consistently about 42% faster than the for version and 35% faster than the map one. The main win was removing the intermediary @results. If I put that step back in, it only beats the vanilla versions by 12% and 8%, respectively, which means you can speed up your own versions by 25% if you omit the temporary array.
The rest of the speed difference is mainly because I avoid a couple off-by-one corrections by not shifting off the total. It is debatable whether the loss of clarity (or the requirement for comments..) is worth the speed gains. Some profiling would be in order to determine the impact of decode()'s speed on overall program perforamce.
Makeshifts last the longest. | [reply] [d/l] |
|
|
|
|
Re: Perl Drag Racing - sum/delta list
by BrowserUk (Patriarch) on Aug 14, 2002 at 22:53 UTC
|
I'm probably missing something but it seems to me you are making life hard by calculating that running total or whatever it is?
Why not just store the original first value and the deltas?
#! perl -w
use strict;
sub decode { my $val; return map { $val += $_ } @_; }
my @input = qw( 1000 1 -21 44 -1 );
my @output = decode @input;
print "@output", $/;
__END__
C:\test>190239
1000 1001 980 1024 1023
What's this about a "crooked mitre"? I'm good at woodwork! | [reply] [d/l] |
|
|
Why? It's for a compression archive format. I'm renovating my design and have the following (re "solid packing of small files"):
The total size of a packet is already known. (thus the total size). If this is a concatenation of subpackets, we need to know where to separate them back out. Instead of putting the length of the subpacket before each, it offers better compression to put this "index" data elsewhere and not "contaminate" the data with values that are not in its normal range. Also, there is an elegance factor of uniformity about putting this index in its own field, not mixed throughout.
So, how to store the index? The delta-from-previous is smaller than a simple list of lengths, because it can use a shorter encoding form if the sizes are similar. But since the total size is already known, we only need the deltas of all except one, and can leave out one value.
—John
| [reply] |
Re: Perl Drag Racing - sum/delta list
by Abigail-II (Bishop) on Aug 15, 2002 at 10:02 UTC
|
While in some cases there's an advantage of storing the
increments instead of the values (it can save space - a
technique also used in sampling (but there's not much to
gain in Perl, except when your numbers would have been
BigInts and your increments aren't)), I don't see the advantage
of your complicated first value. Why not add an imaginary
0 before the orginal list and store just the differences?
(That means, the first value is the first value). The you
only need one pass in both the decoding and encoding.
Abigail | [reply] |
|
|
See my reply to BrowserUk, who asked a similar question.
Basically, the whole idea is to save space; the more the better (by definition). The numbers, even though "normal" sized, are like BigInts variable-sized in their encoded form, and 0 and other small values have the minimum 1-byte length.
Since I already know the total size, I can remove one value from the delta list and still figure it out again.
So the answer to why is "because I can". The requirement is for minimum space and demonstratable (not optimal) logic to deal with it.
Thanks,
—John
| [reply] |