in reply to Interpolating data slope for multiple points

This is a Perl forum, but I generated your graph in less than a minute with Excel: File|Open, import the data using a space delimiter, select the data, and graph it as a line graph (each data point connected to the next with a straight line). I will point out that it is certainly not "cheating" to have Perl drive the Excel program to automate the above! But of course that will take longer than one minute to code. Another technique is to make an Excel macro and have Perl invoke that macro rather than drive each individual step.

BTW, your progress is amazing! And very consistent! Great!

To do this yourself, I would convert the date/time to, say, epoch time, then calculate the min and max of the X and Y scales. Then decide how many "buckets" you will have in each direction. Place each data point at the closest X,Y coordinate that your scaling system allows. Calculate the intermediate y points along the x axis between two points with y=mx+b. A line graph is better here than a bar graph.
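The scaling and interpolation steps above could be sketched roughly as follows. This is only an illustration: the helper names `scale_to_bucket` and `interpolate` are made up here, and x is assumed to be a plain number (days or epoch seconds) by the time it reaches them.

```perl
use strict;
use warnings;

# Map a value in [$min, $max] to the nearest of $buckets grid positions
# (0 .. $buckets-1). Hypothetical helper, not from the original post.
sub scale_to_bucket {
    my ($value, $min, $max, $buckets) = @_;
    return 0 if $max == $min;    # degenerate range: everything lands in bucket 0
    return int( ($value - $min) / ($max - $min) * ($buckets - 1) + 0.5 );
}

# y on the straight line through ($x1,$y1) and ($x2,$y2) at position $x,
# i.e. y = m*x + b. Assumes $x1 != $x2.
sub interpolate {
    my ($x1, $y1, $x2, $y2, $x) = @_;
    my $slope     = ($y2 - $y1) / ($x2 - $x1);
    my $intercept = $y1 - $slope * $x1;
    return $slope * $x + $intercept;
}

# Two of the sample points: day 0 at 334 and day 47 at 311.8
printf "day 10: %.2f\n", interpolate(0, 334, 47, 311.8, 10);
printf "bucket: %d\n",   scale_to_bucket(311.8, 293.6, 334, 20);
```

Rounding with `int(... + 0.5)` is one way to pick the "closest" bucket; truncating instead would bias every point toward the low end of the axis.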

Your data is so consistent that I don't see any need, but in some cases you would want to make a single line that represents the "best fit" of all the data points. One technique is least squares approximation. There is a way to get Excel and other spreadsheet programs to do that.
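For reference, a least squares line can also be computed directly in Perl. The following is a minimal sketch of an ordinary least-squares fit; the sub name `least_squares` is hypothetical, and the x values are the day offsets of the sample data counted from the first weigh-in.

```perl
use strict;
use warnings;

# Ordinary least-squares fit of y = slope*x + intercept.
# Takes a list of [x, y] pairs; hypothetical helper name.
sub least_squares {
    my @points = @_;
    my $n = @points;
    die "need at least two points\n" if $n < 2;
    my ($sx, $sy, $sxx, $sxy) = (0, 0, 0, 0);
    for my $p (@points) {
        my ($x, $y) = @$p;
        $sx  += $x;
        $sy  += $y;
        $sxx += $x * $x;
        $sxy += $x * $y;
    }
    my $denom = $n * $sxx - $sx * $sx;
    die "all x values are identical\n" if $denom == 0;
    my $slope     = ($n * $sxy - $sx * $sy) / $denom;
    my $intercept = ($sy - $slope * $sx) / $n;
    return ($slope, $intercept);
}

# Days since the first weigh-in, and the weights from the sample data
my @points = ([0, 334], [47, 311.8], [54, 308.4], [71, 300.0],
              [75, 298.6], [80, 297.2], [82, 293.6]);
my ($slope, $intercept) = least_squares(@points);
printf "weight = %.3f * day + %.1f\n", $slope, $intercept;
```

The fitted slope is the average change in weight per day over the whole period, which is a different thing from the segment-by-segment slopes used for interpolation.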

Update:
Oh, I see that this is some CGI thing. OK, some code that shows a slightly different way is below. Hope it helps.

#!/usr/bin/perl -w
use strict;
use Time::Local;
use Data::Dumper;

my @data;
my @graph;
my $one_day = 60*60*24; #one day in seconds

while (<DATA>)
{
   next if /^\s*$/;  #skip blank lines
   my ($date, $weight) = split;
   my $epoch = epoch($date);
   push (@data, [$epoch, $weight]);
}

# input data is already sorted
# But the algorithm depends upon sorted date information
# so I did that to make sure
#
@data = sort{$a->[0] <=> $b->[0]} @data;

# axis x will be adjusted to #days from the date of first data
# axis y is in weight

my ($x_base_epoch, $y1) = @{shift(@data)};
my $x1_day =0;

foreach my $r_xy(@data)
{
   my ($x2, $y2) = @$r_xy;
   $x2 -= $x_base_epoch;
   my $x2_day = int($x2/$one_day);

   # slope is delta weight/ delta days
   my $slope = ($y2-$y1)/($x2_day-$x1_day);

   # fill in missing data points...
   # by linear interpolation

   for (; $x1_day< $x2_day; $x1_day++)  # x interval not tested for
                                        # other than 1
   {
      my $y = $slope*($x1_day-$x2_day) + $y2;
      push @graph, [$x1_day,$y];
   }
   $y1= $y2;
}

#fix-up for last data point
my ($fx,$fy) = @{$data[-1]};
push @graph, [($fx-$x_base_epoch)/$one_day, $fy];

### data to graph is in @graph ###
### I leave that part to the OP ###

print "$_->[0] $_->[1]\n" foreach @graph;

sub epoch
{
   my $date = shift;
   my ($month, $day, $year) = split(m|/|,$date);
   my $time = timelocal(0,0,0, $day, $month-1, $year-1900);
   return $time;
}

=prints
0 334
1 333.527659574468
2 333.055319148936
3 332.582978723404
4 332.110638297872
.....
74 298.95
75 298.6
76 298.32
77 298.04
78 297.76
79 297.48
80 297.2
81 295.4
82 293.6
=cut

__DATA__
6/26/2010 334
8/12/2010 311.8
8/19/2010 308.4
9/5/2010 300.0
9/9/2010 298.6
9/14/2010 297.2
9/16/2010 293.6

Update to Update:

Using an x-axis increment other than one can get problematic, depending upon what you are trying to do. Often one would want to show the actual data points with some special character and the points in between with another character. So if you just want to show, say, weekly progress, then you have to decide what that would mean in terms of the graphical representation, e.g. the "weekly Monday point" may not correspond to any actual measurement at all. Generating the "weight" for each day is very efficient. If you want something like a "Monday" value, it would not be ridiculous to generate all points in the week or year and then just print every 7th value.
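Picking every 7th value out of a daily series could look something like the sketch below. The sub name `every_nth` is hypothetical, and the `@daily` series is just a stand-in built from the first two sample points (334 on day 0, 311.8 on day 47); in practice it would be the interpolated [day, weight] list with exactly one row per day.

```perl
use strict;
use warnings;

# Keep every $step-th row of a daily [day, weight] series, starting at row 0.
# Hypothetical helper; assumes the series has exactly one row per day.
sub every_nth {
    my ($step, @rows) = @_;
    return @rows[ grep { $_ % $step == 0 } 0 .. $#rows ];
}

# Stand-in daily series: a straight line from 334 down to 311.8 over 47 days
my @daily = map { [ $_, 334 + $_ * (311.8 - 334) / 47 ] } 0 .. 47;

# Print the weekly rows: day number and interpolated weight
printf "%2d %.1f\n", @$_ for every_nth(7, @daily);
```

To land on actual Mondays rather than "every 7 days from the first weigh-in", the starting row would have to be offset by the weekday of the first date.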

Replies are listed 'Best First'.
Re^2: Interpolating data slope for multiple points
by GrandFather (Saint) on Sep 17, 2010 at 05:49 UTC

    Hints in the first two lines of the OP's sample code imply that leveraging Excel may not be appropriate:

        #!/usr/bin/perl -wT
        use CGI::Carp qw/fatalsToBrowser warningsToBrowser/;

    True laziness is hard work
      Thanks! Missed that part... I updated my post with another example of code for this job. The OP's statement, "I've lost a lot of weight lately, and wanted to plot it on a graph," seemed at first glance to imply something different from a CGI application that dynamically creates graphs. So, playing devil's advocate: updating the data locally, generating a cool image locally, and using Perl to automagically upload that image to the website may still be a good way to go.
Re^2: Interpolating data slope for multiple points
by oko1 (Deacon) on Sep 17, 2010 at 15:29 UTC

    {laugh} Excel would be a bit of overkill, but thank you for the suggestion. As far as the progress goes, it hasn't even been all that difficult: I've simply been experimenting with what it's like to feel different degrees of hunger, just being curious about it. I realized that it's been years since I've actually felt hungry (yeah... sounds crazy, I know), and I know that it won't kill me to experience it. As a side benefit, food now tastes absolutely amazing, and I get really full on a dollar's worth of salad greens. It feels fantastic.

    Regarding your code: bravo! That's pretty much what I was looking for - it's very readable, and the algorithm even makes sense to me. I've been playing around with my own code in the meantime, and came up with something more-or-less similar (except I handle the _end_ case as the special one.)

        use POSIX qw/strftime/;
        use strict;
        $|++;

        use constant Intervals => 20;

        my ($prev_time, $prev_val, $tmin, $tmax, $last_val, @curve);
        open my $data, "weightdata.txt" or die "weightdata.txt: $!\n";
        while (<$data>){
            chomp;
            next unless m{^\s*([\d/]+)\s+([\d.]+)};
            my($m, $d, $y) = split /\//, $1;
            my $time_point = strftime("%s", 0, 0, 0, $d, $m - 1, $y - 1900);
            if (defined $prev_time){
                my $slope = ($2 - $prev_val) / ($time_point - $prev_time);
                push @curve, [ $prev_time, $time_point, $prev_val, $slope ];
            }
            ($prev_time, $prev_val) = ($time_point, $2);
            ($tmax, $last_val) = ($time_point, $2);
        }
        close $data;
        $tmin = $curve[0][0];
        my $interval = ($tmax - $tmin) / (Intervals - 1);

        my ($prev_t, $val, $slope);
        for my $t (0 .. Intervals - 2){
            my $calc_t = $t * $interval + $tmin;
            shift @curve unless $calc_t >= $curve[0][0] && $calc_t <= $curve[0][1];
            ($prev_t, $val, $slope) = @{$curve[0]}[0, 2, 3];
            printf "%2d: Weight on %s was %.1f\n", $t + 1, strftime("%F",localtime($calc_t)), ($calc_t - $prev_t) * $slope + $val;
        }
        printf "%2d: Weight on %s was %.1f\n", Intervals, strftime("%F",localtime($tmax)), $last_val;

    I've even looked at the intervals carefully - they're consistent - and the change in weight from one plotted point to another (yep, looks right.) I just might be getting the hang of this process. :)

    Thanks very much for your response; I'm definitely finding all this highly educational.


    --
    "Language shapes the way we think, and determines what we can think about."
    -- B. L. Whorf