Greetings,

I've created a utility using Statistics::LineFit and another using Gnuplot and fed both the same sample data. The results differ, so I must have made a mistake, but I can't see where.

Perl code

#!/usr/bin/perl
use strict;
use warnings;
use Statistics::LineFit;
use Time::Local;
use Data::Dumper;

my @x_axis;
my @y_axes;

sub date_to_epoch {
    my $date = shift;
    my ( $y, $m, $d ) = split /-/, $date;
    return timelocal( '59', '59', '23', $d, $m, $y );
    #return timelocal( '0', '0', '0', $d, $m, $y );
}

sub max_value {
    my @array = @_;
    my $max = $array[0];
    for ( my $i = 0; $i <= $#array; $i++ ) {
        $max = $array[$i] if ( $array[$i] > $max );
    }
    return $max;
}

my @epochs;
while (<DATA>) {
    next if ( m/^#/ );
    chomp;
    if ( my @line = split /\s+/ ) {
        my $epoch = date_to_epoch( $line[0] );
        # factor down epoch or slope is too shallow.
        push @x_axis, $epoch;
        shift @line;
        for ( my $y = 0; $y <= $#line; $y++ ) {
            push @{$y_axes[$y]}, $line[$y] ;
        }
    }
}
print Dumper ( \@x_axis );
print Dumper ( \@y_axes );

my $lineFit = Statistics::LineFit->new( 0, 0 ); # TODO change 2nd to 1
$lineFit->setData( \@x_axis, \@{$y_axes[0]} ) or die "Invalid regression data\n";
my ( $intercept, $slope ) = $lineFit->coefficients();
print "Slope(m): $slope Y-intercept(b): $intercept\n";

my %fitline;
$fitline{y1} = $intercept;
$fitline{x1} = 0;
$fitline{y2} = max_value( @{$y_axes[0]} );
$fitline{x2} = ( $fitline{y2} - $fitline{y1} ) / $slope + $fitline{x1};
print Dumper ( \%fitline );

__DATA__
# date notkept hosts
2014-04-01 50 10
2014-04-02 63 11
2014-04-03 120 12
2014-04-04 55 20
2014-04-05 60 22
2014-04-06 63 25
2014-04-07 52 24
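A side note on date_to_epoch: Time::Local expects a zero-based month (0 for January through 11 for December), so passing the month straight from the date string lands every date one month late. For this data the shift is a uniform 30 days, which should move the intercept but not the slope. The following is a minimal sketch of a round-trip check, not part of the original utility; the helper name date_to_epoch_utc is hypothetical, and it uses timegm at midnight UTC (an assumption) so the result does not depend on the local timezone.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Time::Local qw(timegm);
use POSIX qw(strftime);

# Hypothetical variant of date_to_epoch: midnight UTC, zero-based month.
sub date_to_epoch_utc {
    my ($date) = @_;
    my ( $y, $m, $d ) = split /-/, $date;
    return timegm( 0, 0, 0, $d, $m - 1, $y );
}

# Round-trip each date through the epoch to confirm it maps back to itself.
for my $date (qw(2014-04-01 2014-04-07)) {
    my $epoch = date_to_epoch_utc($date);
    printf "%s -> %d -> %s\n",
        $date, $epoch, strftime( '%Y-%m-%d', gmtime($epoch) );
}
```

If the date printed on the right differs from the one on the left, the month argument is off by one.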

Gnuplot

#!/usr/bin/gnuplot
#set output "test.png"
set title "Promises not kept"
set xlabel "Date"
set ylabel "Count"
set rmargin 7
set border linewidth 2
set style line 1 linecolor rgb 'blue' linetype 1 linewidth 2
set style line 2 linecolor rgb 'black' linetype 1 linewidth 2
set style fill solid
set xdata time
set timefmt "%Y-%m-%d"
set format x "%Y-%m-%d"
set grid front
set grid
set autoscale

# 1e8 reduces the epoch seconds for a less flat line.
h(x) = m2 * x + b2
fit h(x) 'test.dat' using 1:3 via m2,b2
p(x) = m1 * x + b1
fit p(x) 'test.dat' using 1:2 via m1,b1

#set terminal png enhanced size 1024,768
plot 'test.dat' using 1:2 title 'Promises not kept' with boxes lc rgb "orange", \
     p(x) title 'Promise Trend' with lines linestyle 1, \
     h(x) title 'Host Trend' with lines linestyle 2

test.dat

# date notkept hosts
2014-04-01 50 10
2014-04-02 63 11
2014-04-03 120 12
2014-04-04 55 20
2014-04-05 60 22
2014-04-06 63 25
2014-04-07 52 24

Perl results

$VAR1 = [
          1399003199,
          1399089599,
          1399175999,
          1399262399,
          1399348799,
          1399435199,
          1399521599
        ];
$VAR1 = [
          [
            '50',
            '63',
            '120',
            '55',
            '60',
            '63',
            '52'
          ],
          [
            '10',
            '11',
            '12',
            '20',
            '22',
            '25',
            '24'
          ]
        ];
Slope(m): -2.23214285714286e-05 Y-intercept(b): 31299.6785491071
$VAR1 = {
          'y1' => '31299.6785491071',
          'x2' => 1396849599,
          'y2' => 120,
          'x1' => 0
        };

Gnuplot results

Final set of parameters            Asymptotic Standard Error
=======================            ==========================

m1              = 1.44796e-07      +/- 5.823e-05    (4.022e+04%)
b1              = 1                +/- 2.62e+04     (2.62e+06%)

correlation matrix of the fit parameters:

               m1     b1
m1              1.000
b1             -1.000  1.000

Note that m1 and b1 from gnuplot are not the same as Slope and Y-intercept from Perl. Why?
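One way to see which of the two results is the ordinary least-squares line is to compute the closed-form fit by hand. The sketch below is only a cross-check under stated assumptions: the helper least_squares is hypothetical (it is not part of Statistics::LineFit), and x is expressed as days since 2014-04-01 rather than epoch seconds to keep the numbers small.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use List::Util qw(sum);

# Closed-form simple linear regression; returns ( intercept, slope ).
sub least_squares {
    my ( $x, $y ) = @_;
    my $n      = @$x;
    my $mean_x = sum(@$x) / $n;
    my $mean_y = sum(@$y) / $n;
    my ( $sxy, $sxx ) = ( 0, 0 );
    for my $i ( 0 .. $n - 1 ) {
        my $dx = $x->[$i] - $mean_x;
        $sxy += $dx * ( $y->[$i] - $mean_y );
        $sxx += $dx * $dx;
    }
    my $slope = $sxy / $sxx;
    return ( $mean_y - $slope * $mean_x, $slope );
}

# Assumption: x is days since the first sample, not epoch seconds.
my @days    = ( 0 .. 6 );
my @notkept = ( 50, 63, 120, 55, 60, 63, 52 );

my ( $intercept, $slope_per_day ) = least_squares( \@days, \@notkept );
printf "slope per day:      %.6f\n", $slope_per_day;
printf "slope per second:   %.6e\n", $slope_per_day / 86400;
printf "intercept at day 0: %.6f\n", $intercept;
```

The per-second slope from this sketch comes out near -2.232e-05, in line with the Slope(m) value reported by Statistics::LineFit. The intercept is not comparable because the x origin differs. That suggests the gnuplot fit is the one that did not reach the least-squares solution, which would be consistent with its very large asymptotic errors and b1 sitting at exactly 1.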

Neil Watson
watson-wilson.ca


In reply to Statistics::LineFit versus gnuplot, results differ by neilwatson
