I'm trying to write a quick script fragment to find the least-squares regression line for some data points. To find the line of best fit y = ax + b by this method for a given set of data {(x1,y1),(x2,y2),...,(xn,yn)}, the following system is solved for a and b:

(The math is hard to write, and therefore to read, but it isn't essential to following the rest of this.)

EQ1: nb + SUM(xi,i=1 TO n)a = SUM(yi,i=1 TO n)
EQ2: SUM(xi,i=1 TO n)b + SUM(xi^2,i=1 TO n)a = SUM(xi*yi, i=1 TO n)
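Since EQ1 and EQ2 form a 2x2 linear system in a and b, they can be solved in closed form once the five sums are known. Here is a minimal sketch using Cramer's rule; the helper name solve_normal_equations and its argument order are made up for illustration and are not part of the original code.

```perl
use strict;
use warnings;

# Solve  n*b + Sx*a = Sy  and  Sx*b + Sxx*a = Sxy  for (a, b) by Cramer's rule.
# Hypothetical helper: the name and argument order are illustrative only.
sub solve_normal_equations {
    my ($n, $sum_x, $sum_y, $sum_x2, $sum_xy) = @_;
    my $det = $n * $sum_x2 - $sum_x ** 2;
    return if $det == 0;    # no unique line, e.g. fewer than two distinct x values
    my $slope     = ($n * $sum_xy - $sum_x * $sum_y) / $det;
    my $intercept = ($sum_y - $slope * $sum_x) / $n;
    return ($slope, $intercept);
}

# The points (0,1), (1,3), (2,5) lie exactly on y = 2x + 1.
my ($slope, $intercept) = solve_normal_equations(3, 3, 9, 5, 13);
print "a = $slope, b = $intercept\n";
```

Only the running sums and the count are needed, so none of the individual points have to be stored.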

The (x,y) points I need (quite a few of them!) are read in line by line, so storage becomes a problem: keeping them all would take far too much space. I decided to make a closure that (I hoped) would sum as we go along and then, when passed the right argument, calculate the result.

(Actually, I was rather happy with the code I wrote to solve the system of equations. When I first looked at it I didn't know what to do, and thought it was the type of thing complex C code has been written for . . . but where there's Perl there's a way, and eval popped into my head. I'd be interested to hear other ideas.)

I'm not familiar with closures (I haven't used them often), so I looked through this site and at perldoc's closure information (including perlref). Even after all that, I'm still having problems. I imagine it's a simple (as in basic/fundamental) closure issue that I'm missing, but when I call the closure (&$closure($args)) and print the arguments through a debugging statement, I get . . . NOTHING.
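For comparison, here is a minimal sketch of the closure pattern as perlref describes it. It assumes two things: the state to be kept is declared with my outside the anonymous sub, and the arguments of a call are read from @_ (whereas @ARGV holds the script's command-line arguments). The factory name make_counter is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical factory, for illustration only.
sub make_counter {
    my $total = 0;    # declared OUTSIDE the anonymous sub, so it persists between calls
    return sub {
        my ($command, $value) = @_;    # call arguments arrive in @_, not @ARGV
        $total += $value if $command eq 'Add';
        return $total;
    };
}

my $counter = make_counter();
$counter->('Add', 5);
$counter->('Add', 7);
print $counter->('Total'), "\n";    # prints 12
```

Each call to make_counter produces a sub with its own private $total, which is what lets several accumulators run side by side.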

If closures aren't the way to go, fine (but I'd be interested in knowing what to do instead). The code is below. It is just the closure; for testing I made up some simple data through 1..10 and 20..30 ranges for the x and y coordinates, and the call mimics what I have above.

I wrote this just now quickly in an attempt to see what I'll need to do with the final thing, so any suggestions are more than welcome.

    # bf stands for best fit; eventually I'll need several that
    # will mimic this format closely.
    my $bf_1 = sub {
        # Arguments are: Add|Equate; [x value; y value].
        my $sum_x;
        my $sum_y;
        my $sum_x2;
        my $sum_xy;
        my $n;
        $sum_x  = 0 unless $sum_x;
        $sum_y  = 0 unless $sum_y;
        $sum_x2 = 0 unless $sum_x2;
        $sum_xy = 0 unless $sum_xy;
        $n      = 0 unless $n;
        if ($ARGV[0] eq 'Add') {
            my @args = @_;
            $sum_x  += $args[1];
            $sum_y  += $args[2];
            $sum_x2 += $args[1]*$args[2];
            $sum_xy += $args[1] ** 2;
            $n++;
        }
        elsif ($ARGV[0] eq 'Equate') {
            my $a;
            my $b;
            $a = "(($sum_y - $n*$b)/$sum_x)";
            $b = "($sum_xy-$sum_x2*$a)/$sum_x";
            eval $b;
            eval $a;
            ($a,$b);
        }
        else {
            print "Called with unknown argument\n"
        }
    };
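Here is one possible reworking of the closure above, as a sketch rather than a definitive fix. It is based on four observations about the posted code:

- The sums are declared with my inside the sub, so they start fresh on every call.
- The command is read from $ARGV[0] (the script's command line) instead of $_[0].
- The updates to $sum_x2 and $sum_xy are swapped.
- The eval strings interpolate $b before it has a value, and the eval results are discarded.

The factory name make_best_fit is hypothetical, and the eval step is replaced by the closed-form (Cramer's rule) solution of EQ1 and EQ2.

```perl
use strict;
use warnings;

# Hypothetical factory name; each call returns an independent accumulator.
sub make_best_fit {
    # Running sums live outside the anonymous sub so they survive between calls.
    my ($sum_x, $sum_y, $sum_x2, $sum_xy, $n) = (0, 0, 0, 0, 0);

    return sub {
        my ($command, $x, $y) = @_;    # arguments come from @_, not @ARGV
        if ($command eq 'Add') {
            $sum_x  += $x;
            $sum_y  += $y;
            $sum_x2 += $x ** 2;
            $sum_xy += $x * $y;
            $n++;
            return;
        }
        elsif ($command eq 'Equate') {
            my $det = $n * $sum_x2 - $sum_x ** 2;
            return if $det == 0;       # no unique line, e.g. fewer than two distinct x
            my $slope     = ($n * $sum_xy - $sum_x * $sum_y) / $det;
            my $intercept = ($sum_y - $slope * $sum_x) / $n;
            return ($slope, $intercept);
        }
        die "Called with unknown argument '$command'\n";
    };
}

my $bf_1 = make_best_fit();
# Sample data lying on y = 3x - 2; in real use each pair would come from one input line.
$bf_1->('Add', $_, 3 * $_ - 2) for 1 .. 10;
my ($slope, $intercept) = $bf_1->('Equate');
print "y = ${slope}x + $intercept\n";
```

Because the state is captured per call to make_best_fit, several best-fit accumulators ($bf_1, $bf_2, ...) can be created the same way without sharing sums.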

In reply to Closures and Statistics by dimmesdale
