mathsy question: finding max of curve from discrete points

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
Re: mathsy question by zby (Vicar) on May 07, 2003 at 10:56 UTC
You cannot say if there is a peak outside the datapoints if you don't have any additional information about the function. You have to confine the possible functions to some special class to have an exact solution. For instance - you can assume that it is a polynomial - than there is exactly one polynomial with the lowest range that will fit to the data. Still finding it's extrema might be a bit difficult for range greater than 5 (you would have to solve the derivative of the polynomial).	[reply]
Re: Re: mathsy question by Anonymous Monk on May 07, 2003 at 11:36 UTC
Are you sure it can't be done by fitting a curve to the line with quadratic regression?? (y = ax^2 + bx + c). Then with this plot lots of data-point near the peak (which lie of the line of best fit) and then read the highest. ;-0	[reply]
Re: Re: Re: mathsy question by zby (Vicar) on May 07, 2003 at 11:46 UTC
Quadratic regression is quite a different thing - it is an approximation by a range 2 polynomial. I thought you needed an exact solution. Unfortunately I don't know approximation techniques.	[reply]
Re: Re: Re: mathsy question by Itatsumaki (Friar) on May 07, 2003 at 15:51 UTC
I am not quite sure what you are suggesting? If you intend him to regress the coefficients a,b,c then this is just general linear regression. Why would it be quadratic rather than linear?	[reply]
Re: mathsy question by tall_man (Parson) on May 07, 2003 at 15:11 UTC
You can fit a curve of the form "Ax^2 + Bx + C = y" to the data. The package Statistics::Regression can be used to do it. (Make sure you edit Regression.pm to turn off DEBUGGING first): use strict; use Statistics::Regression; my $reg= Statistics::Regression->new( 3, "quad regression", [ "const", + "X", "X^2" ] ); my @array = ('62.6', '62.8', '63', '63.3', '63.5', '63. +7', '64'); my @array2 = ('7.476', '7.5219', '7.5464', '7.5516', '7.5504', ' +7.5372', '7.518'); my $i; for $i (0..$#array) { my $x = $array[$i]; my $x2 = $x * $x; my $y = $array2[$i]; $reg->include($y, [1.0, $x, $x2]); } my $theta = $reg->theta(); print "Coefficients: ",join(",",@$theta),"\n"; my $c = $theta->[0]; my $b = $theta->[1]; my $a = $theta->[2]; print "\nComputed results:\n"; for $i (0..$#array) { my $x = $array[$i]; my $y = ($a$x + $b)$x + $c; my $ytrue = $array2[$i]; print "$x\t$y\t($ytrue)\n"; } # Solve for the derivative being 0. my $maxx = -$b/(2.0$a); my $maxy = ($a$maxx + $b)*$maxx + $c; print "\nPeak point:\n"; print "$maxx\t$maxy\n"; [download] Don't count on a very accurate fit with so few data points. Once you have a quadratic equation, solve for the peak using "2Ax + B = 0". Again, don't count on great accuracy with this method. Here is what I got: `Coefficients: -451.490758046837,14.4832720403469,-0.11423976795752 Computed results: 62.6 7.48383859766494 (7.476) 62.8 7.51535962535968 (7.5219) 63 7.53774147161789 (7.5464) 63.3 7.55417827581152 (7.5516) 63.5 7.55371216847811 (7.5504) 63.7 7.54410687970818 (7.5372) 64 7.51256298135968 (7.518) Peak point: 63.3897998012935 7.5550995057929` [download]	[reply] [d/l] [select]
Re: mathsy question: finding max of curve from discrete points (two steps) by tye (Sage) on May 07, 2003 at 17:05 UTC
First, you need to "fit" a curve to your data points. There are pretty standard methods for doing this (and they all boil down to linear approximation and they got really boring when I was studying them) by building a function out of a bunch of "basis" functions. Note that polynomials pretty much suck for this type of thing. This process is called "fitting" a curve to a set of points. A quick CPAN search for fit turned up PDL::Fit modules that can do this. It looks like they require you to provide your own set of basis functions (or use polynomials which, unfortunately, suck), which is nice if they'd at least let you pick from a few of the classic sets that work well for fitting curves to data. When I was getting out of this field, they were just realizing that "wave packet" functions were exceptionally good for "fitting". I'd think that a bit more research, probably in the direction of PDL, would turn up a good source of functions to fit to your data. Once you've got your basis functions and have fit your data, you'll have a function that defines a curve. Finding a local maximum of a curve is a classic problem. If you don't also have the derivative of your function, then a rather simple technique is to start with a $x and $step: `sub findMax { my( $f, $x, $step, $eps )= @_; while( 1 ) { my $y1= $f->( $x ); $x += $step; # March forward one more step. my $y2= $f->( $x ); if( $y2 < $y1 ) { # Oops, marched past it: $step /= -2; # Turn around, smaller steps. if( $step < $eps ) { # Close enough: return $x + $step; # Midpoint of last two steps. } } } } my $func= sub { ... }; my $x= findMax( $func, 63.3, 0.1, 0.000001 ); print "Max of ", $func->($x), " at $x.\n";` [download] I suspect that other methods are provided by things like PDL, though I didn't run into any with a quick search. - tye	[reply] [d/l]
Re: Re: mathsy question: finding max of curve from discrete points (two steps) by tall_man (Parson) on May 07, 2003 at 19:48 UTC
Thanks for the pointer to the PDL package, tye. Here is the polynomial fit I did above converted to use piddles. But as you say, polynomials may not be the best thing to use. use strict; use PDL::LiteF; use PDL::Fit::Polynomial; my $xp = pdl (62.6, 62.8, 63, 63.3, 63.5, 63.7, 64) +; my $yp = pdl (7.476, 7.5219, 7.5464, 7.5516, 7.5504, 7.5372, 7. +518); my ($yfit,$coeffs) = fitpoly1d($xp,$yp,3); my $diffp = $yp - $yfit; my @theta = list $coeffs; print "Coefficients: ",join(",",@theta),"\n"; my $c = $theta[0]; my $b = $theta[1]; my $a = $theta[2]; print "\nComputed results:\n"; my $xtop = $xp->nelem - 1; for my $i (0..$xtop) { my $x = at($xp,$i); my $y = at($yfit,$i); my $ytrue = at($yp,$i); my $diff = at($diffp,$i); print "$x\t$y\t($ytrue)\t$diff\n"; } ## Solve for the derivative being 0. my $maxx = -$b/(2.0$a); my $maxy = ($a$maxx + $b)*$maxx + $c; print "\nPeak point:\n"; print "$maxx\t$maxy\n"; [download]	[reply] [d/l]
Re: mathsy question by UnderMine (Friar) on May 07, 2003 at 13:10 UTC
I had a similar problem some time ago but it was related to goal seeking. If you have a function then Math::Brent will help you find a solution to it. Math::Polynomial has methods of creating such functions. Hope it helps UnderMine	[reply]
Re: Re: mathsy question by no_slogan (Deacon) on May 07, 2003 at 15:13 UTC
Math::Polynomial has methods of creating such functions. Specifically, Math::Polynomial::interpolate. It may not be such a good idea to fit your entire data set with a polynomial and then maximize it. For one, it can be slow. For two, the high-order polynomials found by fitting many points show a lot of "stiffness" which makes them bad for fitting unsmooth functions (where "smooth" is defined as "similar to a polynomial") -- whether this is a problem for you depends on your data set. Be careful with Brent's method, also, because it can get stuck in a local maximum. For quadratic interpolation, you can take the maximum data point and its two neighbors and fit them with Math::Polynomial. You can easily find the maximum of the resulting parabola by setting the derivative equal to zero. No need to use Math::Brent. Update: I realized that I was confused between regression and interpolation right after hitting submit. Quadratic regression, as demontrated by tall_man, is probably a good idea if your data is all fairly close to the peak.	[reply]
Re: mathsy question by gri6507 (Deacon) on May 07, 2003 at 13:53 UTC
Without having any more info about the points and their interaction, scientifically speaking, you have to connect the points using straight lines, which, by definition, produces a graph with its maximum on one of such a graph. With this in mind, you can use the following method to find the maximum point: Pick two points, and find the slope of the connecting line. Compare this slope to the slope of the line connecting point 2 with point 3. Continue until the slope turns negative. The last point still giving you a positive slope is the maximum of your data set This, of course, is an algorithm approximating a derivative, which is what you would want to do to find a maximum of a graph in the first place. However, since, as I've explained, the maximum will have to be on a data point, you should simply be able to sort the data points and get the maximum that way. I hope this helps at least somewhat.	[reply]
Re: Re: mathsy question by zby (Vicar) on May 07, 2003 at 15:26 UTC
Heh - a simple sort on the data would be equivalent to that algorithm (if you don't mind being a O log n slower). Update: Sorry - you say exactly that in the last sentence.	[reply]
Re: mathsy question: finding max of curve from discrete points by thor (Priest) on May 07, 2003 at 18:39 UTC
From what I remember of linear algebra, a quadratic solution can be found as follows (sorry, this is going to be poorly formatted; I don't know how to combine super/subscripts and pre tags): Given (x₁,y₁), (x₂,y₂)...(x_n, y_n) are your data points, a least squares solution can be found by creating a matrix A as such: x₁² x₁ 1 x₂² x₂ 1 ... x_nⁿ x_n 1 and a vector b as such: y₁ y₂ ... y_n We then want to solve for a vector x = (A, B, C) such that: A^TAx = A^Tb. (Where A^T is the matrix transpose of A). The A, B, and C that you find by solving that equation will be the coefficients in the quadratic polynomial y = Ax² + Bx + C. If you know how to do matrix arithmetic, you should be able to swing it. I think that there are also modules on the CPAN that could help you with that. thor	[reply]