Beefy Boxes and Bandwidth Generously Provided by pair Networks
P is for Practical

Surface fitting with PDL

by Xilman (Hermit)
on Aug 09, 2020 at 17:42 UTC ( [id://11120523] : perlquestion . print w/replies, xml ) Need Help??

Xilman has asked for the wisdom of the Perl Monks concerning the following question:

Greetings Fellow Monks.

The background (pun intended) to my request is that I have some astronomical images in FITS format and I would like to remove a spatially varying background as the first stage of image enhancement. It seems clear to me that PDL is the way to go, though I am quite prepared to be convinced otherwise

CPAN contains some PDL modules which perform least squares fitting to one-dimensional data. I realise that surface fitting can be performed by re-arranging the data so that it appears to be linear (by storing it by rows or by columns) and that the independent variables can be munged to suit. However, that would appear to be inelegant and potentially costly for surface fitting to large image patches.

Surely someone must have encountered this situation before and has written code to deal with it?

I undoubtedly can write my own code but I am lazy. If I do need to write a function I will endeavour to make it publicly available.

Replies are listed 'Best First'.
Re: Surface fitting with PDL
by pryrt (Abbot) on Aug 09, 2020 at 21:17 UTC

    Because you invoked "least squares fitting", I assume what you want is a planar least-squares fitting in 3d, analogous to the linear least-squares in 2d, and that z is supposed be be linearly dependent on the others: ax+by+c = z

    You can see the algebra in this math.stackexchange answer, but basically you are solving A*X=B, where A is a matrix of the x and y data, X are the column-vector of the coefficients a,b,c from the above equation, and B is the column-vector of the z values for each x,y input.

    I haven't looked up the exact PDL syntax, but: An exact solution for 3 sets of (x,y,z) data would have PDL akin to $coeff_X = $matrix_A->inverse * $column_B;. But since you presumably have many points, not just three, since you invoked best-fit, you have to use the "left pseudo inverse", which would have a PDL implementation akin to $coeff_X = ($matrix_A->transpose * $matrix_A)->inverse * ($matrix_A->transpose) * $column_B;

    update: I was close.

    #!/usr/bin/perl use strict; use warnings; use PDL; use PDL::Matrix; # example: 2x + 3y + 4 = z, plug in known exact values; make sure my c +oefficients end up 2,3,4 my $matrix_A = mpdl [ [1,1,1], [3,7,1], [5,1,1], [1,5,1], ]; print "A => ", $matrix_A; print "B => ", my $column_B = vpdl [9,31,17,21]; print "X => ", my $coeff_X = (($matrix_A->transpose x $matrix_A)->inv +x $matrix_A->transpose) x $column_B;


    A => [ [1 1 1] [3 7 1] [5 1 1] [1 5 1] ] B => [ [ 9] [31] [17] [21] ] X => [ [ 2] [ 3] [ 4] ]

      Excellent! Many thanks.

      I actually wish to fit a higher order polynomial, perhaps up to a cubic with cross terms, but the generalization to be made to your code should be straightforward.

      Note for posterity: the go-to module for this stuff is PDL::LinearAlgebra. The obvious solver for Ax=B would be "msolve", and the notes for that give which LAPACK function is used "under the hood". For solving many times, you'd want to do a decomposition (quite possibly an LU) first, then reuse it. Take a look!

      LAPACK is a huge and powerful library for linear-algebra stuff, and getting to know it will be useful, regardless of which language environment you use - I don't think there is a single language environment that doesn't have a binding for it.

Re: Surface fitting with PDL
by jo37 (Deacon) on Aug 09, 2020 at 18:47 UTC

    I'm not an expert regarding PDL, but it's safe to say that re-arranging 2d data into 1d is a very cheap operation. There is no data movement involved - it's just the creation of a new index to the existing data.

    Just to give an impression:

    #!/usr/bin/perl use strict; use warnings; use feature 'say'; use PDL; my $p2d = sequence(3, 3); say $p2d; my $p1d = $p2d->clump(0, 1); say $p1d; $p1d->set(4, 42); say $p2d; __DATA__ [ [0 1 2] [3 4 5] [6 7 8] ] [0 1 2 3 4 5 6 7 8] [ [ 0 1 2] [ 3 42 5] [ 6 7 8] ]