Re: Abstract image registration or feature detection
by Corion (Patriarch) on Jul 01, 2022 at 15:00 UTC
|
The first step would be to identify the algorithm suitable to your set of features.
I'm not sure what kind of input data and training data you have. Maybe Scale-invariant feature transform ("SIFT") is already enough? I have a smallish implementation in Image::CCV, but most likely once you have the algorithm, you can find Perl bindings for whatever library implements it.
| [reply] |
Re: Abstract image registration or feature detection
by pryrt (Abbot) on Jul 02, 2022 at 18:14 UTC
|
I am hoping that etj will come by and make this more idiomatic PDL. But given that I'm a PDL hack, this is what I can come up with, by taking the approximate coordinates from the image supplied and doing some manual math using PDL:
#!perl -l
use 5.012; # //, strict, say
use warnings;
use PDL;
use PDL::Matrix;
print my $red_in = (mpdl [[58,48], [108,155], [186,80], [255,191], [3
+31,48]])->transpose->append(ones(1));
print my $red_out = (mpdl [[471,15], [531,141], [603,90], [682,227], [
+747,107]])->transpose->append(ones(1));
print my $blue_in = (mpdl [[125,73], [197,158], [282,94]])->transpose-
+>append(ones(1));
print my $blue_out_for_checking = (mpdl [[542,64], [622,175], [701,138
+]])->transpose->append(ones(1));
print for
my $sx = sum( $red_in->slice(0) ),
my $sy = sum( $red_in->slice(1) ),
my $s1 = sum( $red_in->slice(2) ),
my $sxx = sum( $red_in->slice(0)**2 ),
my $syy = sum( $red_in->slice(1)**2 ),
my $sxy = sum( $red_in->slice(0)*$red_in->slice(1) ),
my $sv = sum( $red_out->slice(0) ),
my $sw = sum( $red_out->slice(1) ),
my $sxv = sum( $red_in->slice(0)*$red_out->slice(0) ),
my $sxw = sum( $red_in->slice(0)*$red_out->slice(1) ),
my $syv = sum( $red_in->slice(1)*$red_out->slice(0) ),
my $syw = sum( $red_in->slice(1)*$red_out->slice(1) ),
;
print my $G = mpdl
[
[ $sxx, $sxy, $sx ],
[ $sxy, $syy, $sy ],
[ $sx, $sy, $s1 ],
];
print my $Q = mpdl
[
[ $sxv, $sxw ],
[ $syv, $syw ],
[ $sv, $sw ],
];
print my $P = inv($G) x $Q;
print my $abcdef001 = ($P->transpose)->append(zeros(1))->set(2,2,1);
print my $blue_out_calc = $abcdef001 x $blue_in;
print $blue_out_calc - $blue_out_for_checking; # no more than one pixe
+l off
__END__
I am not a PDL expert, so I am sure there's a way with its linear alge
+bra to skip doing all the math.
But I enjoy the math derivation.
You have a from IN to OUT,
IN and OUT are of the form where each point #i is a column vector of t
+he form
[ x_i ]
[ y_i ]
So for each point, you can think of it as OUT = M * IN + T
OUT = [ a b ] . IN + [ c ]
[ d e ] [ f ]
But if you want to work with all of IN and OUT at once, rather than in
+dividually (embrace the power of the matrix),
you can pad it out into a single multiplication as
OUT = [ a b c ] [ x1 x2 x3 ... xN ] = [ v1 v2 v3 ... vN ]
[ d e f ] * [ y1 y2 y3 ... yN ] [ w1 w2 w3 ... wN ]
[ 0 0 1 ] [ 1 1 1 ... 1 ] [ 1 1 1 ... 1 ]
So if you want to figure out the expanded M, you want to best fit on O
+UT = M * IN (with the extra rows)
To figure out the best fit, you basically have the equations
v_i = a*Xi + b*Yi + c*1
w_i = d*Xi + e*Yi + f*1
Best fit is least-sum-squared-error, so
err_v_i = a*Xi + b*Yi + c*1 - Vi
err_w_i = d*Xi + e*Yi + f*1 - Wi
And the sq error
esq_Vi = a²*Xi² + b²*Yi² + c²*1 + Vi² + 2ab*Xi*Yi + 2ac*Xi - 2a*Xi
+*Vi + 2bc*Yi - 2b*Yi*Vi - 2c*Vi
esq_Wi = exact same form (I will just show derivation on the Vi te
+rms for now)
Sum up the total error:
SSEv = sum(a²*Xi² + b²*Yi² + c²*1 + Vi² + 2ab*Xi*Yi + 2ac*Xi - 2a*
+Xi*Vi + 2bc*Yi - 2b*Yi*Vi - 2c*Vi)
= sum(a²*Xi²) + sum(b²*Yi²) + sum(c²*1) + sum(Vi²) + sum(2ab*
+Xi*Yi) + sum(2ac*Xi) - sum(2a*Xi*Vi) + sum(2bc*Yi) - sum(2b*Yi*Vi) -
+sum(2c*Vi)
= a²*sum(Xi²) + b²*sum(Yi²) + c²*sum(1) + sum(Vi²) + 2ab*sum(
+Xi*Yi) + 2ac*sum(Xi) - 2a*sum(Xi*Vi) + 2bc*sum(Yi) - 2b*sum(Yi*Vi) -
+2c*sum(Vi)
Need to find the minimum with respect to each of a, b, c (and equivale
+ntly d,e,f) by setting each partial derivative to 0
dSSEv/da = 2a*sum(Xi²) + 0 + 0 + 0 + 2b*sum(Xi*Yi) + 2c*sum(Xi) -
+2*sum(Xi*Vi) + 0 - 0 - 0 = 0
dSSEv/db = 0 + 2b*sum(Yi²) + 0 + 0 + 2a*sum(Xi*Yi) + 0 - 0 + 2c*su
+m(Yi) - 2*sum(Yi*Vi) - 0 = 0
dSSEv/dc = 0 + 0 + 2c*sum(1) + 0 + 0 + 2a*sum(Xi) - 0 + 2b*sum(Yi)
+ - 0 - 2*sum(Vi) = 0
Divide by two and rearrange those three, grouping into a,b,c on one si
+de and coefficientless on the other
dSSEv/da: a*sum(Xi²) + b*sum(Xi*Yi) + c*sum(Xi) = sum(Xi*Vi)
dSSEv/db: a*sum(Xi*Yi) + b*sum(Yi²) + c*sum(Yi) = sum(Yi*Vi)
dSSEv/dc: a*sum(Xi) + b*sum(Yi) + c*sum(1) = sum(Vi)
But that is a matrix multiplication:
[ sum(Xi²) sum(Xi*Yi) sum(Xi) ] [ a ] = [ sum(Xi*Vi)
+ ]
[ sum(Xi*Yi) sum(Yi²) sum(Yi) ] [ b ] = [ sum(Yi*Vi)
+ ]
[ sum(Xi) sum(Yi) sum(1) ] [ c ] = [ sum(Vi)
+ ]
And similarly from the esq_Wi, you get:
[ sum(Xi²) sum(Xi*Yi) sum(Xi) ] [ d ] = [ sum(Xi*Wi)
+ ]
[ sum(Xi*Yi) sum(Yi²) sum(Yi) ] [ e ] = [ sum(Yi*Wi)
+ ]
[ sum(Xi) sum(Yi) sum(1) ] [ f ] = [ sum(Wi)
+ ]
Note that the matrix on the left is the same for both. So we can merg
+e those two into a single matrix multiplication
[ sum(Xi²) sum(Xi*Yi) sum(Xi) ] [ a d ] = [ sum(Xi*V
+i) sum(Xi*Wi) ]
[ sum(Xi*Yi) sum(Yi²) sum(Yi) ] [ b e ] = [ sum(Yi*V
+i) sum(Yi*Wi) ]
[ sum(Xi) sum(Yi) sum(1) ] [ c f ] = [ sum(Vi)
+ sum(Wi) ]
If we call those G x P = Q, then we can solve for the parameter matrix
+ P by P = inv(G) x Q
But you can get the original abcdef001 matrix by transposing P and add
+ing a row of (001)
Then to find the blue outputs, just do
BLUE_OUT = abcdef001 * BLUE_IN
Edit: when I said " # no more than one pixel off" , that was not meant as a guarantee; just that this example was only approximately 1 pixel from the supplied locations
| [reply] [d/l] [select] |
|
|
Ironically given your kind words, and as proved by some emails just today on the PDL mailing lists, I am not at all a linear-algebra expert. However, if you're trying to solve linear least-squares systems, then check out mlls (using QR or LQ factorisation) and mllss (using SVD) in PDL::LinearAlgebra. If you want to try Levenberg-Marquardt, check out PDL::Fit::Levmar.
One other observation for more idiomatic PDL usage is that if you're taking several slices (e.g. your sx, sy, s1) and doing the same to each, it'll be quicker to do something like:
my ($sx, $sy, $s1) = $red_in->slice('0:2')->mv(0,-1)->sumover->dog;
| [reply] [d/l] [select] |
|
|
| [reply] |
|
|
Weird, perl 5.30.0 with PDL 2.019 (which is the combo that shipped with Strawberry Perl's PDL edition) doesn't give that error.
These are the outputs for my run (adding a print "perl version $]";print "PDL version $PDL::VERSION"; at the beginning to print the versions):
Mathematically speaking, a [3x3] x [3x2] is the right balance of dimensions for a successful matrix multiplication (resulting in a [3x2] matrix). The use of PDL::Matrix used to be the way to make PDL match mathematical order for the dimensions, but maybe they've changed that between 2.019 and 2.057. (I don't think I can test, because the Strawberry Perl PDL setup is rather dependent on the libraries that Strawberry bundles, and I don't know if upgrading PDL will work... maybe I'll try in a sandbox to avoid ruining my "main" installation).
I tried getting rid of the use PDL::Matrix, but then that gets rid of the mpdl command, causing errors during perl's compile stage, and I'm not sure what hoops I would need to jump through to make my script compatible with both 2.019 and 2.057.
| [reply] [d/l] [select] |
|
|
|
|
|
|
|
|
|
|
Re: Abstract image registration or feature detection
by vr (Curate) on Jul 03, 2022 at 23:39 UTC
|
Task as stated looks pretty much as case for interpolation to me. One trick pony, etc., but I even found my answer (Re: which function in PDL can do the same thing as matlab pcolor?) through SS for "Delaunay", see links in last paragraph there, especially ugly formulae for barycentric weights (didn't find ready-made solution on CPAN). Code is adjusted version of what I did a few years ago for some image hacking, unrelated to feature recognition.
No idea what data pryrt used for a sample (same "red-blue" terms used here), and whether they are supposed to represent a trivial demo, or, indeed, an "interesting" non-affine distortion was applied. Anyway, results below seem to match well with his, even if it's just a hack -- X and Y of target are treated as separate functions to interpolate, with, furthermore, simple linear interpolation.
use strict;
use warnings;
use feature qw/ say /;
use POSIX qw/ round /; # use 5.022;
use List::Util qw/ sum /;
use Math::Geometry::Delaunay qw/ TRI_CONSTRAINED /;
use constant EPSILON_NEGATIVE => -1e-6;
sub key { pack 'II', @{ $_[ 0 ]}}
my @red_in = ( [58,48], [108,155], [186,80], [255,191], [331,48] );
my @red_out = ( [471,15], [531,141], [603,90], [682,227], [747,107] );
my %mapped = map { key( $red_in[ $_ ]), $red_out[ $_ ]} 0 .. $#red_in
+;
my $tri = Math::Geometry::Delaunay-> new( TRI_CONSTRAINED );
$tri-> addPoints( \@red_in );
$tri-> triangulate;
my @blue_in = ( [125,73], [197,158], [282,94] );
BLUE: for my $blue ( @blue_in ) {
TRI: for my $elem ( @{ $tri-> elements }) {
my $y23 = $elem-> [ 1 ][ 1 ] -
$elem-> [ 2 ][ 1 ];
my $x32 = $elem-> [ 2 ][ 0 ] -
$elem-> [ 1 ][ 0 ];
my $x13 = $elem-> [ 0 ][ 0 ] -
$elem-> [ 2 ][ 0 ];
my $y13 = $elem-> [ 0 ][ 1 ] -
$elem-> [ 2 ][ 1 ];
my $denominator = $y23 * $x13 + $x32 * $y13;
my $xx3 = $blue-> [ 0 ] - $elem-> [ 2 ][ 0 ];
my $yy3 = $blue-> [ 1 ] - $elem-> [ 2 ][ 1 ];
my @weights;
next TRI if EPSILON_NEGATIVE > ( $weights[ 0 ] =
( $y23 * $xx3 + $x32 * $yy3 ) / $denominator );
next TRI if EPSILON_NEGATIVE > ( $weights[ 1 ] =
( -$y13 * $xx3 + $x13 * $yy3 ) / $denominator );
next TRI if EPSILON_NEGATIVE > ( $weights[ 2 ] =
1 - $weights[ 0 ] - $weights[ 1 ]);
printf "[%3d,%3d] => [%3d,%3d]\n",
@$blue,
map {
my $coord = $_;
round sum map {
$weights[ $_ ] * $mapped{ key $elem-> [ $_ ]}[ $co
+ord ]
} 0 .. 2 # 3 vertices
} 0, 1; # x, y
next BLUE
}
die "point @$blue outside convex hull"
}
__END__
[125, 73] => [541, 63]
[197,158] => [621,174]
[282, 94] => [701,137]
| [reply] [d/l] |
|
|
| [reply] |
|
|
| [reply] |
Re: Abstract image registration or feature detection
by LanX (Saint) on Jul 01, 2022 at 15:07 UTC
|
> What I want is to determine the coordinates of these "target" points, based on the transformation determined by the corresponding "known" points.
could you please elaborate? Do you know the corresponding point pairs (known_i,target_i)?
There are many "transformations", I remember seeing a projection model based on a 4x4 matrix in uni.
Determining the entries would mean to solve the resulting linear equations.
Again it depends on the class of allowed transformations.
| [reply] |
|
|
> could you please elaborate? Do you know the corresponding point pairs (known_i,target_i)?
Each known_i and target_i point represents an unique named feature of the original image, and their locations (thus, the vectors between any two known_i and target_j point) are fixed on the reference image.
> There are many "transformations", I remember seeing a projection model based on a 4x4 matrix in uni.
The list of possible transformations include translation, rotation, skewing (so all affine transformations) plus barrel distortion, lens distortion etc. Think of a reference version of a poster in an image authoring software vs. the same poster photographed with a potato camera, not necessarily from a head-on orientation.
| [reply] [d/l] [select] |
|
|
> the vectors between any two known_i and target_j point
I'm confused. Did you answer if you know which target point belongs to which source point or if you need to determine those "vectors"?
> Think of a reference version of a poster in an image authoring software vs. the same poster photographed with a potato camera, not necessarily from a head-on orientation.
Please take this for a start Camera Matrix
It's a 3x4 matrix I remembered 4x4, so there might be more.
| [reply] |
|
|
|
|
|
|
translation, rotation, skewing (so all affine transformations)
You'll probably have more points than parameters in the affine transformation matrix, making this an overdetermined linear system. This is good: you can solve it in the least squares sense and handle a bit of the noise in the coordinates this way.
plus barrel distortion, lens distortion
This could be harder, because it's definitely not linear. Panorama stitching software like Hugin can let you optimise the radial distortion parameters at the same time as optimising the affine transform matrices; maybe you could look at its source code for some inspiration. Wikipedia lists a formula for a generalised distortion model, which you could fit the parameters for using Levenberg-Marquardt.
| [reply] |
Re: Abstract image registration or feature detection
by Anonymous Monk on Jul 01, 2022 at 15:32 UTC
|
The points themselves belong to two subsets: for the first subset, let's call them "known" points, I know the coordinates from both images
Use regression to fit the parameters of the function transforming your points from the first subset to the second subset. Depending on the form of the distorsion, it could be multivariate linear regression, nonlinear regression you would need to provide a function (and starting values) for, or some sort of data-driven approach (P-splines, PLS, support vector machines with nonlinear basis functions, XGBoost...) that doesn't need an explicit function form and uses its many degrees of freedom to fit arbitrary function shapes. The PDL module family probably contains enough Levenberg-Marquardt and linear algebra to get you started.
and for the second subset, "target" points, I know their coordinates only from the reference image.
Use the parameters obtained above to perform the transformation.
If you didn't know the corresponding points in both images, the inner part of the problem would stay the same, and the outer part of the problem would use Iterative closest point to try to estimate and update those correspondences between points.
| [reply] |
Re: Abstract image registration or feature detection
by tybalt89 (Monsignor) on Jul 02, 2022 at 00:07 UTC
|
How about providing us with a test data set or two...
( I may or may not be working or playing with something which may or may not work. (Hope i qualified that enough :) )
| [reply] |
|
|
| [reply] |
Re: Abstract image registration or feature detection
by perlfan (Parson) on Jul 03, 2022 at 07:22 UTC
|
Image::PHash looks really great for duplicate and mirrored image detection. The best I've seen so far is unfortunately not accessible via Perl. YOLO/darknet is pretty neat, but the rub comes down to creating training data. I found that "roboflow dot com" makes it about as straightforward as one can get. Problem is, it's very expensive. Maybe with enough experience with it there can be a "ready made" Perl solution. Alien::OpenCV exists but I don't think that's gonna get you what you want. YMMV. | [reply] |
Re: Abstract image registration or feature detection [UPDATED w examples]
by kikuchiyo (Hermit) on Jul 05, 2022 at 16:07 UTC
|
It turns out that PDL has a set of routines for precisely this task: fitwarp2d, applywarp2d.
The results are not that great, but it works out of the box.
| [reply] [d/l] |
|
|
| [reply] |
|
|
How are you measuring "results are not that great"
Not very scientifically, I've just looked at the largest difference in the coordinates.
I've also plotted the results: https://imgur.com/a/W7j3yC9
what are the results against the two test cases you have posted
target_41:
339 508 345.92214 512.55747
314 518 318.48381 516.9291
265 533 274.68006 530.37545
245 537 248.82152 537.19123
228 544 227.20604 544.26418
target_996:
207 550 213.15752 551.70605
206 569 209.79236 571.16027
205 602 208.69494 603.29318
203 621 207.30388 622.07696
201 638 207.08208 637.97918
Perhaps "not that great" was not that great a description; this algorithm in fact produces comparable results to those in other replies in the thread.
Note that I've messed up the direction of the transformation in my reply, the lines at the end should be my ($px, $py) = fitwarp2d($red_out_x, $red_out_y, $red_in_x, $red_in_y
+, 2);
my ($blue_new_x, $blue_new_y) = applywarp2d($px, $py, $blue_in_x, $blu
+e_in_y);
#say for $blue_out_x, $blue_out_y, $blue_new_x, $blue_new_y, $blue_out
+_x - $blue_new_x, $blue_out_y - $blue_new_y;
| [reply] [d/l] [select] |