I'm on a very simplified pattern recognition quest!

First, I thought I would post here. I'm not really expecting any perfect answers, but I'm looking to see if someone else is interested in the same problem, has solutions, has links to information about solutions, etc.

Basic problem: find the sub-image-map in the image-map. The largest scaled version of the sub-image-map found should be the final answer.

Image-Map
0000000000000000000000
0000000001111100000000
0000000001111100000000
0000000001111100000000
0000000001111100000000
0000011111111111110000
0000011111111111110000
0000011111111111110000
0000011111111111110000
0000000001111100000000
0000000001111100000000
0000000001111100000000
0000000001111100000000
0000000000000000000000

Sub-Image-Map
010
111
010

A couple of ideas I've considered and will try to implement: first, break the image-map down into 2x2 blocks, then 3x3, 4x4, etc., and compare each of those grids to the sub-image-map. This definitely counts as brute force and would require tons of passes to get any results. My second idea is to start with the sub-image-map and slowly increase its size (scanning it over the image-map each time) until I get a match. This has the same problems as the first idea, except that you don't lose any detail in the original, so matches that wouldn't fall neatly into 4x4 blocks (or 2x2, 3x3, etc.) could still be found. Both of these seem brutish, and I'm sure others out there have ideas on how to match them in a better fashion. A rough sketch of the second idea follows below.
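
To make that second idea concrete, here is a rough, untested sketch; the sub names (scale_up, matches_at, largest_match) are just placeholders I made up, and both maps are assumed to be array refs of '0'/'1' strings:

use strict;
use warnings;

# Blow each cell of the sub-image-map up into an s-by-s block.
sub scale_up {
    my ($map, $s) = @_;
    return [ map { my $row = join '', map { $_ x $s } split //, $_;
                   ($row) x $s } @$map ];
}

# Exact cell-by-cell compare of the template against the image at (row, col).
sub matches_at {
    my ($image, $tmpl, $r, $c) = @_;
    for my $i (0 .. $#$tmpl) {
        return 0 if substr($image->[$r + $i], $c, length($tmpl->[$i])) ne $tmpl->[$i];
    }
    return 1;
}

# Try every scale and every position; the largest matching scale wins
# because later (larger) matches overwrite earlier ones.
sub largest_match {
    my ($image, $sub) = @_;
    my ($h, $w) = (scalar @$image, length($image->[0]));
    my $best;
    for (my $s = 1; @$sub * $s <= $h && length($sub->[0]) * $s <= $w; $s++) {
        my $tmpl = scale_up($sub, $s);
        for my $r (0 .. $h - @$tmpl) {
            for my $c (0 .. $w - length($tmpl->[0])) {
                $best = { scale => $s, row => $r, col => $c }
                    if matches_at($image, $tmpl, $r, $c);
            }
        }
    }
    return $best;
}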

So how did I get to this quest? I want to catalog my existing collection of pictures. Yes, I know it's hard, maybe even impossible to do well. My goal is to learn and maybe explore some non-traditional methods ;) My first idea is to let the user highlight "significant" portions of an image and tag them with a name. For instance, I might outline my face and tag it "eric". I then want to store that tag and scan all other images for that sub-image (for lack of a better term). Of course a straight compare won't work well because of variations in the photos' size, color, and background.

So the first step would be to normalize the photos down to contain less information. A couple of ideas for that include histograms, density maps, edge detection, etc. (a toy sketch of the density-map idea follows below). Once I have normalized the images on file, and the sub-image, it's STILL a matter of searching for the sub-image-map in the whole list of image-maps, which brings us to my post ;) Given the image-map and sub-image-map above, can anyone devise an algorithm that isn't brute force? The examples here are obviously very simple and nowhere near realistic, but you have to start somewhere, and the easiest case is the place to start.
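
Here is that toy sketch of the density-map idea, just to show the kind of reduction I mean: average the pixels over an NxN grid of blocks and threshold the result, so any photo collapses to a small 0/1 map like the ones above. It's untested, and it assumes the pixel data is already an array-of-arrays of grayscale values 0..255:

use strict;
use warnings;

sub density_map {
    my ($pixels, $blocks, $threshold) = @_;   # e.g. $blocks = 16, $threshold = 128
    my ($h, $w) = (scalar @$pixels, scalar @{ $pixels->[0] });
    my @map;
    for my $by (0 .. $blocks - 1) {
        my $row = '';
        for my $bx (0 .. $blocks - 1) {
            my ($sum, $n) = (0, 0);
            for my $y (int($by * $h / $blocks) .. int(($by + 1) * $h / $blocks) - 1) {
                for my $x (int($bx * $w / $blocks) .. int(($bx + 1) * $w / $blocks) - 1) {
                    $sum += $pixels->[$y][$x];
                    $n++;
                }
            }
            # a block becomes '1' if its average brightness clears the threshold
            $row .= ($n && $sum / $n >= $threshold) ? '1' : '0';
        }
        push @map, $row;
    }
    return \@map;
}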


___________
Eric Hodges

Re: Pattern Recognition Quest
by blokhead (Monsignor) on Jan 04, 2006 at 23:52 UTC
    If you want more advanced pattern recognition, this is not it. But here is a really basic concept that will handle all simple rectangular shapes, including your example.
    • "squish" all consecutive identical rows into only one row.
    • Do the same for columns
    In code, it looks like this:
my $IMAGE = <<END;
0000000000000000000000
0000000001111100000000
0000000001111100000000
0000000001111100000000
0000000001111100000000
0000011111111111110000
0000011111111111110000
0000011111111111110000
0000011111111111110000
0000000001111100000000
0000000001111100000000
0000000001111100000000
0000000001111100000000
0000000000000000000000
END

sub transpose {
    my @matrix = map [ split // ], split /\n/, shift;
    my $dim = $#{ $matrix[0] };
    @matrix = map { my $x = $_; [ map { $matrix[$_][$x] } 0 .. $#matrix ] } 0 .. $dim;
    return join "", map { join("", @$_) . "\n" } @matrix;
}

sub squish {
    my $str = shift;
    $str =~ s/^(.*\n)\1+/$1/gm;
    return $str;
}

print transpose squish transpose squish $IMAGE;

__OUTPUT__
00000
00100
01110
00100
00000
    I just threw this together; you could certainly make it more elegant and more resilient to stray newlines in the data. But hopefully you get the idea. I chose to represent the image matrix as just a flat string because it makes squish simpler, but transpose would be a little more elegant with a 2-dimensional array instead of a string.
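
    For what it's worth, here is a sketch of the same pipeline with the matrix kept as an array-of-arrays the whole time (untested, and it just reuses the $IMAGE string from above), in case that reads better than the string version:

sub transpose_aoa {
    my $m = shift;
    # column $c of the input becomes row $c of the output
    return [ map { my $c = $_; [ map { $_->[$c] } @$m ] } 0 .. $#{ $m->[0] } ];
}

sub squish_aoa {
    my $m = shift;
    my @out;
    for my $row (@$m) {
        # keep a row only if it differs from the last row we kept
        push @out, [ @$row ]
            unless @out && join('', @$row) eq join('', @{ $out[-1] });
    }
    return \@out;
}

my $matrix = [ map [ split // ], split /\n/, $IMAGE ];
$matrix = transpose_aoa( squish_aoa( transpose_aoa( squish_aoa($matrix) ) ) );
print join('', @$_), "\n" for @$matrix;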

    To match the exact output you wanted, you would also have to trim all leading and trailing all-zero rows & columns.

    blokhead

Re: Pattern Recognition Quest
by diotalevi (Canon) on Jan 05, 2006 at 00:15 UTC

    This doesn't answer your question at all but it's still kind of neat. It transforms your example input into the example output by removing repeated lines, vertically and horizontally. It also removes "empty" lines from the edges.

$image = <<IMAGE;
0000000000000000000000
0000000001111100000000
0000000001111100000000
0000000001111100000000
0000000001111100000000
0000011111111111110000
0000011111111111110000
0000011111111111110000
0000011111111111110000
0000000001111100000000
0000000001111100000000
0000000001111100000000
0000000001111100000000
0000000000000000000000
IMAGE

$result = transpose( trim( vertical_squish( transpose( trim( vertical_squish( $image ) ) ) ) ) );

$result eq <<EXPECT or die "Failed.\n$result\n";
010
111
010
EXPECT

sub trim {
    local $_ = shift;
    s/\A(?:^0+[\r\n]+)+//m;
    s/(?:^0+[\r\n]+)+\z//m;
    return $_;
}

sub vertical_squish {
    local $_ = shift;
    s/(^[^\r\n]+[\r\n]+)\1+/$1/gm;
    return $_;
}

use Algorithm::Loops 'MapCarU';

sub transpose {
    local $_ = shift;
    return join( "",
        &MapCarU( sub { join( "", @_ ) . "\n" },
                  map( [ split // ], split( /[\r\n]+/ ) ) ) );
}

    ⠤⠤ ⠙⠊⠕⠞⠁⠇⠑⠧⠊

Re: Pattern Recognition Quest
by john_oshea (Priest) on Jan 04, 2006 at 22:15 UTC

    Found via Googling "fractal image comparison" - a paper from the Journal of Information Science and Engineering:

    ... However, when users obtain an image from a newspaper or magazine, they want to retrieve related information about this type of image through a search engine, but traditional search engines are incapable of meeting their needs. To help solve this problem, this paper provides an Image Comparison Search Engine (ICSE), and makes use of "Fractal Image Processing" to create a database using image eigenvalues. When users input an image query, this system will generate image eigenvalue data, compare this with the data in the database of image eigenvalues, and output the results. Based on our experimental results, ICSE can not only find the exact input image for the source image, but also find the "right image" when the source image is rotated and misty, and contains impurities...

    Disclaimer: this is so not my field, so I've no idea how good / feasible this may be for you.

Re: Pattern Recognition Quest
by swampyankee (Parson) on Jan 04, 2006 at 21:55 UTC

    There's quite a lot of technical literature on pattern recognition (a search for "pattern recognition" on ACM's website gave me about 4500 hits). I know you can google at least as well as I can, so I'm not going to just give you some random links.

    I think that MIT's AI lab has done quite a bit of work on pattern recognition. Looking there would probably be a good place to start (and some of their papers are probably classics).

    It may be interesting (and challenging) to try to detect steganography in images, too.

    emc

    " When in doubt, use brute force." — Ken Thompson

      Yes, I can google. There is an abundant amount of information, and I should have mentioned that. I was more wondering whether someone else was interested in this, and perhaps whether anyone had some "real world" experience/ideas to share. I did google for many things, but Google is often a bane in that it returns tons of useless stuff too. BTW, what is ACM? And if you've done anything with pattern recognition, do you have specific links that are good? I'm off to try and find the MIT AI papers now. Thanks.


      ___________
      Eric Hodges
Re: Pattern Recognition Quest
by educated_foo (Vicar) on Jan 05, 2006 at 09:07 UTC
    I don't have time to code this up, but I would probably approach it by first writing a routine to scan the small image across the large one looking for matches. I would then incrementally scale down the large image and match against the template. That way you don't have to worry about round-off errors that would come from scaling the small image up.
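
    Something like this could do the shrinking step (an untested sketch; shrink() is just a name made up here, and it downscales an array ref of '0'/'1' rows by majority vote over s-by-s blocks, so the existing exact scan can be re-run on each shrunken copy):

sub shrink {
    my ($map, $s) = @_;
    my @out;
    for (my $y = 0; $y + $s <= @$map; $y += $s) {
        my $row = '';
        for (my $x = 0; $x + $s <= length($map->[0]); $x += $s) {
            my $ones = 0;
            for my $dy (0 .. $s - 1) {
                my $chunk = substr($map->[$y + $dy], $x, $s);
                $ones += ($chunk =~ tr/1//);              # count the '1' cells in this slice
            }
            $row .= ($ones * 2 >= $s * $s) ? '1' : '0';   # majority vote per block
        }
        push @out, $row;
    }
    return \@out;
}
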
Re: Pattern Recognition Quest
by smokemachine (Hermit) on Jan 05, 2006 at 18:43 UTC
    I made a Neural Net module (not documented yet) which successfully recognized letters like:

    01110
    10001
    10001
    11111
    10001
    10001

    With this code I recognized all the vowels, in 3 different forms, after training on just 1... sorry about my English... and if you want to use the module, just ask...
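
    The module itself isn't named or documented above, so purely as a generic illustration of the idea, here is a bare-bones nearest-prototype matcher over the flattened bitmap (a real neural net would learn weights instead of storing the training bitmaps directly):

use strict;
use warnings;

# One stored prototype per letter, each flattened to a single 0/1 string.
my %proto = (
    A => join('', qw(01110 10001 10001 11111 10001 10001)),
    # ... add one entry per letter you trained on
);

sub classify {
    my $flat = join '', @{ $_[0] };               # bitmap rows -> one flat string
    my ($best, $best_dist);
    for my $letter (sort keys %proto) {
        my $xor  = $flat ^ $proto{$letter};       # differing cells become non-NUL bytes
        my $dist = $xor =~ tr/\0//c;              # Hamming distance
        ($best, $best_dist) = ($letter, $dist)
            if !defined $best_dist || $dist < $best_dist;
    }
    return $best;
}

print classify([qw(01110 10001 10001 11111 10001 10001)]), "\n";   # prints "A"
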
Re: Pattern Recognition Quest
by bageler (Hermit) on Jan 05, 2006 at 17:42 UTC
    I've worked on some similar things, except in the context of ribeye steaks. A GA was used to learn what a ribeye generally looks like. Of course, this took several hundred pictures to train the GA to begin with, so I'm not sure how well it would work for you.