glwtta has asked for the wisdom of the Perl Monks concerning the following question:

Guidance needed in all things image manipulation,

I have a jpg image of a microarray slide. In other words, roughly a 2x4.5K pixel image with rows of colored spots on it. Here's a link to an example, if you've never seen one; it's not quite what I am working with, but the idea is the same.

What I need to do is, given a set of coordinates (the spots are arranged into blocks and the blocks have rows and columns), get that one spot as a 16x16 pixel image.

Simple, right? Except that the slides are never quite aligned properly, and the width tends to differ by a few pixels. In other words, the spot at 1,1 may be 5 pixels away from the left edge, and the last spot in the same column may be 10 pixels farther away. They have one feature to help with this: in each corner there are a few control spots (the number of spots identifies which corner it is, but the corner spot itself is there for all four), meaning they always come up as bright green.

So, is there a module out there that will help me crop and rotate the image so that each of the four corner spots has two sides exactly at the edges of the image, and all the spots along each side are precisely aligned?

My graphics manipulation experience with Perl is extremely limited. I use GD.pm quite a lot, but only to create simple charts and such, so I don't know where to begin here.

So this is really a two-part question. First, what do I do to grab a specific region of a larger image? I can always just grab a somewhat larger area, so that I get the spot I am looking for somewhere near the middle, if a bit off-center. Second, what, if anything, can I use to do the alignment described above? At this point I am more curious about how to do The Right Thing; if it's more trouble than it's worth, I'll just use the sloppier method described above.

Thanks,

Replies are listed 'Best First'.
Re: image clean up and alignment
by BrowserUk (Patriarch) on Apr 16, 2003 at 04:07 UTC

    There are several potential problems both with what you are trying to do, and with trying to do it with perl.

    The first is that you mentioned that the slides are in .JPG form (although the linked example was a .GIF). Unless the .JPG was saved without compression, you have already either lost information, or muddied the information that is there, as .JPG compression is 'lossy'. This means that some of the information present in the original data has been discarded in order to reduce the range of values present and aid the compression ratio. This shows up in a close inspection of the sample slide in several ways, the most fundamental of which is that the borders around the spots are not a single consistent colour, but instead contain blotches of different shades. This presents a fundamental problem with edge detection. If you then attempt to apply image rotation to correct the skew, you are going to further degrade the integrity of the image.

    Assume that the spots are 16 pixels square, the inter-spot borders are 5 pixels, and the outer frame and inter-block borders are 10 pixels. If the slides are two blocks deep, with 20x20 spots per block as in the linked example, and the run-out from top to bottom is 10 pixels, then the angle of rotation required to correct this would be less than 1 degree! If one edge is longer than two blocks, then the angle gets progressively smaller, and the problem that much harder. Apart from the fact that I do not know of any low-end image manipulation libraries or packages that will perform rotations of fractions of a degree, the process of rotation would further distort the edges between the borders and the spots. If the slides are available in a non-lossy or uncompressed form, then they would make a much better starting point, but even then, using the generic image manipulation of something like ImageMagick is likely to result in considerable loss of information.
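
    The arithmetic behind that estimate can be checked with a few lines of core Perl. This is only a sketch: every dimension is the assumed value from the paragraph above, and the helper names block_extent and skew_degrees are made up for the illustration.

```perl
use strict;
use warnings;

# Geometry as assumed in the post (none of these are measured values).
my $spot_px       = 16;   # edge length of one spot
my $gap_px        = 5;    # border between neighbouring spots
my $frame_px      = 10;   # outer frame and inter-block border
my $spots_per_blk = 20;   # spots along one edge of a block
my $blocks_deep   = 2;    # blocks stacked vertically
my $runout_px     = 10;   # horizontal drift between top and bottom

# Pixel extent of one block: the spots plus the gaps between them.
sub block_extent {
    my ($spots, $spot, $gap) = @_;
    return $spots * $spot + ($spots - 1) * $gap;
}

# Skew angle in degrees for a given drift over a given height.
sub skew_degrees {
    my ($runout, $height) = @_;
    my $pi = 4 * atan2(1, 1);
    return atan2($runout, $height) * 180 / $pi;
}

# Two blocks, plus a frame above, between, and below them.
my $height = $blocks_deep * block_extent($spots_per_blk, $spot_px, $gap_px)
           + ($blocks_deep + 1) * $frame_px;

printf "height = %d px, skew = %.2f degrees\n",
    $height, skew_degrees($runout_px, $height);
```

    With these numbers the slide is 860 pixels tall and the skew comes out at roughly two thirds of a degree, which is consistent with the "less than 1 degree" figure.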

    If you could say that the frame was definitively black and that spots were not black, then scanning first horizontally to determine the difference in the width of the left-hand border, and doing a little math to determine how many pixels to pad or trim that edge of each row, is reasonably trivial. You then apply the same technique processing the image vertically, again trimming or padding one edge to realign things vertically, and you should correct the skew. From what I saw of the linked slide, the problem is that there is no definitive color for the borders, as I mentioned earlier, so edge detection then becomes a process of determining a threshold. Anything below this value is black and therefore border; anything above is color and therefore spot. Again, looking at the linked frame, some spots have no color at all and so are indistinguishable from border. In some places there are blotches in the border that are brighter than the center of some of the spots. You might be tempted to try to use contrast enhancement or spot removal to clean up the borders, but until you have detected them, you can only apply the algorithms to the whole slide and thereby affect the color of the spots as well. I assume that this would fundamentally affect the nature of the expressions you are trying to detect and categorise?
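
    As an illustration of the horizontal pass, here is a minimal sketch in core Perl. It assumes the image has already been decoded into rows of greyscale brightness values (an array of array references); the helper names left_margin and realign_row, the threshold of 20, and the tiny test image are all hypothetical.

```perl
use strict;
use warnings;

# Index of the first pixel brighter than $threshold, or undef when the
# whole row counts as border.
sub left_margin {
    my ($row, $threshold) = @_;
    for my $x (0 .. $#$row) {
        return $x if $row->[$x] > $threshold;
    }
    return;
}

# Shift a row by whole pixels so its content starts at column $target.
# The row keeps its length: what is trimmed on one side is padded with
# $fill on the other.
sub realign_row {
    my ($row, $margin, $target, $fill) = @_;
    my @out   = @$row;
    my $delta = $target - $margin;
    if ($delta > 0) {                 # content too far left: pad the left edge
        unshift @out, ($fill) x $delta;
        splice @out, scalar @$row;    # drop the overflow on the right
    }
    elsif ($delta < 0) {              # content too far right: trim the left edge
        my $n = -$delta;
        splice @out, 0, $n;
        push @out, ($fill) x $n;
    }
    return \@out;
}

# Made-up 3-row image: dark border (values <= 20) and bright spots whose
# left edge drifts one pixel to the right per row.
my @image = (
    [ 5, 8, 200, 210,   6,   4 ],
    [ 7, 3,  12, 190, 205,   9 ],
    [ 4, 9,   6,  10, 220, 215 ],
);
my ($threshold, $target) = (20, 2);

for my $row (@image) {
    my $margin = left_margin($row, $threshold);
    next unless defined $margin;      # all-border row: leave it alone
    $row = realign_row($row, $margin, $target, 0);
}
print join(' ', @$_), "\n" for @image;
```

    The sketch trusts every row's first bright pixel, so it inherits exactly the weaknesses described above: a dark spot or a bright blotch in the border would give a misleading margin. On real data the per-row margins would probably need smoothing, for example by fitting a straight line through them.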

    I guess the upshot of what I am saying is that whilst it would be possible to manually isolate the border using threshold filters, construct a mask of the borders, apply this back to the original image and then extract the spots using general purpose photoimaging techniques, trying to automate the process using those techniques is going to be extremely difficult--requiring many passes and stepwise refinements applied to each image--if not impossible.

    To stand any realistic chance of automating this without altering the nature of the data itself would require the use (or construction) of a library of much lower level, and more highly tailored, filters than are generally available in photoimaging libraries and packages.

    It would also require specific knowledge of the nature of the information that you need to derive from the spots. E.g., it would be somewhat simpler if you only needed to determine the absolute color of the 'brightest' pixel in each spot than if you needed to determine an average (mean or median) of each spot, or the relative density of the colors in the spots.

    The other problem with trying to do this using Perl, unless you can find/obtain an existing library with the required facilities that is written in C, FORTRAN or similar and has a Perl-callable interface, is that processing large, 2-dimensional arrays of numeric data is just about the weakest aspect of Perl. The very nature of Perl's dynamic array structures actively works against manipulating data that is essentially static in nature. You can drop into Inline::C or XS, but unless you manipulate the data in packed scalars (or blocks of memory allocated at the C level)--in which case you would probably be better off using C for the entire process--all the pointer chasing that makes Perl's arrays such a joy for most uses completely works against you in this case.

    If no other monks come along with better options than those I've mentioned and I haven't completely put you off, I'd love to see the answers to the questions I've posed above. I did do some playing around with writing a module to allow direct manipulation of packed image data (in .BMP format) which I would gladly pass along if you think it would be helpful. It doesn't go very far but it might help.

    Good luck.


    Examine what is said, not who speaks.
    1) When a distinguished but elderly scientist states that something is possible, he is almost certainly right. When he states that something is impossible, he is very probably wrong.
    2) The only way of discovering the limits of the possible is to venture a little way past them into the impossible
    3) Any sufficiently advanced technology is indistinguishable from magic.
    Arthur C. Clarke.
      First off, thank you for the very extensive answer, your help is much appreciated.

      Now to clear up a few things. First, the sample image was only provided so that those who have never seen a microarray slide could better understand what I mean by "slide", "spot", etc. The images I am working with are indeed jpegs, but they are much larger and actually quite a bit cleaner (I wish I had a place to upload an actual sample... I'll see if I can find a way).

      Most importantly - I am not doing any sort of analysis on the spots themselves. The analysis is done on the original TIFF files (which are much too large to reasonably store for this application, which is why I am using the jpegs; I am not sure how compressed they are, but they are still of very high quality) using software written by many people much smarter than I over the course of many years :) All I need is to get the spot given the coordinates - it's just another visual clue for the user in the final report, so the spot colour doesn't need to be anywhere near as precise as what is used for the actual analysis.

      It had not in fact occurred to me that, rather than rotating the actual image, it would be much easier to determine the distances to the border for each row/column and adjust the cropping accordingly. Most likely this will be what I end up doing. In fact, just doing this for each of the four corner spots, which have a very high contrast with the background for exactly this purpose, should be sufficient to figure out the rest of them.

      Incidentally, speed and memory requirements are not an issue at all here - I would most likely do this once on the image to pick out all the spots and store them individually, and we are only talking about a few hundred of these over the course of several years.
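
      One way to turn the four corner positions into crop coordinates for any other spot is plain bilinear interpolation. The following is a rough sketch under two assumptions: the corner spots have already been located (the coordinates below are invented), and the spots are evenly spaced across the grid. On a slide with wider inter-block borders it would have to be applied per block. The name spot_origin is hypothetical.

```perl
use strict;
use warnings;

# Estimate the top-left pixel of the spot at ($row, $col) in a grid of
# $rows x $cols spots, given the top-left pixel of each corner spot.
# Corners are [x, y] pairs keyed tl, tr, bl, br; rows and columns count from 0.
sub spot_origin {
    my ($corners, $rows, $cols, $row, $col) = @_;
    my $u = $cols > 1 ? $col / ($cols - 1) : 0;   # 0 at left edge, 1 at right
    my $v = $rows > 1 ? $row / ($rows - 1) : 0;   # 0 at top edge, 1 at bottom
    my @xy;
    for my $i (0, 1) {                            # 0 = x coordinate, 1 = y
        my $top    = $corners->{tl}[$i] * (1 - $u) + $corners->{tr}[$i] * $u;
        my $bottom = $corners->{bl}[$i] * (1 - $u) + $corners->{br}[$i] * $u;
        # Blend the top and bottom edges, then round to a whole pixel.
        push @xy, int($top * (1 - $v) + $bottom * $v + 0.5);
    }
    return @xy;
}

# Invented measurements for a slightly skewed grid of 40 rows by 20 columns.
my %corners = (
    tl => [  5,  10 ], tr => [ 405,  12 ],
    bl => [ 15, 829 ], br => [ 415, 831 ],
);

my ($x, $y) = spot_origin(\%corners, 40, 20, 39, 0);
print "bottom-left spot starts at ($x, $y)\n";
```

      The returned coordinates can be fed straight to whatever does the 16x16 crop, so the image itself never has to be rotated or resampled.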

Re: image clean up and alignment
by halley (Prior) on Apr 16, 2003 at 02:43 UTC

    Is there a module to do it? Not that I know of.

    Looking for tips on how to do it?

    Once you've loaded the image into memory (see PerlMagick or GD or some other image-buffer module), it's time to start doing some math. The PDL may help here, but I haven't used it much yet. This is only general advice; I've done exactly this sort of image processing in a past life, but not in Perl and not in years.

    • Don't use resampling methods which will blur your data. If you have to skew, move each row or column by full pixel distances. If you have to stretch, favor stretching bigger, and only by full pixel distances (i.e., if you had 50 black then 50 white pixels, if you need to stretch out three pixels, it should end up 52 black then 51 white, and never introduce or "invent" new shades of gray).
    • Decide on a tolerance, where pixels dimmer than X are considered "black" for the purposes of image alignment.
    • One approach to vertical alignment would be to find the centroid of the top row of cells, and the left column of cells, then skew the image in each direction to square them up. There are a couple of approaches to finding the centroids, depending on just how far out of whack your samples might be originally.
    • If your samples aren't squared up (the bottom row might be significantly longer or shorter than the top row), then you'll have to correct for this as well. Skew the image for the first two axes, then stretch rows to form a constant height, then stretch columns to form a constant width.
    • Lastly, once the image is rectangular, it should be easy to scan and trim any excess pixels around the border.
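
    The whole-pixel stretch from the first bullet can be sketched as a nearest-neighbour resample, which only ever repeats or drops existing pixels. This is a minimal illustration in core Perl; the function name stretch_row is made up, and a row is assumed to be a simple array of pixel values.

```perl
use strict;
use warnings;

# Stretch (or shrink) one row of pixels to $new_len by repeating or
# dropping whole pixels. Values are copied, never blended, so no new
# shades are introduced.
sub stretch_row {
    my ($row, $new_len) = @_;
    my $old_len = @$row;
    return [ map { $row->[ int($_ * $old_len / $new_len) ] } 0 .. $new_len - 1 ];
}

# The example from the list above: 50 black then 50 white, stretched by 3.
my @row       = ((0) x 50, (255) x 50);
my $stretched = stretch_row(\@row, 103);
my $black     = grep { $_ == 0 }   @$stretched;
my $white     = grep { $_ == 255 } @$stretched;
print "$black black, $white white\n";   # prints "52 black, 51 white"
```

    Applying the same function to each row with a different target length, or to columns instead of rows, covers the "constant height" and "constant width" steps without ever inventing a grey value.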

    --
    [ e d @ h a l l e y . c c ]

Re: image clean up and alignment
by toma (Vicar) on Apr 16, 2003 at 04:18 UTC
    You might try PDL::Image2D or Tk::PhotoRotate. They both claim to rotate images by arbitrary angles. The PDL module can also crop regions of your image.

    I think you'll find that PDL is a good thing to learn for your application. Many of its routines are based on fast C libraries. PDL is intended for the type of work that you are doing.

    It should work perfectly the first time! - toma

Re: image clean up and alignment
by Improv (Pilgrim) on Apr 16, 2003 at 01:44 UTC
    Hey, I don't have any advice on the bigger problem, but for image manipulation, PerlMagick isn't a bad choice. It provides a decent API to crop and rotate, along with lots of other good stuff. I suspect there's no particularly elegant solution to the second part, and that you'll just need to do a lot of custom code. Perhaps the other monks will prove me wrong :) I hope this helps!
      Image::Magick was also my first idea, but after more thought I doubt if it will be sufficient. I recently had a program that needed to read individual pixel color values from an image. From what I understand about Image::Magick, the underlying libraries support such functionality, however the Perl API for those libraries does not.

      I'm pretty sure that we'd have to be able to read pixel values for this problem, so PerlMagick won't work. I'd love for someone to prove me wrong though!

      -caedes

Re: image clean up and alignment
by jonadab (Parson) on Apr 16, 2003 at 11:45 UTC

    Haven't done image manipulation in Perl myself, but I was thinking, does the Perl interface to the Gimp give you access to the magic wand tool? If so, then you probably just need to hit anywhere within the spot.


    for(unpack("C*",'GGGG?GGGG?O__\?WccW?{GCw?Wcc{?Wcc~?Wcc{?~cc' .'W?')){$j=$_-63;++$a;for$p(0..7){$h[$p][$a]=$j%2;$j/=2}}for$ p(0..7){for$a(1..45){$_=($h[$p-1][$a])?'#':' ';print}print$/}
Re: image clean up and alignment
by feloniousMonk (Pilgrim) on Apr 16, 2003 at 15:14 UTC
    Maybe you should check over at bioperl.org.

    But just an FYI - I work in a very Perl-heavy bioinformatics lab and have yet to see a perl solution for microarray image reading. Data processing, sure, but not from the level of the images.

    -felonious
      I haven't seen anything relevant from the bioperl folks (and let's face it, I spend half my time in bioperl code). Keep in mind that while the data is very bioinformatics specific, what I am trying to do with it is not in the slightest - I am not trying to read or analyze the spots, just crop a specified one, more or less precisely.
Re: image clean up and alignment
by Anonymous Monk on Apr 17, 2003 at 05:21 UTC
    Others have pointed out that once you determine the slide's skew, you may be better off having the spot grabber compensate, rather than trying to rotate the entire image.

    One option for accessing the pixels which no-one has mentioned is simply to use substr and unpack. Convert the jpg to ppm, which is just a row-major sequence of pixels, 3 bytes (rgb) per pixel. Just slurp it in as a string. The arithmetic for converting from pixel coordinates to string offset is trivial. Substr 3 bytes and unpack. Two lines of code. It is quite "fast" (in perl, rather than C terms). I wouldn't suggest touching every pixel with it, but it's great for sampling. And to grab the hypothetical 16x16 pixel spot, one can just concatenate 16 substr's, slap "P6 16 16 255 " on the front, and you have the spot ppm image. And with Inline::C, it is not difficult to convert parts of this to C, should the need perhaps someday arise.
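
    A minimal sketch of the substr/unpack approach, using only core Perl. It assumes a binary (P6) PPM with 8-bit samples and no comment lines in the header; the function names parse_ppm, pixel_at and crop_ppm are made up for the illustration, and the tiny synthetic image is there only so the code runs without an input file.

```perl
use strict;
use warnings;

# Split a binary (P6) PPM string into width, height and raw pixel bytes.
# Assumes maxval <= 255 and a header without comment lines.
sub parse_ppm {
    my ($ppm) = @_;
    $ppm =~ /\AP6\s+(\d+)\s+(\d+)\s+(\d+)\s/
        or die "not a simple P6 header\n";
    my ($w, $h, $max) = ($1, $2, $3);
    die "only 8-bit samples handled\n" if $max > 255;
    return ($w, $h, substr($ppm, $+[0]));   # $+[0] is where the header ends
}

# Read one pixel as (r, g, b): 3 bytes per pixel, rows stored one after another.
sub pixel_at {
    my ($pixels, $w, $x, $y) = @_;
    return unpack 'C3', substr($pixels, 3 * ($y * $w + $x), 3);
}

# Cut a $cw x $ch region whose top-left corner is ($x, $y) and return it
# as a complete PPM string: one substr per output row.
sub crop_ppm {
    my ($pixels, $w, $x, $y, $cw, $ch) = @_;
    my $out = "P6 $cw $ch 255\n";
    $out .= substr($pixels, 3 * (($y + $_) * $w + $x), 3 * $cw) for 0 .. $ch - 1;
    return $out;
}

# Synthetic 4x3 image in which pixel (x, y) has colour (x, y, 0).
my $ppm = "P6 4 3 255\n"
        . join '', map { my $y = $_; map { pack 'C3', $_, $y, 0 } 0 .. 3 } 0 .. 2;

my ($w, $h, $pixels) = parse_ppm($ppm);
my @rgb  = pixel_at($pixels, $w, 2, 1);
my $spot = crop_ppm($pixels, $w, 1, 1, 2, 2);
print "pixel (2,1) = @rgb\n";
```

    For a real slide, the string would come from a file opened in raw mode and read in one go, after converting the jpg with some external tool; for the 16x16 spot the call would be crop_ppm($pixels, $w, $x, $y, 16, 16). The sketch does no bounds checking, so the crop rectangle must lie inside the image.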

    Perl is quite good at doing _simple_ things with images. It is only when you get slightly more complex that you can get bogged down in the zoo of partial solutions.