I like the idea of a low pass filter, though that is not sufficient to solve this problem. But with a data set of this kind, a Fourier transform is a particularly poor choice. The reason is that the Fourier transform tries to approximate the data as a set of sine waves. Therefore if you have a jump anywhere in the data, then applying a Fourier transform, removing high order terms, and transforming back will leave you with severe visible "ringing" over the whole data set, which will obscure what he's looking for.

That's why I naturally jumped to wavelets, which localize the higher order terms so that removing noise in one part of the picture doesn't create it elsewhere. They do that by combining low-pass and high-pass filters to separate data into localized high frequency pieces, and spread out low frequency pieces.

The easiest kind of wavelet to use is the Haar wavelet, which is likely also most appropriate for this case. The Haar wavelet is just a bunch of square waves. If the number of points is a power of 2, then you'd calculate the Haar wavelet as follows. Take each pair of points. Calculate their average and their difference from the average. The differences are the highest frequency wavelets, so save them. You have half as many averages as you have points. Repeat the calculation on the averages to get the next smaller wavelets. Then the next smaller wavelets, etc. Finally you'll be left with an overall average.

This procedure works, and will work well with data sets with large piecewise horizontal chunks. However as described it will not find boundaries well that are not conveniently on a power of 2. Nor will it work well if the width of the images is not a power of 2.

Wavelet packet algorithms generalize this procedure by dynamically altering the details of wavelet shape to come up with a solution that optimizes some global goal. For instance you could vary the boundary of the square waves to concentrate as much energy as possible into the smallest number of waves possible. If you do that then the boundaries of the largest few wavelets will naturally fall on the intervals that they're trying to find, and the intervals will be recognizable by the interior pieces being very small.

I remember that much of what they do, but I don't remember how to dynamically calculate wavelet packets. :-(


In reply to Re^2: How to Determine Stable Y-Values... or Detect an Edge..? by tilly
in thread How to Determine Stable Y-Values... or Detect an Edge..? by ozboomer

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post, it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.