in reply to Re^3: Any idea for predicting the peak points in the graph by perl
in thread Any idea for predicting the peak points in the graph by perl

I was searching for a pre-made perl solution for performing peak-detection when I stumbled upon this post.

To anyone following in my footsteps: be aware that the OP misrepresents what the first and second derivatives mean.

Specifically, the second derivative is neither required nor necessarily useful for peak detection.

The second derivative represents the change in the rate of change of a series (in this context). Peak detection has little to nothing to do with that change in the rate of change (aka acceleration). Instead we are more interested in the first derivative, which is the rate of change of the original series (aka velocity).

When performing peak detection, your goal is to capture the points at which a series reaches a local maximum or minimum. If the velocity falls to zero and then reverses (changes sign), that likely marks a peak of some type. Further analysis using the series mean and standard deviation can help determine whether it is a meaningful peak for you (it all depends on your needs).

So the OP's suggestion to watch for sign changes in the second derivative is not correct. Additionally, while sign changes in the first derivative may prove useful, you will need additional analysis to determine whether they mark exactly the data you are looking for.
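For anyone who wants to try the idea, here is a minimal sketch rather than a finished detector. It assumes evenly spaced samples, and the helper names `turning_points` and `significant`, the threshold `$k`, and the sample series are all made up for illustration. It flags points where the slope reverses sign, then keeps only those lying more than `$k` standard deviations from the series mean.

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Hypothetical helper: indices where the slope before and after a point
# have opposite signs, i.e. the first difference reverses.
sub turning_points {
    my @y = @_;
    my @tp;
    for my $i ( 1 .. $#y - 1 ) {
        my $before = $y[$i]   - $y[$i-1];   # slope arriving at point $i
        my $after  = $y[$i+1] - $y[$i];     # slope leaving point $i
        push @tp, $i if $before * $after < 0;
    }
    return @tp;
}

# Hypothetical filter: keep turning points whose value lies more than
# $k (population) standard deviations away from the series mean.
sub significant {
    my( $k, $y, @tp ) = @_;
    my $mean = sum( @$y ) / @$y;
    my $sd   = sqrt( sum( map { ( $_ - $mean ) ** 2 } @$y ) / @$y );
    return grep { abs( $y->[$_] - $mean ) > $k * $sd } @tp;
}

# Made-up series: small wiggles around 5, one spike up and one dip down.
my @series = ( 5, 5.2, 5.1, 5.3, 9, 5.2, 5.1, 1, 5.0, 5.2, 5.1 );

my @tp   = turning_points( @series );
my @keep = significant( 1.5, \@series, @tp );

print "all turning points: @tp\n";
print "significant:        @keep\n";
```

On this series every small wiggle counts as a reversal, but only the spike at index 4 and the dip at index 7 survive the filter. The value 1.5 for `$k` is arbitrary and would need tuning for real data.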

Re^5: Any idea for predicting the peak points in the graph by perl
by BrowserUk (Patriarch) on Jul 15, 2012 at 06:41 UTC

    And for those that follow you: neither the first nor the second derivative is a particularly good indicator of peaks and troughs. At least not in this data.

    In this graph, red is the original data; green the first derivative and blue the second.

    Note that the green line's inflection points don't track the red line's in any way.

    The second derivative has inflections corresponding to those in the data. But it also "invents" new ones where they do not exist.

    And can anyone automate the determination of which of the 5 or 6 reversals in the last segment of the blue line corresponds to the highest peak in the original data?


    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.

    The start of some sanity?

      Not exactly sure where to start here:
      1. The graph you provided is upside down. Please provide your spreadsheet that you used to generate it, I am very confused how you could even make that happen. You'd have to have multiplied all y values with a -1 to cause such a thing.

        I have provided a correctly oriented graph, along with the first derivative here: Correct Graph
        Blue is the source data
        Orange is the first derivative

      2. The 3 curves presented here are not to scale and lack axes. As such they misrepresent the meaningfulness of the first derivative in peak detection.

      3. Your conclusion that upside down derivatives are not useful in peak detection is in fact correct. I assure you that a correctly oriented first derivative is useful for this type of problem.

      4. Even though your chart is upside down, your first and second derivatives seem to be calculated correctly (for the upside down data). If you redraw it to scale with axes you will notice that any time the first derivative crosses the x axis (aka it is zero) there will be a peak. This is a very useful property of the first derivative for such peak detection.

      With a very small perl program (less than 200 lines) I have a peak detector that works pretty well for this type of pseudo-sinusoidal data. It doesn't take much to do this kind of detection once you have the first derivative calculated and saved.

      That said, as with all solutions, it must fit the problem, and my data is probably quite different from other people's data (although it's pretty close to the OP's).
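The zero-crossing idea can be sketched as follows. This is only an illustration under stated assumptions, not the 200-line detector mentioned above: the helper names `central_diff` and `zero_crossings` and the sample data are hypothetical, the derivative is estimated with central differences, and each crossing is located by linear interpolation between the two samples that straddle zero.

```perl
use strict;
use warnings;

# Hypothetical helper: central-difference estimate of dy/dx at the
# interior points. Index 0 is left undefined (no left neighbour), and
# the last point gets no estimate either.
sub central_diff {
    my( $x, $y ) = @_;
    my @d = ( undef );
    for my $i ( 1 .. $#$y - 1 ) {
        $d[$i] = ( $y->[$i+1] - $y->[$i-1] ) / ( $x->[$i+1] - $x->[$i-1] );
    }
    return @d;
}

# Hypothetical helper: x positions where the derivative estimate changes
# sign, linearly interpolated between adjacent samples.
# Limitation: an estimate that is exactly 0 at a sample is not reported.
sub zero_crossings {
    my( $x, @d ) = @_;
    my @hits;
    for my $i ( 2 .. $#d ) {
        my( $d0, $d1 ) = @d[ $i-1, $i ];
        next unless $d0 * $d1 < 0;              # opposite signs straddle zero
        my $t = $d0 / ( $d0 - $d1 );            # fraction of the way to $i
        push @hits, $x->[$i-1] + $t * ( $x->[$i] - $x->[$i-1] );
    }
    return @hits;
}

# Made-up data: a maximum at x = 3 and a minimum at x = 6.
my @x = ( 0 .. 8 );
my @y = ( 0, 3, 5, 6, 4, 1, 0, 2, 5 );

my @d    = central_diff( \@x, \@y );
my @hits = zero_crossings( \@x, @d );

print "derivative crosses zero near x = @hits\n";
```

With this data the crossings come out at roughly x = 2.75 and x = 5.8, close to but not exactly on the sampled extrema at x = 3 and x = 6. How far off the estimate lands depends on the sample spacing and the shape of the data, so it is worth checking against the raw series.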

        1. The graph you provided is upside down

          That's irrelevant. It does not affect the veracity of the plot.

          But as it seems to bother you; I've posted the same graph flipped vertically (with a couple of additions) below.

        2. Correct Graph

          Your link leads to a page that reads: This image or video is currently unavailable.

        3. Please provide your spreadsheet that you used to generate it,

          I didn't use a spreadsheet. This is a Perl site. I used Perl :)

        4. I am very confused how you could even make that happen. You'd have to have multiplied all y values with a -1 to cause such a thing.

          Or; plot the data on a medium that has the origin top left with Y running top to bottom. As is the convention with most computer graphics.

        5. The 3 curves presented here are not to scale and lack axes. As such they misrepresent the meaningfulness of the first derivative in peak detection.

          Since all the Y values on the three curves are related numerically, their absolute values are irrelevant. Hence scales are superfluous.

          The X values are the same, drawn to the same scale and offset for all three plots.

        6. Your conclusion that upside down derivatives are not useful in peak detection is in fact correct. I assure you that a correctly oriented first derivative is useful for this type of problem.

          Sorry, but that makes no sense at all, since the important points are where the curves transition across the Y=0 line. Whether that transition occurs going from above to below or below to above doesn't change anything one iota.

        7. If you redraw it to scale with axes you will notice that any time the first derivative crosses the x axis (aka it is zero) there will be a peak.

          If that were the case, I would not have posted. Take another, closer look at the graph.

          The additional black horizontal line is the x-axis (Y=0) of both the 1st (green) and 2nd (blue) derivative plots. The additional black vertical lines mark the peaks and troughs in the data plot (red).

          Note how the 1st derivative plot (green) doesn't transition 0 at all for the first two turning points, and is substantially inaccurate for turning points four and six; slightly inaccurate for the fifth; leaving just 2: the third and seventh that it hits accurately.

          You may now be wondering how the black vertical lines were drawn.

          If you subtract consecutive data points and compare the results to 0:

          my @deltas = map{ ( 0 <=> $y[$_-1] - $y[$_] ) } 1 .. $#y;

          You get a dataset like this:

          -1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 1 1 1 1 1 1 1 1 -1 -1 -1 -1 1 1 1 1 1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 1 1 1 1 1 1 1

          You then plot a vertical line at the X value everywhere the sign changes:

          $deltas[$_-1] != $deltas[$_] and $im->line( $x[$_], 0, $x[$_], 800, 0 ) for 1 .. $#x-1;

          And there you have your maxima and minima simply, directly, and accurately.
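The same delta-sign test can be tried without GD by printing the indices instead of drawing lines. What follows is a sketch only: the helper name `turning_indices` and the sample data are hypothetical, and the step that carries the previous sign across flat runs is an added assumption (not part of the snippet above) so that a plateau is reported once rather than twice. It also runs the test on the negated data, to show that flipping the plot vertically does not move the turning points.

```perl
use strict;
use warnings;

# Hypothetical helper: indices into @y where the series turns.
sub turning_indices {
    my @y = @_;

    # Same expression as above: +1 where the series rises, -1 where it
    # falls, 0 where two consecutive values are equal.
    my @deltas = map { ( 0 <=> $y[$_-1] - $y[$_] ) } 1 .. $#y;

    # Assumption, not in the original snippet: treat a flat step as a
    # continuation of the previous direction. A leading flat run is left
    # as 0 and will show up as a "turn" where it ends.
    for my $i ( 1 .. $#deltas ) {
        $deltas[$i] = $deltas[$i-1] if $deltas[$i] == 0;
    }

    # A change of sign between step $_-1 and step $_ means y[$_] is a
    # turning point.
    return grep { $deltas[$_-1] != $deltas[$_] } 1 .. $#deltas;
}

# Made-up data with a two-sample plateau at the first peak.
my @y = ( 1, 3, 4, 4, 2, 1, 2, 5, 3 );

my @up   = turning_indices( @y );
my @down = turning_indices( map { -$_ } @y );   # same data, flipped vertically

print "turning points:          @up\n";
print "turning points, flipped: @down\n";
```

Both lines print the same indices (3, 5 and 7 here), with the plateau reported at its last sample.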

        With a very small perl program (less than 200 lines) I have a peak detector that works pretty well for this type of pseudo-sinusoidal data.

        So verbose!? :)

        Here is my Perl code, a whole 40 lines, that plots the above linked graph:

        #! perl -slw
        use strict;
        use Data::Dump qw[ pp ];
        use GD;

        use constant {
            WHITE => unpack( 'N', pack 'CCCC', 0, 255, 255, 255 ),
            RED   => unpack( 'N', pack 'CCCC', 0, 255, 0, 0 ),
            GREEN => unpack( 'N', pack 'CCCC', 0, 0, 255, 0 ),
            BLUE  => unpack( 'N', pack 'CCCC', 0, 0, 0, 255 ),
        };

        my( @x, @y, @yd1, @yd2 );
        ( $x[@x], $y[@y], $yd1[@yd1], $yd2[@yd2] ) = map{ $_ //= 0 } split while <DATA>;
        chomp @yd2;

        $_ = ( $_ -4 ) * 1000 for @x;
        $_ /= 6 for @y;
        $_ = $_ / 320 + 400 for @yd1;
        $_ = $_ / 8000 + 400 for @yd2;

        my $im = GD::Image->new( 1000, 800, 1 );
        $im->filledRectangle( 0, 0, 1000, 800, WHITE );
        $im->line( 0, 400, 1000, 400, 0 );

        $im->line( $x[$_-1], $y[$_-1], $x[$_], $y[$_], RED ) for 1 .. $#x;
        $im->line( $x[$_-1], $yd1[$_-1], $x[$_], $yd1[$_], GREEN ) for 2 .. $#x-1;
        $im->line( $x[$_-1], $yd2[$_-1], $x[$_], $yd2[$_], BLUE ) for 3 .. $#x-2;

        my @deltas = map{ ( 0 <=> $y[$_-1] - $y[$_] ) } 1 .. $#y;
        $deltas[$_-1] != $deltas[$_] and $im->line( $x[$_], 0, $x[$_], 800, 0 ) for 1 .. $#x-1;

        $im->flipVertical;

        open PNG, '>:raw', "$0.png" or die $!;
        print PNG $im->png;
        close PNG;

        system 1, "$0.png";

        And the dataset taken directly from salva's post above:

