Hi,

Be warned in advance, I have no education in statistics and might be using the wrong terminology... :)

I store system performance metrics as an (epoch) time series in a SQLite database, and any given series may be missing data points at some times. It could be simplified like this:

Time   Val1  Val2  Val3
08:00    11    70     0
08:20    22          10
08:40    56    80    25

I'm using GD::Graph to plot each individual data-set, and in some cases along with the summed values in a line graph. The problem is that just feeding the Val2 series as-is to GD will result in a shorter line graph over time than the Val1 series, with the actual values not corresponding to the correct time on the x-axis.

The way I have it now is to do a complex SELECT to get values for just the common time points and feed that to GD, but it is quite inaccurate and it is slow due to the number of data-points involved. So some sort of mathematical transformation could probably work better (pity I know very little about them).

I'm guessing I need to normalize each data-set across the overall time-range, and do a linear interpolation for each missing value, but the normalization is a bit beyond me.
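To make the idea concrete, here is a rough sketch of the kind of linear interpolation I have in mind, not a tested solution. It assumes the time points are already merged into one sorted list, and the name interpolate_series and the hash-per-series layout are just made up for illustration. Points before the first or after the last known value are left undef rather than extrapolated.

```perl
use strict;
use warnings;

# Fill gaps in one series by linear interpolation over a shared time grid.
# $times  : array ref of sorted times (the union of all sample times)
# $series : hash ref mapping time => value, possibly missing some times
# Returns an array ref aligned with $times; points outside the known
# range of the series stay undef.
sub interpolate_series {
    my ($times, $series) = @_;
    my @known = grep { defined $series->{$_} } @$times;
    my @out;
    my $i = 0;    # index of the last known time before the current one
    for my $t (@$times) {
        if (defined $series->{$t}) {
            push @out, $series->{$t};
            next;
        }
        # Move forward until $known[$i] and $known[$i + 1] bracket $t.
        $i++ while $i < $#known && $known[$i + 1] < $t;
        if (!@known || $t < $known[0] || $t > $known[-1]) {
            push @out, undef;    # nothing to interpolate between
            next;
        }
        my ($t0, $t1) = @known[$i, $i + 1];
        my ($v0, $v1) = @{$series}{$t0, $t1};
        push @out, $v0 + ($v1 - $v0) * ($t - $t0) / ($t1 - $t0);
    }
    return \@out;
}

# Val2 from the simplified table, with times as seconds since midnight
# (08:00, 08:20, 08:40) and no value recorded at 08:20.
my @times  = (28800, 30000, 31200);
my %val2   = (28800 => 70, 31200 => 80);
my $filled = interpolate_series(\@times, \%val2);
print join(', ', map { defined $_ ? $_ : 'undef' } @$filled), "\n";
# prints: 70, 75, 80
```

Each series would then have one value (or undef) per time point, so all of them line up on the same x-axis.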

It seems Data::TimeSeries could do something like this, but my data times could be as granular as a few minutes, and this module only seems to support HOURS as a period.

I've read a bit about RRDTool, and it sounds like it might be a great alternative to using a database altogether, especially for reducing disk space usage, but rewriting my code seems a bit more involved than I would like right now.

Does anyone perhaps know of another fairly efficient way to normalise data like this?

Regards,
Niel


In reply to Time series normalization by 0xbeef