Ahoy, ye Monks.

Several times recently I've needed to take a text file full of data, usually a server log file, and analyse it in some way. If I know what I'm looking for, it's pretty easy to write a script that tells me, say, how many times page X was loaded during the month of July. What's less obvious is what to do with the data when I don't know what I'm looking for yet. I'm looking for patterns, but I don't know what they are.

Right now, I've got two theoretically identical DHCP servers, except that one of them is getting half the traffic of the other, which doesn't make sense. I want to analyse my logs and see if I can figure out a pattern. Maybe the one with half the traffic is getting no requests from computers in a particular subnet? Maybe it's only getting a certain type of request? What time of day has the most requests?

Basically, I'm trying to figure out what form to put my data into in order to ask any question I want.

I'm thinking the best way to handle this is to load all the data into a SQL DB and then run SQL queries against it to ask the questions I come up with.
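
To make the idea concrete, here's a rough sketch of the load-then-query approach. It's written in Python with the standard sqlite3 module purely for illustration; the same shape should carry over to Perl with DBI and DBD::SQLite. The log format, the table layout, the /24 subnet guess, and the names parse_line, load and count_by are all made up for this example and would need adjusting to real DHCP logs.

```python
import re
import sqlite3

# Hypothetical, simplified log format (NOT a real dhcpd format):
#   <date> <time> <server> <request type> <client IP>
LINE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2}) (\d{2}):\d{2}:\d{2} (\S+) (DHCP[A-Z]+) "
    r"(\d+\.\d+\.\d+)\.\d+$"
)

# Invented sample data, only here to exercise the code.
SAMPLE = [
    "2008-07-14 09:15:02 dhcp1 DHCPDISCOVER 10.1.2.17",
    "2008-07-14 09:15:40 dhcp1 DHCPREQUEST 10.1.2.17",
    "2008-07-14 09:16:05 dhcp2 DHCPDISCOVER 10.1.3.44",
    "2008-07-14 14:02:11 dhcp1 DHCPREQUEST 10.1.3.44",
    "a line that does not match the format",
]


def parse_line(line):
    """Return (day, hour, server, req_type, subnet), or None if no match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    day, hour, server, req_type, net = m.groups()
    # Assumption: every subnet is a /24, so the first three octets name it.
    return (day, int(hour), server, req_type, net + ".0/24")


def load(conn, lines):
    """Create the table if needed, insert parsed lines, return rows inserted."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS requests "
        "(day TEXT, hour INTEGER, server TEXT, req_type TEXT, subnet TEXT)"
    )
    rows = [r for r in map(parse_line, lines) if r is not None]
    conn.executemany("INSERT INTO requests VALUES (?, ?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)


def count_by(conn, column):
    """Count requests per server, broken down by one other column."""
    # Column names cannot be bound as SQL parameters, so whitelist them.
    if column not in {"day", "hour", "req_type", "subnet"}:
        raise ValueError("unknown column: " + column)
    sql = (
        f"SELECT server, {column}, COUNT(*) FROM requests "
        f"GROUP BY server, {column} ORDER BY server, {column}"
    )
    return conn.execute(sql).fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    load(conn, SAMPLE)
    for column in ("subnet", "req_type", "hour"):
        print(column, count_by(conn, column))
```

The point of this layout is that each new question (by subnet, by request type, by hour) becomes one more GROUP BY against the same table, rather than another pass over the log file with a new script.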

So, the question I'm really trying to get to is: what's a good strategy when you know you want to analyze some data, but you don't know specifically what you're going to look for? If I'm right that the first step should involve stuffing the data into a SQL database, are there generic modules to help me do this? Or am I totally missing the boat, and are there better ways to handle this? Or maybe I'm trying to be too sophisticated, and the most efficient thing to do is change the code to ask and answer a different question each time?

I hope that made some sense....

--Pileofrogs


In reply to Arbitrary Analysis? by pileofrogs
