Hi everybody.... my first post...

Well, having no knowledge of Perl, I gave it a shot, but I'm obviously missing something, so it's time to ask the community for help.

I'm trying to get AWStats to work better with SHOUTcast W3C logs, so it can squeeze all the juice out of them. Somebody in the SHOUTcast community created a Perl script that parses SHOUTcast W3C log files so that AWStats can read them, but it still misses a key piece of information: the player used.

So, I have a W3C file, correctly parsed by the Perl script, but I need to parse it further:
The 7th word/string of each line in the log file is the player.
Problem: the string is mostly an ugly, long, barely usable word.
Target: replace it with a usable one for some popular players and, for all other cases, replace the string with simply OTHER or UNKNOWN.
The approach: this 7th word always starts with the key letters (for instance, for VLC player we read vlc%2F1%2E1%2E5, while an iTunes entry yields iTunes%2F10%2E3%2E1%20%28Macintosh%3B%20Intel%20Mac%20OS%20X%2010%2E6%2E7%29%20AppleWebKit%2F533%2E21%2E1...and so on). Since the number of available players is overwhelming, the idea is to create some logic that compares the start of the string against a limited number of patterns, sequentially, replacing as necessary, and if none of them matches, replaces whatever the string is with a fallback (unknown, other, or similar).
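The approach described above can be sketched as an ordered list of pattern/label pairs that is checked from top to bottom, with a fallback when nothing matches. This is only a minimal sketch: the helper name classify_player, the list name @PLAYERS, and the iTunes entry are assumptions made for illustration, not part of sc_parse.pl.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Ordered list of [pattern, label] pairs; the first pattern that matches wins.
# The patterns and labels here are illustrative assumptions, not a complete list.
my @PLAYERS = (
    [ qr/MPEG%20OVERRIDE/ => 'FlashPlayer' ],
    [ qr/^vlc/i           => 'VLC'         ],
    [ qr/^iTunes/i        => 'iTunes'      ],
);

# Hypothetical helper: map a raw player string to a short label.
sub classify_player {
    my ($raw) = @_;
    for my $pair (@PLAYERS) {
        my ($pattern, $label) = @$pair;
        return $label if $raw =~ $pattern;
    }
    return 'OTHER';    # fallback when no known player matches
}

print classify_player('vlc%2F1%2E1%2E5'), "\n";                     # VLC
print classify_player('iTunes%2F10%2E3%2E1%20%28Macintosh'), "\n";  # iTunes
print classify_player('SomePlayer%2F0%2E1'), "\n";                  # OTHER
```

Supporting another player would then mean adding one more pair to the list rather than another elsif branch.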

Here is a sample of how a raw log line reads:
95.61.50.98 95.61.50.98 2011-07-14 16:04:17 /stream?title=Unknown 200 vlc%2F1%2E1%2E5 19261930 1193 129160
Here is how it reads after the correct, AWStats-usable parsing:
95.61.50.98 95.61.50.98 2011-07-14 16:04:17 /stream?title=Unknown 200 vlc%2F1%2E1%2E5 19261930 1193 129160 GET
And here is how it should read (for instance):
/stream?title=Unknown 200 VLC 19261930 1193 129160 GET

And finally the sc_parse.pl script (all credits to its author!!!!) that does the trick, with some of my non-working mods commented out:

#!/usr/bin/perl -w
# -*- cperl -
#
# Parse ShoutCast v1.9.8 W3C log and append "GET" to each log line to pretend it's a web logfile
#
# Usage: perl sc_parse.pl -c /full/path/to/shoutcast/sc_w3c.log
#
# Written by Ryan Gehrig
#
use Getopt::Std;
our $opt_c;
getopts('c:');

# Open the log file
if (-e $opt_c) {
    print "Parsing log '$opt_c' ...\n";
    open FILE, "$opt_c" or die "ERROR: Failed to open log file\n";
    my @lines = <FILE>;
    foreach(@lines) {
        # Ignore comments
        if("$_" !~ /^\#/) {
            # Lose newlines
            $_ =~ s/\n//;

            #### Here I start editing/adding
            # First approach:
            # This works, but I cant add such a line for
            # every known and future player! there should be a way
            # to get the 7th word replaced anyway:
            # Look for flashplayer
            #$_ =~ s/MPEG\%20OVERRIDE/FlashPlayer/;

            # More realistic approach:
            # Obviously it fails for an unexperienced programmer as me:
            # 7th string are players, focus on it
            #my @words = split(' ', $_);
            #if(@words[6] =~ m/MPEG\%20OVERRIDE/){
            #    @words[6] = "FlashPlayer";
            #}
            #elsif(@words[6] =~ m/^vlc/){
            #    @words[6] = "VLC";
            #}
            #I could add more occurences here
            #else {
            #    @words[6] = "other";
            #}
            #### Here ends my editing/adding

            # Write this line to new log file "sc_w3c.log_x"
            open (MYFILE, '>>'.$opt_c.'_x');
            print MYFILE "$_" . ' GET' . "\n";
            close (MYFILE);
        }
    }
    close(FILE);
} else {
    print "ERROR: The specified log file doesnt exist. Exiting.\n";
}
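One likely reason the commented-out attempt has no visible effect: the script prints $_, but the edited @words array is never joined back into $_. In addition, @words[6] is a one-element slice; it runs, but under -w Perl suggests writing $words[6] instead. The sketch below shows the split, replace, join round trip on a single line. The helper name rewrite_line is hypothetical, and joining with a single space is an assumption that the log fields are separated by plain whitespace.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: replace the 7th field of one log line and append GET.
sub rewrite_line {
    my ($line) = @_;
    chomp $line;
    my @words = split ' ', $line;    # split on runs of whitespace
    if (@words >= 7) {               # only touch lines that have a 7th field
        if    ($words[6] =~ /MPEG%20OVERRIDE/) { $words[6] = 'FlashPlayer' }
        elsif ($words[6] =~ /^vlc/)            { $words[6] = 'VLC' }
        else                                   { $words[6] = 'other' }
        $line = join ' ', @words;    # write the change back into the line
    }
    return "$line GET";
}

my $sample = '95.61.50.98 95.61.50.98 2011-07-14 16:04:17 '
           . '/stream?title=Unknown 200 vlc%2F1%2E1%2E5 19261930 1193 129160';
print rewrite_line($sample), "\n";
```

In the script itself, the same idea would amount to assigning join(' ', @words) back to $_ before the print to MYFILE.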

If I get it working, and if I manage to get AWStats to use the resulting log as intended, this would be incredible!!!


In reply to a perl, awstats and SHOUTCast history by alexolivan
