alexolivan has asked for the wisdom of the Perl Monks concerning the following question:
Hi everybody... my first post.
Well, having no idea of Perl, I gave it a shot, but I'm obviously missing something, so it's time to ask the community for help.
I'm trying to get AWSTATS to work better with SHOUTcast W3C logs, so it can squeeze all the juice out of them. Somebody in the SHOUTcast community created a Perl script that parses SHOUTcast W3C log files so that AWSTATS can read them, but the result still misses a key piece of information: the player used.
So, I have a W3C file, correctly parsed by Perl, but I need to parse it further:
The 7th word/string of each line in the log file is the player.
Problem: the string is mostly an ugly, long, barely usable word.
Target: replace it with a usable one by looking for some popular players, and for all other cases replace the string with simply OTHER or UNKNOWN.
The approach: this 7th word always starts with the key letters (for instance, for the VLC player we read vlc%2F1%2E1%2E5, while an iTunes entry yields iTunes%2F10%2E3%2E1%20%28Macintosh%3B%20Intel%20Mac%20OS%20X%2010%2E6%2E7%29%20AppleWebKit%2F533%2E21%2E1...and so on). Since the number of available players is overwhelming, the idea is to create some logic that compares the start of the string against a limited number of patterns, sequentially, replacing as necessary, and if in the end none matches, replaces the whole string with a fallback (unknown, other, or similar).
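The matching logic described above could be sketched roughly as follows. This is only a minimal sketch: the `@PLAYERS` table and the `classify_player` helper are hypothetical names, and the three patterns are just the players mentioned in this post, not a complete list. The strings are matched while still URL-encoded, exactly as they appear in the log.

```perl
use strict;
use warnings;

# Ordered list of [pattern, label] pairs; the first match wins.
# Illustrative assumption: only the players named in this post are listed.
my @PLAYERS = (
    [ qr/MPEG%20OVERRIDE/, 'FlashPlayer' ],
    [ qr/^vlc/i,           'VLC'         ],
    [ qr/^iTunes/i,        'iTunes'      ],
);

# Return a short player name for a raw (still URL-encoded) player string.
sub classify_player {
    my ($agent) = @_;
    for my $entry (@PLAYERS) {
        my ($pattern, $label) = @$entry;
        return $label if $agent =~ $pattern;
    }
    return 'OTHER';    # fallback when no pattern matched
}

print classify_player('vlc%2F1%2E1%2E5'), "\n";    # prints "VLC"
```

Supporting another player then means adding one more pair to the table, rather than another `elsif` branch.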
Here is a sample of how a raw log line reads:
95.61.50.98 95.61.50.98 2011-07-14 16:04:17 /stream?title=Unknown 200 vlc%2F1%2E1%2E5 19261930 1193 129160
Here is how it reads after correct AWSTATS-usable parsing:
95.61.50.98 95.61.50.98 2011-07-14 16:04:17 /stream?title=Unknown 200 vlc%2F1%2E1%2E5 19261930 1193 129160 GET
And here is how it should read (for instance):
/stream?title=Unknown 200 VLC 19261930 1193 129160 GET
And finally, the sc_parse.pl script (all credits to its author!!!!) that does the trick, with some of my non-working mods commented out:
    #!/usr/bin/perl -w
    # -*- cperl -*-
    #
    # Parse ShoutCast v1.9.8 W3C log and append "GET" to each log line to pretend it's a web logfile
    #
    # Usage: perl sc_parse.pl -c /full/path/to/shoutcast/sc_w3c.log
    #
    # Written by Ryan Gehrig
    #
    use Getopt::Std;

    our $opt_c;
    getopts('c:');

    # Open the log file
    if (-e $opt_c) {
        print "Parsing log '$opt_c' ...\n";
        open FILE, "$opt_c" or die "ERROR: Failed to open log file\n";
        my @lines = <FILE>;
        foreach(@lines) {
            # Ignore comments
            if("$_" !~ /^\#/) {
                # Lose newlines
                $_ =~ s/\n//;

                #### Here I start editing/adding

                # First approach:
                # This works, but I cant add such a line for
                # every known and future player! there should be a way
                # to get the 7th word replaced anyway:

                # Look for flashplayer
                #$_ =~ s/MPEG\%20OVERRIDE/FlashPlayer/;

                # More realistic approach:
                # Obviously it fails for an unexperienced programmer as me:

                # 7th string are players, focus on it
                #my @words = split(' ', $_);
                #if(@words[6] =~ m/MPEG\%20OVERRIDE/){
                #    @words[6] = "FlashPlayer";
                #}
                #elsif(@words[6] =~ m/^vlc/){
                #    @words[6] = "VLC";
                #}
                #I could add more occurences here
                #else {
                #    @words[6] = "other";
                #}

                #### Here ends my editing/adding

                # Write this line to new log file "sc_w3c.log_x"
                open (MYFILE, '>>'.$opt_c.'_x');
                print MYFILE "$_" . ' GET' . "\n";
                close (MYFILE);
            }
        }
        close(FILE);
    }
    else {
        print "ERROR: The specified log file doesnt exist. Exiting.\n";
    }
If I get it working... and if I manage to get AWSTATS to use the resulting log as intended, this would be incredible!!!
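For what it's worth, the commented-out attempt in the script changes `@words` but then still prints the untouched `$_`, so the replacement never reaches the output file. A minimal sketch of one possible fix is to rebuild the line with `join` after editing the 7th field. The `rewrite_line` helper below is a hypothetical name, not part of sc_parse.pl, and it assumes the fields are separated by whitespace as in the sample lines above.

```perl
use strict;
use warnings;

# Replace the 7th field (index 6) of one log line and return the new line.
# rewrite_line() is a hypothetical helper, not part of the original script.
sub rewrite_line {
    my ($line) = @_;
    my @words = split ' ', $line;
    return $line if @words < 7;    # leave short or malformed lines alone

    # A single element is written $words[6]; @words[6] is a one-element slice.
    if    ($words[6] =~ /MPEG%20OVERRIDE/) { $words[6] = 'FlashPlayer' }
    elsif ($words[6] =~ /^vlc/i)           { $words[6] = 'VLC' }
    else                                   { $words[6] = 'OTHER' }

    # The key step: rebuild the line from the modified fields.
    return join ' ', @words;
}

my $line = '95.61.50.98 95.61.50.98 2011-07-14 16:04:17 /stream?title=Unknown 200 vlc%2F1%2E1%2E5 19261930 1193 129160';
print rewrite_line($line), " GET\n";
```

In the script itself, that would mean printing the rewritten line to `MYFILE` in place of `"$_"`. Note that `split ' '` followed by `join ' '` collapses any run of whitespace into a single space, which should be harmless for lines shaped like the sample.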
Re: a perl, awstats and SHOUTCast history
by Anonymous Monk on Mar 27, 2012 at 11:41 UTC
by Anonymous Monk on Mar 27, 2012 at 11:59 UTC

Re: a perl, awstats and SHOUTCast history
by remiah (Hermit) on Mar 28, 2012 at 02:11 UTC
by alexolivan (Initiate) on Mar 28, 2012 at 07:39 UTC
by alexolivan (Initiate) on Mar 28, 2012 at 11:40 UTC
by Anonymous Monk on Apr 09, 2012 at 02:08 UTC