grabbing info from log after key word

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

How can I grab info from a log file, starting at the keyword 'Mozilla' and including everything after it? Here is a sample line from the log data:

ip here - - 02/Jan/2001:00:09:30 +0000 "POST /path HTTP/1.1" 200 132 Mozilla/4.0 (compatible; MSIE 5.01; Windows 98)

I'm trying to grab 'Mozilla' and everything after it (but nothing before), count the characters in the captured data, and report how many are over 100 characters.

I've tried variations of pattern matching and grep but continually fail. (I know... I really suck, but I'm trying really hard to learn this stuff.) Here's what I have drooled out so far.

#!/usr/bin/perl
if ($_[0])
{
   $filename=$_[0];
}
else
{
   print "USAGE: getUserAgent <logfile name>\n";
   exit;
}

@lines=`cat /path/$filename`;
foreach $line (@lines)
{
   @elements=split(' ',$line);
   if ($elements[1] eq "Mozilla")
   {
    $capture = grep(/\bMozilla/i\w+\W+\d+\s+\S+,userAgent);
   }
    printf "\n", @userAgent;
}

Needless to say this errors out badly:

Backslash found where operator expected at getUserAgent.pl line 18, near "/\bMozilla/i\"
        (Missing operator before \?)
syntax error at getUserAgent.pl line 18, near "/\bMozilla/i\"
Substitution replacement not terminated at getUserAgent.pl line 18.

This is my first post. Can one of you awesome Monks please help?

edmay98


Re: grabbing info from log after key word by I0 (Priest) on Jan 03, 2001 at 06:43 UTC
#!/usr/bin/perl
if( $ARGV[0] ){
   $filename=$ARGV[0];
}else{
   print "USAGE: getUserAgent <logfile name>\n";
   exit;
}
open FILE,"</path/$filename" or die "Can't open /path/$filename because $!";
$over=0;
while( <FILE> ){
   if( m/\b(Mozilla\b.*)/ ){
      print "$1\n";
      $over++ if( (length $1) > 100 );
   }
}
print "$over over 100\n";
Re: Re: grabbing info from log after key word by edmay98 (Novice) on Jan 03, 2001 at 21:58 UTC
Thanks I0, you oh so honorable monk. I tried this and it worked great. May the new year bring you happiness, joy and plentiful amounts of excellent grog.
Re: grabbing info from log after key word by chromatic (Archbishop) on Jan 03, 2001 at 08:01 UTC
The slash before the i in /\bMozilla/i is interpreted as the terminating slash of the regex, so Perl reads the i as a modifier and chokes on everything that follows it. I might grep through the lines, looking only for 'Mozilla', splitting the results on Mozilla, and taking the length of the second element after the split. If you have many lines to process, this will take lots of memory, though. You could do something more like this:

open(INPUT, "/path/$filename") or die "Can't open: $!";
while (<INPUT>) {
    if (/Mozilla/) {
        my $rest = (split(/Mozilla/, $_, 2))[1];
        if (length($rest) > 100) {
            # tag this line somehow
        }
    }
}

That's untested and rather generic, but it's fairly close.
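For comparison, here is a rough sketch of the grep-then-split idea described in this reply, assuming the whole log fits in memory. The sub name `count_long_agents`, the `$limit` parameter, and every sample line except the first are made up for illustration. Unlike the loop in the reply, this version chomps the newline and puts the keyword back before measuring, since the question asks for 'Mozilla' to be counted too.

```perl
use strict;
use warnings;

# Count lines whose user agent ("Mozilla" plus everything after it)
# is longer than $limit characters. Name and signature are hypothetical.
sub count_long_agents {
    my ($limit, @log) = @_;
    my $over = 0;
    # grep keeps only the lines that mention the keyword
    for my $line (grep { /Mozilla/ } @log) {
        chomp(my $copy = $line);                     # don't count the newline
        my $rest = (split(/Mozilla/, $copy, 2))[1];  # text after the first "Mozilla"
        $over++ if length("Mozilla$rest") > $limit;
    }
    return $over;
}

# Hypothetical sample data; only the first line comes from the question.
my $long_agent = 'Mozilla/4.0 (' . ('x' x 120) . ')';
my @lines = (
    qq{ip here - - 02/Jan/2001:00:09:30 +0000 "POST /path HTTP/1.1" 200 132 Mozilla/4.0 (compatible; MSIE 5.01; Windows 98)\n},
    qq{ip here - - 02/Jan/2001:00:09:31 +0000 "GET /path HTTP/1.1" 200 64 $long_agent\n},
    qq{ip here - - 02/Jan/2001:00:09:32 +0000 "GET /path HTTP/1.1" 200 64 -\n},
);

print count_long_agents(100, @lines), " over 100\n";
```

Reading the file line by line, as in the loop from the reply, avoids holding every line in memory at once; the grep form is mainly shorter to write.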
Re: grabbing info from log after key word by turnstep (Parson) on Jan 04, 2001 at 00:28 UTC
I'd throw the answers in a hash: for a typical logfile, you'll get a lot of the same answers anyway:

open(LOGFILE, "$mylogfile") or die "Could not open $mylogfile: $!\n";
my %useragent;
while(<LOGFILE>) {
  if (/Mozilla(.*)$/) {
    $useragent{$1}++;
  }
}
close(LOGFILE);

## Alphabetically sorted:
## (the name is passed as a %s argument so a '%' in it cannot confuse printf)
for (sort keys %useragent) {
  printf "Length: %3d Frequency: %5d Name: %s\n", length $_, $useragent{$_}, $_;
}

## Sorted by frequency:
for (sort {$useragent{$a} <=> $useragent{$b}} keys %useragent) {
  printf "Length: %3d Frequency: %5d Name: %s\n", length $_, $useragent{$_}, $_;
}

## Sorted by length:
for (map { $_->[0] }
     sort { $a->[1] <=> $b->[1] }
     map { [ $_, length $_ ] } keys %useragent) {
  printf "Length: %3d Frequency: %5d Name: %s\n", length $_, $useragent{$_}, $_;
}
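To tie the hash idea back to the original question, the tally can also give the "how many are over 100" count: add up the frequencies of the keys that are too long. The following is only a sketch under some assumptions. The sub names `tally_agents` and `lines_over` and the sample lines are invented for illustration, and the capture here includes the keyword itself, which differs from the regex in this reply.

```perl
use strict;
use warnings;

# Tally each distinct user agent (keyword included) from a list of log lines.
# Hypothetical helper, not part of any module.
sub tally_agents {
    my %count;
    for (@_) {
        $count{$1}++ if /(Mozilla.*)/;   # '.' stops at the newline
    }
    return %count;
}

# Sum the frequencies of all agents longer than $limit characters.
sub lines_over {
    my ($limit, %count) = @_;
    my $over = 0;
    for my $agent (keys %count) {
        $over += $count{$agent} if length($agent) > $limit;
    }
    return $over;
}

# Hypothetical sample data: one short agent, one long agent seen twice.
my $long_agent = 'Mozilla/4.0 (' . ('x' x 120) . ')';
my @log = (
    "... Mozilla/4.0 (compatible; MSIE 5.01; Windows 98)\n",
    "... $long_agent\n",
    "... $long_agent\n",
);

my %count = tally_agents(@log);
printf "%d distinct agents, %d lines over 100\n",
    scalar(keys %count), lines_over(100, %count);
```

Because identical agents share one hash key, each distinct string is measured only once no matter how many lines it appears on.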
Re: grabbing info from log after key word by EvanK (Chaplain) on Jan 04, 2001 at 00:18 UTC
Try using a matching expression like `m/Mozilla(.*)/;` then assign the `$capture` variable to `"Mozilla$1"`.
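Spelled out, that suggestion might look something like the sketch below, run against the sample line from the question. The sub name `user_agent` is made up for illustration; it returns undef when the keyword is absent so the caller can skip such lines.

```perl
use strict;
use warnings;

# Return "Mozilla" plus everything after it on the line, or undef if the
# keyword does not appear. Hypothetical helper name.
sub user_agent {
    my ($line) = @_;
    if ($line =~ m/Mozilla(.*)/) {
        return "Mozilla$1";   # $1 is the text after the keyword, without the newline
    }
    return undef;
}

# Sample line from the question.
my $line = qq{ip here - - 02/Jan/2001:00:09:30 +0000 "POST /path HTTP/1.1" 200 132 Mozilla/4.0 (compatible; MSIE 5.01; Windows 98)\n};

my $capture = user_agent($line);
if (defined $capture) {
    print "$capture\n";
    print length($capture), " characters\n";
}
```

Putting the keyword inside the parentheses, as in `m/(Mozilla.*)/`, would give the same string directly in `$1`.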