in reply to Parsing Apache logs with Regex

With the following code (unmodified from your post):

use strict;
use warnings;

my $log_pattern = q{(.*) \- \- \[(.*)\] \"(.*) (.*)\?(.*) HTTP\/(.*)\" ([0-9]*) ([0-9]*) \"(.*)\" \"(.*)\" \"(.*)\"};

my $entry = '67.60.185.31 - - [14/Jan/2008:02:25:54 -0800] "GET /display.cgi?2643943|3334115 HTTP/1.1" 200 55 "-" "Mozilla/5.0 (Macintosh; U; Intel Mac OS X; en-us) AppleWebKit/523.10.6 (KHTML, like Gecko)" "67.60.185.31"';

$entry =~ /$log_pattern/;

print $1, "\n";
print $2, "\n";
print $3, "\n";
print $4, "\n";
print $5, "\n";
print $6, "\n";
print $7, "\n";
print $8, "\n";
print $9, "\n";
print $10, "\n";
print $11, "\n";

I get the output

67.60.185.31
14/Jan/2008:02:25:54 -0800
GET
/display.cgi
2643943|3334115
1.1
200
55
-
Mozilla/5.0 (Macintosh; U; Intel Mac OS X; en-us) AppleWebKit/523.10.6 (KHTML, like Gecko)
67.60.185.31

How are you calling your expression?

Re^2: Parsing Apache logs with Regex
by TheGorf (Novice) on Dec 31, 2008 at 20:32 UTC
    OK, so I stepped back and used your example, but added in my file read, and now with this code:

    #!/usr/bin/perl -w
    use strict;
    use warnings;

    my $log_pattern = q{(.*) \- \- \[(.*)\] \"(.*) (.*)\?(.*) HTTP\/(.*)\" ([0-9]*) ([0-9]*) \"(.*)\" \"(.*)\" \"(.*)\"};

    open (LOG, "< $ARGV[0]") or die "Cannot open file $ARGV[0]\n";
    my @log = <LOG>;
    close ( LOG );

    my $line;
    foreach $line (@log) {
        $line =~ /$log_pattern/;
        print $1."\n";
        print $2."\n";
        print $3."\n";
        print $4."\n";
        print $5."\n";
        print $6."\n";
        print $7."\n";
        print $8."\n";
        print $9."\n";
        print $10."\n";
        print $11."\n";
    }
    close(SEM);
    I get this:
    Use of uninitialized value in concatenation (.) or string at parselogs line 23.
    Use of uninitialized value in concatenation (.) or string at parselogs line 24.
    Use of uninitialized value in concatenation (.) or string at parselogs line 25.
    Use of uninitialized value in concatenation (.) or string at parselogs line 26.
    Use of uninitialized value in concatenation (.) or string at parselogs line 27.
    Now I am very confused.

      The concatenation warnings appear because $7 through $11 were never set, i.e. your regex failed to match the line as you expected. Are you sure the lines in your file match what you posted?
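One way to see which lines fail is to test the result of the match before touching the capture variables. The following is only a sketch: it reuses the pattern from the thread, and the second sample line is a made-up entry (no query string, only two quoted fields) chosen to show a non-matching case.

```perl
use strict;
use warnings;

# Pattern from the thread, unchanged apart from the removed line-wrap marker.
my $log_pattern = q{(.*) \- \- \[(.*)\] \"(.*) (.*)\?(.*) HTTP\/(.*)\" ([0-9]*) ([0-9]*) \"(.*)\" \"(.*)\" \"(.*)\"};

# Sample input: the first line is the entry from the thread,
# the second is a hypothetical line that the pattern cannot match.
my @lines = (
    '67.60.185.31 - - [14/Jan/2008:02:25:54 -0800] "GET /display.cgi?2643943|3334115 HTTP/1.1" 200 55 "-" "Mozilla/5.0 (Macintosh; U; Intel Mac OS X; en-us) AppleWebKit/523.10.6 (KHTML, like Gecko)" "67.60.185.31"',
    '127.0.0.1 - - [14/Jan/2008:02:26:01 -0800] "GET /index.html HTTP/1.1" 200 512 "-" "curl/7.16.3"',
);

my ($matched, $skipped) = (0, 0);
for my $line (@lines) {
    # In list context a match returns the captures, or an empty list on failure.
    my @fields = $line =~ /$log_pattern/;
    if (@fields) {
        $matched++;
        print join("\n", @fields), "\n";
    }
    else {
        $skipped++;
        warn "No match: $line\n";
    }
}
print "matched=$matched skipped=$skipped\n";
```

Lines reported as "No match" are the ones that would otherwise produce the uninitialized-value warnings.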

      In any case, the suggestions to use Apache::ParseLog are being given by very smart people. Unless there is a strong reason not to, I'd say do what they say.

        The problem with Apache::ParseLog is that all it does is generate reports and the like. I need something that will let me break up each line so I can insert the fields into a database, which we will then crunch for the patterns we are looking for.
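As a rough sketch of splitting a line into fields ready for a database insert, the captures can be loaded into a hash keyed by column name. The column names and the parse_line helper below are hypothetical, not something from the thread, and the actual insert (for example through DBI with placeholders) is left out.

```perl
use strict;
use warnings;

# Hypothetical column names, one per capture group in the thread's pattern.
my @columns = qw(host timestamp method path query protocol status bytes referer agent extra);

# Pattern from the thread, unchanged apart from the removed line-wrap marker.
my $log_pattern = q{(.*) \- \- \[(.*)\] \"(.*) (.*)\?(.*) HTTP\/(.*)\" ([0-9]*) ([0-9]*) \"(.*)\" \"(.*)\" \"(.*)\"};

# Hypothetical helper: returns a hash ref keyed by column name,
# or nothing if the line does not match the pattern.
sub parse_line {
    my ($line) = @_;
    my @fields = $line =~ /$log_pattern/ or return;
    my %record;
    @record{@columns} = @fields;    # hash slice: pair each column with its capture
    return \%record;
}

my $entry = '67.60.185.31 - - [14/Jan/2008:02:25:54 -0800] "GET /display.cgi?2643943|3334115 HTTP/1.1" 200 55 "-" "Mozilla/5.0 (Macintosh; U; Intel Mac OS X; en-us) AppleWebKit/523.10.6 (KHTML, like Gecko)" "67.60.185.31"';

my $record = parse_line($entry);
if ($record) {
    print "$_ = $record->{$_}\n" for @columns;
}
```

The values in the hash could then be passed, in @columns order, as bind parameters to whatever insert statement the database layer uses.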