As a first step, I'm trying to pull the time and HTTP referrer out of web log data, but it's not going well. The only way I can see to do it is to strip the unwanted parts from each log line and then use the timelocal function to convert the log time to epoch seconds, so it can be compared with whatever math is done on the current time. Here's what I have so far as a test:
The last $log_line substitution does not work.

    #!/usr/local/bin/perl
    use CGI qw(:standard);
    use CGI::Carp qw(fatalsToBrowser carpout);
    use Time::Local;

    print "Content-type: text/html\n\n";
    #$time = timelocal($sec,$min,$hour,$mday,$mon,$year);
    open LOGFILE, "datafile.html" or die "Cannot open datafile.html: $!";
    @log_data = <LOGFILE>;
    close LOGFILE;
    foreach $log_line (@log_data) {
        $log_line =~ s/.*\[/ /;           # drop everything up to and including the "["
        $log_line =~ s/"GET.*"(?=h)/ /;   # drop the request, status and size; keep the "h" of http
        $log_line =~ s/".*/ /;            # drop everything from the next quote onward
        print $log_line, "<p>";
    }
The datafile.html contains data in this form (the date/times are enclosed in square brackets):
24.208.200.247 - - [10/Dec/2002:18:05:09 -0500] "GET /images/header_aod2_08.gif HTTP/1.0" 200 663 "http://www.indystar.com/help/help/available.html" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98; H010818)"
24.208.200.247 - - [10/Dec/2002:18:05:09 -0500] "GET /images/header_aod2_10.gif HTTP/1.0" 304 - "http://www.indystar.com/help/help/available.html" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98; H010818)"
24.208.200.247 - - [10/Dec/2002:18:05:09 -0500] "GET /images/storysearch2.gif HTTP/1.0" 200 142 "http://www.indystar.com/help/help/available.html" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98; H010818)"
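For comparison, here is a minimal sketch of an alternative to stripping: capture the date fields and the referrer with one regex, then convert the timestamp to epoch seconds. It assumes every line follows the combined log layout shown above. The helper name parse_log_line and the %month_num table are made up for illustration, and it uses timegm plus the logged UTC offset rather than timelocal, so the result does not depend on the server's own time zone.

```perl
#!/usr/local/bin/perl
use strict;
use warnings;
use Time::Local;

# Month abbreviation to 0-based month number, as timegm expects.
my %month_num;
@month_num{qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec)} = (0 .. 11);

# Hypothetical helper: returns (epoch_seconds, referrer) for one log line,
# or an empty list if the line does not match the assumed layout.
sub parse_log_line {
    my ($line) = @_;
    my ($mday, $mon, $year, $hour, $min, $sec, $sign, $off_h, $off_m, $referrer) =
        $line =~ m{
            \[ (\d{1,2}) / (\w{3}) / (\d{4})    # day, month name, year
            : (\d{2}) : (\d{2}) : (\d{2})       # hour, minute, second
            \s+ ([+-]) (\d{2}) (\d{2}) \]       # UTC offset such as -0500
            \s+ "[^"]*"                         # quoted request
            \s+ \S+ \s+ \S+                     # status code and size
            \s+ "([^"]*)"                       # quoted referrer
        }x or return;
    return unless exists $month_num{$mon};

    # Treat the stamp as UTC first, then remove the logged offset.
    my $offset = ($off_h * 3600 + $off_m * 60) * ($sign eq '-' ? -1 : 1);
    my $epoch  = timegm($sec, $min, $hour, $mday, $month_num{$mon}, $year) - $offset;
    return ($epoch, $referrer);
}

# Sample line taken from the data above.
my $sample = '24.208.200.247 - - [10/Dec/2002:18:05:09 -0500] '
           . '"GET /images/header_aod2_08.gif HTTP/1.0" 200 663 '
           . '"http://www.indystar.com/help/help/available.html" '
           . '"Mozilla/4.0 (compatible; MSIE 5.5; Windows 98; H010818)"';

if (my ($epoch, $referrer) = parse_log_line($sample)) {
    print scalar(gmtime($epoch)), " UTC  $referrer\n";
}
```

Because the result is plain epoch seconds, it can be compared directly against time(), for example time() - $epoch for the age of the hit in seconds.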
In reply to pulling by regex by mkent