dbuk has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks, This is probably a no brainer for anyone with half a clue about Perl. Unfortunately I have neither clue nor brain this morning.

I have a tab-delimited file containing the output from an HTTP proxy, with data in the following format:

Date        time      User ID  Source IP
YYYY-MM-DD  HH:MM:SS  text     123.456.789.123

Dest.IP          Uri-Stem  Ref.    Bytes
123.456.789.123  URL       NUMBER  NUMBER

Specifically I need to parse the time portion to pick out any activity between the hours of 08:45:00 and 12:00:00 and then 02:00:00 and 17:00:00.

I just can't seem to get my head around either the regex to parse it or the time function that might make it easier. Can anyone help?

Many Thanks.

DB

Many Thanks to all who answered. I now have a clue, still no brain though :)

DB

Replies are listed 'Best First'.
Re: parsing time information from a TAB delimited array, kind of.
by davorg (Chancellor) on Sep 24, 2004 at 09:34 UTC

    Given your time format, you don't need to worry about actually handling it as a time. You can just handle it as a string and things will still just work.

    while (<LOGFILE>) {
      next unless /\b(\d\d:\d\d:\d\d)\b/;
      if (($1 ge '08:45:00' && $1 le '12:00:00') or
          ($1 ge '14:00:00' && $1 le '17:00:00')) {
        # do stuff.
        # line is in $_, time is in $1
      }
    }
    --
    <http://www.dave.org.uk>

    "The first rule of Perl club is you do not talk about Perl club."
    -- Chip Salzenberg
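
A minimal sketch of the string-comparison approach from davorg's reply, wrapped in a hypothetical in_office_hours helper (the name is not from the thread). Plain ge/le works because HH:MM:SS is fixed-width and zero-padded, so string order matches chronological order. The 14:00:00 start follows davorg's reading of the question.

```perl
use strict;
use warnings;

# Hypothetical helper: true if a zero-padded HH:MM:SS string falls inside
# either window. The 14:00:00 start is an assumption about the question.
sub in_office_hours {
    my ($time) = @_;
    return 1 if $time ge '08:45:00' && $time le '12:00:00';
    return 1 if $time ge '14:00:00' && $time le '17:00:00';
    return 0;
}

# The last value is deliberately unpadded to show the limit of the trick.
for my $time ( '08:44:59', '08:45:00', '13:30:00', '9:00:00' ) {
    printf "%-8s %s\n", $time, in_office_hours($time) ? 'match' : 'no match';
}
```

The unpadded '9:00:00' is rejected even though 9 a.m. lies inside the first window, because '9' sorts after '1' as a string. The comparison is only safe while every time has the same width.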

Re: parsing time information from a TAB delimited array, kind of.
by Prior Nacre V (Hermit) on Sep 24, 2004 at 09:37 UTC

    Here's some (untested) skeleton code which should get you going.

    . . .
    use constant TIME_FIELD => 1;
    . . .
    # Read this data from a config file
    my %periods = qw(08:45:00 12:00:00 02:00:00 17:00:00);
    . . .
    while (my $line = <TAB_DEL_FILE>) {
        chomp $line;
        my @data = split /\t/, $line;
        my $time = $data[TIME_FIELD];

        foreach my $start (keys %periods) {
            next unless $time ge $start && $time le $periods{$start};
            process_line(\@data);
            last;
        }
    }

    Regards,

    PN5
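
A runnable variation on the skeleton above, offered as a sketch rather than a tested solution for the real log. It keeps the periods as an array of [start, end] pairs instead of a hash, so two windows may share a start time. The helper name fields_if_in_period, the sample lines, and the 14:00:00 start are all assumptions.

```perl
use strict;
use warnings;

use constant TIME_FIELD => 1;    # second tab-separated column, as in the skeleton

# [start, end] pairs. The 14:00:00 start is an assumed reading of the question.
my @periods = ( [ '08:45:00', '12:00:00' ], [ '14:00:00', '17:00:00' ] );

# Hypothetical helper: returns the split fields if the line's time falls in
# one of the periods, and nothing otherwise.
sub fields_if_in_period {
    my ($line) = @_;
    chomp $line;
    my @data = split /\t/, $line;
    my $time = $data[TIME_FIELD];
    return unless defined $time;
    for my $period (@periods) {
        my ( $start, $end ) = @$period;
        return @data if $time ge $start && $time le $end;
    }
    return;
}

# Invented sample lines standing in for the proxy log.
my @sample = (
    "2004-09-24\t09:15:30\tdbuk\t10.0.0.1\t10.0.0.2\t/index.html\t0\t512\n",
    "2004-09-24\t13:05:00\tdbuk\t10.0.0.1\t10.0.0.2\t/news.html\t0\t2048\n",
);

for my $line (@sample) {
    my @data = fields_if_in_period($line);
    print join( ' | ', @data ), "\n" if @data;
}
```

Only the first sample line is printed; in real use the loop would read from the log's filehandle instead of @sample.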

Re: parsing time information from a TAB delimited array, kind of.
by Jasper (Chaplain) on Sep 24, 2004 at 09:15 UTC
    You don't want a regex to do this. You want an SQL query to do this. Run the file through DBX (I think), and then you can SELECT ... WHERE time BETWEEN on it like any other SQL database. A regex for betweens is a nightmare. I know, I've tried it before.
Re: parsing time information from a TAB delimited array, kind of.
by TedPride (Priest) on Sep 24, 2004 at 09:26 UTC
    Your formatting wasn't entirely clear, but I'm assuming the data will all be on one tab delimited line. I'm also assuming you mean between 17 and 02, not between 02 and 17.
    $line = "YYYY-MM-DD\t17:01:00\ttext\t123.456.789.123\t123.456.789.123\tURL\tNUMBER\tNUMBER";
    $line =~ /.*?\t(\d\d):(\d\d):(\d\d)/;
    $time = $1*3600 + $2*60 + $3;    # seconds since midnight
    # 31500 = 08:45:00, 43200 = 12:00:00, 7200 = 02:00:00, 61200 = 17:00:00
    if ($time >= 31500 && $time <= 43200 || $time <= 7200 || $time >= 61200) {
        print "It's a match.";
    }
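
The seconds-since-midnight idea above can be generalised so the thresholds are not hard-coded. This is a sketch under the same assumption TedPride makes, that the second window runs from 17:00 through midnight to 02:00; to_seconds and in_window are hypothetical helper names.

```perl
use strict;
use warnings;

# Convert HH:MM:SS to seconds since midnight; undef if the string is malformed.
sub to_seconds {
    my ($hms) = @_;
    my ( $h, $m, $s ) = $hms =~ /^(\d\d):(\d\d):(\d\d)$/ or return;
    return $h * 3600 + $m * 60 + $s;
}

# Hypothetical helper: 1 if $time lies in [$start, $end], else 0. When $start
# is later than $end, the window is taken to wrap past midnight.
sub in_window {
    my ( $time, $start, $end ) = map { scalar to_seconds($_) } @_;
    return 0 unless defined $time && defined $start && defined $end;
    my $inside = $start <= $end
        ? ( $time >= $start && $time <= $end )
        : ( $time >= $start || $time <= $end );
    return $inside ? 1 : 0;
}

for my $time (qw(08:45:00 12:30:00 17:01:00 01:59:59)) {
    my $hit = in_window( $time, '08:45:00', '12:00:00' )
           || in_window( $time, '17:00:00', '02:00:00' );
    print "$time ", ( $hit ? 'match' : 'no match' ), "\n";
}
```

Of the four sample times, only 12:30:00 is reported as no match. If the afternoon window was really meant to be 14:00 to 17:00, passing those two values works unchanged, since the wrap-around branch is only taken when the start is later than the end.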
Re: parsing time information from a TAB delimited array, kind of.
by Random_Walk (Prior) on Sep 24, 2004 at 10:24 UTC

    Assuming you really meant 08:45-12:00 and 17:00-02:00, filter your log through this, or give it one or more log files as parameters:

    perl -pe'$t=+(split)[1];$t=~s/://g;if(($t<=20000)or($t>=84500 and$t<=120000)or($t>=170000)){}else{$_=""}'

    Cheers,
    R.

Re: parsing time information from a TAB delimited array, kind of.
by steves (Curate) on Sep 24, 2004 at 12:24 UTC

    If you want to do this in terms of real time values you could also use either Time::Local or Date::Manip to convert your string values to time values.
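
A minimal sketch of the Time::Local route, assuming the date and time columns look like YYYY-MM-DD and HH:MM:SS as in the question. The to_epoch helper is hypothetical, and treating the stamps as UTC via timegm is an assumption; timelocal would be the choice if the proxy logs local time.

```perl
use strict;
use warnings;
use Time::Local qw(timegm);

# Hypothetical helper: turn the log's date and time columns into epoch
# seconds. Returns nothing if either string is malformed.
sub to_epoch {
    my ( $date, $time ) = @_;
    my ( $year, $mon, $mday ) = $date =~ /^(\d{4})-(\d\d)-(\d\d)$/ or return;
    my ( $hour, $min, $sec )  = $time =~ /^(\d\d):(\d\d):(\d\d)$/  or return;
    # Time::Local expects a 0-based month.
    return timegm( $sec, $min, $hour, $mday, $mon - 1, $year );
}

# Invented sample values for one day of the log.
my $start = to_epoch( '2004-09-24', '08:45:00' );
my $end   = to_epoch( '2004-09-24', '12:00:00' );
my $seen  = to_epoch( '2004-09-24', '09:15:30' );

print "in range\n" if $seen >= $start && $seen <= $end;
```

Real time values matter mostly when a window spans more than one day or when durations are needed; for a same-day filter the string comparison is simpler.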