Survivor has asked for the wisdom of the Perl Monks concerning the following question:

Hi,

I have 3 files (these are the log files of the system). These files have a date format like this (Dec 5 09:02:01). I need to read these files from the first date (Dec 5 09:02:01) to the last date (Dec 17 17:34:02) in the file. Then I have to search for a word within each one-minute interval and count how many times that word occurs between the start and end of that interval.

For example:

Start time        End time          Number of times word found
Dec 5 09:02:01    Dec 5 09:03:00    10   (first entry)
Dec 5 09:03:01    Dec 5 09:03:00    5    (second entry)
Dec 5 09:04:01    Dec 5 09:04:00    3    (third entry)

and so on till the last entry of the date (Dec 17 17:34:02)

How can I handle these dates? I have read about Time::Local, but that did not solve my problem. I will be very thankful for any help.

Many thanks,

Replies are listed 'Best First'.
Re: Problems in dates and time additions by one minute
by ww (Archbishop) on Dec 18, 2011 at 20:42 UTC

    I have a problem applying this to a log where two out of three entries end before they start; that is, where the start times are later than the corresponding end times.

    Assuming that's a mere glitch in posting, please help me understand your log format a bit better:

    • Does each entry start :59 seconds after the previous, so you have complete*1 coverage, in (roughly) one minute increments throughout the period you want to analyse?
          *1 Is the coverage in 60 second time units or is there a gap of a second between each entry... and is that significant?
    • Does all the data for a given entry appear on a single (logical) line; that is, without embedded newlines?
    • From your problem description, I gather that some entries will NOT contain any given specific (arbitrary) word but that some or all will have multiple instances of some word. It's also unclear whether any entry can contain multiple instances of more than one word. So, how many discrete words can occur in any given log entry? Is there a limit to the number of instances of a particular word within a single log entry? Can there be multiple instances of more than one word in an entry?
    The answers to that last set of questions bear directly on how to attack the problem.

    I'm also unclear why you feel Time::Local is inadequate (assuming the end times are later than the start times). From its documentation:

    SYNOPSIS
        $time = timelocal($sec,$min,$hour,$mday,$mon,$year);
        $time = timegm($sec,$min,$hour,$mday,$mon,$year);

    DESCRIPTION
        This module provides functions that are the inverse of built-in perl functions "localtime()" and "gmtime()". They accept a date as a six-element array, and return the corresponding time(2) value in seconds since the system epoch (Midnight, January 1, 1970 GMT on Unix, for example). This value can be positive or negative, though POSIX only requires support for positive values, so dates before the system's epoch may not work on all operating systems.

    So, split the date-and-time component of the log entry to feed to Time::Local for conversion to epoch-seconds; add 60 (or 59) to identify the one minute span (unless your log is a series of one minute spans, in which case, why do you need to worry about conversions?), and proceed.
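A minimal sketch of that conversion, assuming the stamp looks like "Dec 5 09:02:01" as in the question. The helper name `stamp_to_epoch` is hypothetical, and because the log carries no year, the year passed in here (2011) is an assumption:

```perl
use strict;
use warnings;
use Time::Local qw(timelocal);

# Month abbreviation -> 0-based month number, as timelocal() expects.
my %month_num;
@month_num{qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec)} = 0 .. 11;

# Hypothetical helper: convert "Dec 5 09:02:01" to epoch seconds.
# The log has no year, so the caller must supply one (an assumption).
sub stamp_to_epoch {
    my ($stamp, $year) = @_;
    my ($mon, $mday, $hour, $min, $sec) =
        $stamp =~ /^(\w{3})\s+(\d+)\s+(\d+):(\d+):(\d+)/
        or return;                       # no match: not a log stamp
    return unless exists $month_num{$mon};
    return timelocal($sec, $min, $hour, $mday, $month_num{$mon}, $year);
}

my $start = stamp_to_epoch('Dec 5 09:02:01', 2011);
my $end   = $start + 59;                 # last second of the one-minute span
print "start=$start end=$end\n";
```

Whether to add 59 or 60 depends on whether the end of the span should be inclusive, which is exactly the open question about the log format above.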

    As always, sample data and a clear description of your problem will help us to help you.

    Moving on, and for a WAG only, it may be that using grep, or regexen (perldoc perlretut, among others), and a hash will suffice for the word-counting aspect of your question... but that raises other questions often asked here: "What have you tried? Where's your code?"
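As a rough illustration of the regex-plus-hash idea, the sketch below counts a word per minute. The sample lines are made up (the real log was never posted), and keying the hash on the stamp text truncated to the minute is just one possible choice:

```perl
use strict;
use warnings;

# Hypothetical sample entries; the real log format was not posted.
my @lines = (
    'Dec 5 09:02:01 alpha bravo charlie',
    'Dec 5 09:02:02 bravo Bravo',
    'Dec 5 09:03:10 charlie',
);

my $word = 'bravo';
my %count;    # "Mon D HH:MM" => number of matches in that minute

for my $line (@lines) {
    # Capture the stamp down to the minute, then the rest of the entry.
    my ($minute, $text) =
        $line =~ /^(\w{3}\s+\d+\s+\d+:\d+):\d+\s+(.*)$/ or next;

    # Assigning the list of matches to () and then to a scalar
    # yields the number of matches (case-insensitive, whole words).
    my $hits = () = $text =~ /\b\Q$word\E\b/gi;
    $count{$minute} += $hits if $hits;
}

# Note: a plain string sort is not chronological across days or months.
print "$_ => $count{$_}\n" for sort keys %count;
```

String keys are enough for counting, but to print the minutes in true time order the keys would need to be epoch seconds rather than stamp text.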

    Updated: restored (approximately) the section originally lost to a bad job of editing

Re: Problems in dates and time additions by one minute
by TJPride (Pilgrim) on Dec 18, 2011 at 20:50 UTC
    You didn't give us source data, only the end result, so I had to extrapolate the source data. Here's some code that should hopefully do more or less what you want - expanding it further is left up to you.

    use Time::Local;
    use strict;
    use warnings;

    my $word = 'bravo';
    my $year = (localtime())[5] + 1900;

    my @months = qw/Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec/;
    my %months;
    $months{$months[$_]} = $_ for 0..11;

    my ($ts, $min, $max, %results, $d1, $d2);

    while (<DATA>) {
        chomp;

        ### Calculate timestamp for start of given minute
        if (m/(\w+)\s+(\d+)\s+(\d+):(\d+):(\d+)\s+(.*)/) {
            $ts = timelocal(0, $4, $3, $2, $months{$1}, $year);

            ### Count instances of word
            for (split /\s+/, $6) {
                $results{$ts}++ if uc $_ eq uc $word;
            }
        }
    }

    ### Sort results by timestamp
    for $ts (sort { $a <=> $b } keys %results) {

        ### Start date / time (00)
        $d1 = [localtime($ts)];
        $d1 = $months[$d1->[4]] . sprintf(' %02d %02d:%02d:%02d',
            $d1->[3], $d1->[2], $d1->[1], $d1->[0]);

        ### End date / time (59)
        $d2 = [localtime($ts + 59)];
        $d2 = $months[$d2->[4]] . sprintf(' %02d %02d:%02d:%02d',
            $d2->[3], $d2->[2], $d2->[1], $d2->[0]);

        ### If you want first, second, third, etc, you'll have
        ### to implement that part yourself
        print "$d1 $d2 $results{$ts} matches\n";
    }

    __DATA__
    Dec 5 09:02:01 alpha bravo charlie
    Dec 5 09:02:02 bravo
    Dec 17 17:34:02 bravo charlie tango
      Thanks to everyone for helping me. The reply from "TJPride" has solved my problem, thank you very much. Now I can modify it further. Could you please explain these lines to me in a bit more detail?
      for (split /\s+/, $6) {
          $results{$ts}++ if uc $_ eq uc $word;

      for $ts (sort { $a <=> $b } keys %results) {
          ### Start date / time (00)
          $d1 = [localtime($ts)];
          $d1 = $months[$d1->[4]] . sprintf(' %02d %02d:%02d:%02d',
              $d1->[3], $d1->[2], $d1->[1], $d1->[0]);
        What in particular is not clear to you in these snippets?

        In the first part, $6 contains whatever was matched by the 6th set of parens in the previous regex match (basically, everything in the log entry following the time stamp string), $ts is the "seconds since the epoch" time-stamp that corresponds to the matched time stamp string in the log entry, and the if statement returns true on any case-insensitive match of a "word" token from the log entry to the target $word.

        The second part is a loop over hash keys that are being sorted numerically. The first line in the loop creates an array ref containing the list of values returned by localtime, and then uses selected values from that array to create a string that is formatted the same way as the original log entry time stamp string.
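To make the second part concrete, here is a small sketch that formats one epoch value back into a log-style stamp. The year 2011 is an assumption (the log carries none). The `strftime` line is an alternative that is not in TJPride's code, and its `%b` month name depends on the locale:

```perl
use strict;
use warnings;
use POSIX qw(strftime);
use Time::Local qw(timelocal);

my @months = qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec);

# Build a known timestamp: Dec 5 09:02:00 in an assumed year (2011).
my $ts = timelocal(0, 2, 9, 5, 11, 2011);

# localtime() in list context returns
# ($sec, $min, $hour, $mday, $mon, $year, $wday, $yday, $isdst).
my @t = localtime($ts);

# Same idea as the original: pick month, day, hour, minute, second
# by index (4, 3, 2, 1, 0) and zero-pad the numeric fields.
my $start = sprintf('%s %02d %02d:%02d:%02d', $months[$t[4]], @t[3, 2, 1, 0]);

# Alternative (not in the original code): let strftime do the formatting.
my $end = strftime('%b %d %H:%M:%S', localtime($ts + 59));

print "$start $end\n";
```

The array-slice version keeps the month names under the script's control, which matters if the output has to match the English abbreviations in the log regardless of locale.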

Re: Problems in dates and time additions by one minute
by CountZero (Bishop) on Dec 19, 2011 at 10:22 UTC
    You have us all guessing here, so it would help if you could post a (shortish) example of the log you are using. It would even be better if you could also post what you have tried to solve this task.

    CountZero

    "A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little nor too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James