jc7 has asked for the wisdom of the Perl Monks concerning the following question:

We are trying to scan a log file. The program skips the old entries, and once it finds the first entry (say, line n) within the date/time range, it should scan the rest of the file, starting with that first entry (line n). The following program does everything right except for one thing: it appears to skip the first entry (line n) that is in the date/time range and scans only from the next line (line n+1). What change needs to be made so that scanning starts from the first line (line n) that is within the date/time range? Thank you in advance for your help and prompt reply!
# skip old entries:
while (<LOG>) {
    if (/^\s*([\d\/\-]+\s+[\d\.\:]+)\s+/) {
        next if ( (time() - str2time($1)) > $serverRef->{ScanErrlogLastDays}*24*3600 );
        $ref->{start_check_datetime} = $1;
        last;
    }
}
# scan the entries:
while (<LOG>) {
    …
The <LOG> file has a format like:
……
yyyy-mm-dd hh:mm:ss.nn xxxxx xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx ; line n
yyyy-mm-dd hh:mm:ss.nn xxxxx xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx ; line n+1
……

Replies are listed 'Best First'.
Re: Skip and Scan Lines
by dragonchild (Archbishop) on Mar 18, 2004 at 20:06 UTC
    while (<LOG>) consumes the line from the buffer. So, you find the starting line in the first while-loop, then continue from the next line in the second while loop.
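
A minimal sketch of that behaviour, assuming an in-memory filehandle in place of the real log; the sample lines and the /match/ test are made up for illustration and are not from the original program:

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for the real log file.
my $log_text = "old entry\nfirst match\nsecond match\n";
open(my $log, '<', \$log_text) or die "Cannot open in-memory log: $!";

# First loop: stop at the first line we care about.
my $found;
while (<$log>) {
    if (/match/) {
        $found = $_;    # keep a copy; the next read overwrites $_
        last;
    }
}

# Second loop: its condition reads a new line before the body runs,
# so the line found above never reaches this body.
my @seen;
while (<$log>) {
    push @seen, $_;
}
close $log;

print "found by first loop: $found";
print "seen by second loop: $_" for @seen;
```

Only "second match" reaches the second loop, which is the symptom described in the question.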

    Is there a reason you have the while-loops separated? Why not combine them and instead of breaking out when you find the first line, set a flag that says "I should be handling these lines" once you find the first good line. So, instead of "last when found", do "next unless found or this line is ok".

    ------
    We are the carpenters and bricklayers of the Information Age.

    Please remember that I'm crufty and crochety. All opinions are purely mine and all code is untested, unless otherwise specified.

      Thanks for your suggestion! Then how should I handle the assignment $ref->{start_check_datetime} = $1? We only need to set it once, when we find the first line. Also, how do I set the flag? I am new to Perl and would very much appreciate it if you could give me more details!
Re: Skip and Scan Lines
by graff (Chancellor) on Mar 19, 2004 at 01:32 UTC
    Actually, I see a couple of problems with the logic in your script. First (a minor point), you should call "time()" and compute the comparison value just once, before going into the loop, and store these as scalars.

    Second, if you restructure the logic a little bit, you can do it all with just one while loop -- something like this:

    my $ref_time = time();
    my $ref_delta = $serverRef->{ScanErrlogLastDays}*24*3600;
    while (<LOG>) {
        next unless ( /^\s*([\d\/\-]+\s+[\d\.\:]+)\s+/ );
        next if ( $ref_time - str2time( $1 ) > $ref_delta );
        $ref->{start_check_datetime} = $1;
        # put all the stuff from the second while loop here
    }
    # all done, and you didn't miss the first record
    (update: fixed the second condition (from "unless" to "if") to match the OP's intent. Actually, fixed two other typos in that same line as well -- not having a way to test my work makes me less reliable!)

    another update: Since consecutive lines in the input log file are guaranteed to be in chronological order, it's probably worthwhile to avoid doing the time arithmetic over and over, once you hit the first sought-for record -- I realize that this must have been (part of) the reasoning behind having two while loops in your approach. I also realize that my initial suggestion would keep assigning new values to $ref->{start_check_datetime} -- which is probably a mistake. You could fix this as follows:

    my $seeking = 1;
    my $ref_time = time();
    my $ref_delta = $serverRef->{ScanErrlogLastDays}*24*3600;
    while (<LOG>) {
        if ( $seeking ) {
            next unless ( /^\s*([\d\/\-]+\s+[\d\.\:]+)\s+/ );
            next if ( $ref_time - str2time( $1 ) > $ref_delta );
            $ref->{start_check_datetime} = $1;
            $seeking = 0;
        }
        # put all the stuff from the second while loop here
    }
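
For anyone who wants to try the single-loop approach without the real log, here is a self-contained sketch under several assumptions. The sample log lines, the fixed "now" value, and the two-day window are invented. parse_stamp is a hypothetical stand-in for str2time, built on the core Time::Local module so the script runs without Date::Parse; it only understands "yyyy-mm-dd hh:mm:ss" and treats the stamps as UTC.

```perl
use strict;
use warnings;
use Time::Local qw(timegm);

# Hypothetical stand-in for Date::Parse::str2time; handles only
# "yyyy-mm-dd hh:mm:ss[.nn]" and treats the timestamp as UTC.
sub parse_stamp {
    my ($stamp) = @_;
    my ($y, $mo, $d, $h, $mi, $s) =
        $stamp =~ /^(\d{4})-(\d\d)-(\d\d)\s+(\d\d):(\d\d):(\d\d)/
        or return;
    return timegm($s, $mi, $h, $d, $mo - 1, $y);
}

# Invented sample log, in chronological order.
my $log_text = <<'END';
2004-03-10 08:00:00.00 old entry
2004-03-17 09:30:00.00 first entry in range
continuation line without a timestamp
2004-03-18 10:00:00.00 second entry in range
END
open(my $log, '<', \$log_text) or die "Cannot open in-memory log: $!";

my $ref_time  = timegm(0, 0, 12, 18, 2, 2004);  # fixed "now" so the run is repeatable
my $ref_delta = 2 * 24 * 3600;                  # assumed window: last 2 days
my $seeking   = 1;
my ($start, @scanned);

while (<$log>) {
    if ($seeking) {
        # Still looking for the first entry inside the window.
        next unless /^\s*([\d\/\-]+\s+[\d\.\:]+)\s+/;
        my $stamp = $1;
        my $when  = parse_stamp($stamp);
        next if !defined $when or $ref_time - $when > $ref_delta;
        $start   = $stamp;   # recorded once, on the first entry in range
        $seeking = 0;
    }
    # Stand-in for "all the stuff from the second while loop".
    push @scanned, $_;
}
close $log;

print "start: $start\n";
print "scanned: $_" for @scanned;
```

Once $seeking is cleared, every remaining line is scanned, including lines that carry no timestamp of their own.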
      It works!! Thank you so much for your help!!!
Re: Skip and Scan Lines
by ambrus (Abbot) on Mar 18, 2004 at 20:48 UTC

    You could try to use a do {  } while <LOG> loop instead of the second while (<LOG>) {  }.

      Thanks for your suggestion! Would you please give me a few more details? I would very much appreciate it!

        When the first loop finds the first line you want to process, it exits with that line still in $_. Then the condition of the second while loop reads the next line into $_, thus discarding the first one. You should change

        while (<LOG>) { ... processing log line .... }
        to
        do { ... processing log line .... } while <LOG>;
        Because of the special way perl handles do {} while, the condition is evaluated only after the body has run once, so the body first processes the line that's still in $_.
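
A small runnable sketch of this, again assuming an in-memory filehandle and made-up sample lines. The $found guard is an addition that is not in the suggestion above: if the first loop reaches end of file without finding a line, $_ is left undefined, and an unguarded do block would still run once.

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for the real log file.
my $log_text = "old entry\nfirst match\nsecond match\n";
open(my $log, '<', \$log_text) or die "Cannot open in-memory log: $!";

# First loop: leave with the wanted line still in $_.
my $found = 0;
while (<$log>) {
    if (/match/) {
        $found = 1;
        last;
    }
}

my @processed;
if ($found) {    # added guard: skip the body when nothing was found
    do {
        # The first pass sees the line left in $_ by the loop above.
        push @processed, $_;
    } while (<$log>);    # later passes read the following lines into $_
}
close $log;

print "processed: $_" for @processed;
```

One thing to keep in mind with this form: a do block is not a real loop, so next and last cannot be used inside it the way they can inside while (<LOG>) { }.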