Main logic of this script

johnprince1980 has asked for the wisdom of the Perl Monks concerning the following question:
Hi, I only know Perl from an administration perspective, and trying to understand the below Perl Script primarily the main logic. Though I understand the syntax, but have challenges in understanding the flow. Could someone please explain me an overview of the this script ( Just a brief one would suffice) Here is the code :

#!/usr/bin/perl
#
# The Whizbang Directory Server Access Log Parser
#
# We often have to use multiple pipelines to do things like:
#
# - Find and print all operation entries (including the result line) t
+hat 
# have a certain characteristic, such as a result code or etime.
#
# - Find and print all connections that have certain characteristics,
# such as a specific source IP or BIND DN.
#
# Using pipelines for this works fine functionally, but can result
# in many passes through the access log.  When the access log is
# large, this can take a very long time.  This script attempts to
# perform such operations with only one pass through the log.
#
# The caveat is that the script must store the contents of an
# operation or connection in a data structure until the script
# determines whether to print the contents or not.  So, in a worst
# case scenario, if connection output has been requested, and the 
# script is unable to determine whether or not to print any of the
# connections in the log until the very end, the perl process may
# have the entire log stored in memory.
#
# Use Standard Getopts
use Getopt::Std;
use Time::Local;
#
# Minimize output buffering
$| = 1;
# Month abbreviations
my %mon_abbr = ( Jan => '0', Feb => '1', Mar => '2', Apr => '3', May =
+> '4', Jun => '5',
                 Jul => '6', Aug => '7', Sep => '8', Oct => '9', Nov =
+> '10', Dec => '11' );
my @abbr_mon = qw( Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec );
# Usage
$usage = "$0 [-O <Output Type>] [-S [DD/Mmm/yyyy:]hh:mm:ss] [-E [DD/Mm
+m/yyyy:]hh:mm:ss] [-vidh] expression [file]
    -O Output type - one of connection, operation, sort, stat, or echo
+ - defaults to operation
    -v negative match (unused for sort, stat or echo)
    -i case-insensitive match (unused for sort, stat or echo)
    -I interval for stat reporting, default is 60
    -S Start datetime for log processing
    -E End datetime for log processing
    -d debug script logging
    -h display this help
    expression - Expression to match (unused for sort, stat and echo)
    file - Access log to parse - if missing, we assume STDIN

Notes:
    -O sort produces a sorted, delimited output of connections, only a
+fter EOF on the
       input access log.  This means that all log input is stored in m
+emory at the 
       time printing starts.  It also means that there is no rolling o
+utput such as
       that produced by -O connection or -O operation.  So whereas tai
+l -f piped to
       -O connection or -O operation produces output, tail -f access p
+iped to -O sort
       will never produce output.
    -E and -S datetime arguments can be abbreviated by omitting the da
+te portion.  In
       this case, the first line of the log will be used to generate t
+he datetime constraints.
       This means if you have a log that starts as 23:50 and you want 
+to grab a time range from
       the following morning, you will need to specify the datetime in
+ full on the command line.
    -O stat produces the following statistics:
        date_end - date at the end of the statistics gathering interva
+l
        time_end - timestamp at the end of the statistics gathering in
+terval
        conn_opn - number of connections opened during the statistics 
+gathering interval
        conn_cls - number of connections closed during the statistics 
+gathering interval
        tot_ops - number of operations initiated during the statistics
+ gathering interval
        tot_res - number of results sent during the statistics gatheri
+ng interval
        tot_bind - number of BIND operations initiated during the stat
+istics gathering interval
        tot_srch - number of SRCH operations initiated during the stat
+istics gathering interval
        tot_add - number of ADD operations initiated during the statis
+tics gathering interval
        tot_mod - number of MOD operations initiated during the statis
+tics gathering interval
        tot_del - number of DEL operations initiated during the statis
+tics gathering interval
        etime_av - average etime of results sent during the statistics
+ gathering interval
        etime_mx - maximum etime of results sent during the statistics
+ gathering interval
        un_idx_s - number of unindexed search results completed during
+ the statistics gathering interval
        nentries - number of entries sent during the statistics gather
+ing interval

Examples:

1) Print all operations from the large log file \"access.foo\" that ha
+ve the expression
\" BIND \" in them and also have the expression \" err=32 \" in them. 
+Typically, this means
all BIND operations that failed with err=32.

cat access.foo \| $0 \" BIND \" \| $0 \" err=32 \"

Note that while we are making a second pass through the output of the 
+first iteration, 
an equivalent pipeline of grep, awk, and/or xargs would require many p
+asses through the 
original, large log file.

2) Find all operations from the log file \"access.foo\" that have the 
+expression
\" ADD \", \" MOD \", or \" DEL \" in them, with an LDAP result code o
+ther than 0 or 50. 

$0 \" ADD | MOD | DEL \" access.foo \| $0 -v \" err=0 \| err=50 \"

3) Find all operations from the log file \"access.foo\" that have the 
+expression
\" etime=<n>\" in them, where <n> is not equal to zero.  Typically, th
+is returns all
operations with etime of 1 second or greater.

$0 \" etime=[^0]\" access.foo

Note that this form can be significantly more efficient in how much me
+mory it uses
compared to the -v form.

4) Watch a live access log for wrong password logins.

tail -f access \| $0 \" err=49 \"

5) Grab log output between two datetimes and echo it

$0 -O echo -S 15/Nov/2008:19:00:00 -E 15/Nov/2008:19:59:59 access.foo

6) Examine workload statistics in one-hour increments

$0 -O stat -I 3600 access.foo

";
#
# Process Options
getopts('O:I:S:E:vidh');
#
# Print help
if ($opt_h) {
    print $usage;
    exit(0);
}
# Determine output type
if ($opt_O) {
    if ($opt_O =~ /^connection$|^conn$|^c$/i) {
        $output_format = 'connection';
    }
    elsif ($opt_O =~ /^operation$|^op$|^o$/i) {
        $output_format = 'operation';
    }
    elsif ($opt_O =~ /^sort$|^so$/i) {
        $output_format = 'sort';
    }
    elsif ($opt_O =~ /^statistics$|^statistic$|^stats$|^stat$|^st$/i) 
+{
        $output_format = 'stat';
        $stat_interval = defined($opt_I) ? $opt_I : '60';
    }
    elsif ($opt_O =~ /^e|^echo$/i) {
        $output_format = 'echo';
    }
    else {
        print "Unknown output type: $opt_O\n$usage";
        exit(2);
    }
}
else {
    $output_format = 'operation';
}
verbose("Output format is $output_format\n");
#
# Parse date range arguments
if ($opt_S =~ /^(\d+)\/([A-Za-z]+)\/(\d+)\:(\d+):(\d+):(\d+)$/) {
    $lower_dy = $1;
    $lower_mo = $2;
    $lower_yr = $3;
    $lower_hour = $4;
    $lower_min = $5;
    $lower_sec = $6;
}
elsif ($opt_S =~ /^(\d+):(\d+):(\d+)$/) {
    $lower_hour = $1;
    $lower_min = $2;
    $lower_sec = $3;
}
else {
    if ($opt_S) {
        print "Invalid date format.\n$usage";
        exit(2);
    }
}
if ($opt_E =~ /^(\d+)\/([A-Za-z]+)\/(\d+)\:(\d+):(\d+):(\d+)$/) {
    $upper_dy = $1;
    $upper_mo = $2;
    $upper_yr = $3;
    $upper_hour = $4;
    $upper_min = $5;
    $upper_sec = $6;
}
elsif ($opt_E =~ /^(\d+):(\d+):(\d+)$/) {
    $upper_hour = $1;
    $upper_min = $2;
    $upper_sec = $3;
}
else {
    if ($opt_E) {
        print "Invalid date format.\n$usage";
        exit(2);
    }
}
# Date range calculations
my $lower_epoch;
my $upper_epoch;
# If we have start and/or end dates specified including YYYY MM DD, ca
+lculate lower and/or upper epoch times
if ($opt_S) {
    if ($lower_yr) {
        $lower_epoch = timelocal($lower_sec,$lower_min,$lower_hour,$lo
+wer_dy,$mon_abbr{$lower_mo},$lower_yr - '1900');
    }
}
if ($opt_E) {
    if ($upper_yr) {
        $upper_epoch = timelocal($upper_sec,$upper_min,$upper_hour,$up
+per_dy,$mon_abbr{$upper_mo},$upper_yr - '1900');
    }
}
# Process arguments
if ($output_format =~ /^connection$|^operation$/) {
    # Process argv0 as expression
    unless ($ARGV[0]) {
        die "$output_format output requires a filter expression\n$usag
+e";
    }
    $filt_expr = $ARGV[0];
    if ($ARGV[1]) {
        $in_file = $ARGV[1];
        verbose("Processing file via argv1: $in_file\n");
        open(ACCESS_LOG,"$in_file") or die "Failed to open $in_file\: 
+$!";
    }
    else {
        verbose("Processing file via STDIN\n");
        open(ACCESS_LOG,"-") or die "Failed to get STDIN\: $!";
    }
}
elsif ($output_format =~ /^sort$|^echo$|^stat$/) {
    # Process argv0 as file
    if ($ARGV[0]) {
        $in_file = $ARGV[0];
        verbose("Processing file via argv0: $in_file\n");
        open(ACCESS_LOG,"$in_file") or die "Failed to open $in_file\: 
+$!";
    }
    else {
        verbose("Processing file via STDIN\n");
        open(ACCESS_LOG,"-") or die "Failed to get STDIN\: $!";
    }
}

# Do operation output type processing
if ($output_format eq 'operation') {
    my @skipping;
    my @printing;
    my %pending;
    while (<ACCESS_LOG>) {
        my $log_line = $_;
        verbose("Processing line: $log_line");
        # Use a regular expression to parse the log line
        unless ($log_line =~ /^\[(\d+)\/([A-Za-z]+)\/(\d+)\:(\d+):(\d+
+):(\d+) ([+|-]\d+)\] conn\=([-]*\d+) op\=([-]*\d+) msgId\=([-]*\d+) \
+- (.+)$/) {
            # We didn't match the regex
            verbose("Line failed to parse:\n$log_line");
            next;
        }
        # We captured date, time, connection, and operation data
        my $day = $1;
        my $month = $2;
        my $year = $3;
        my $hour = $4;
        my $minute = $5;
        my $second = $6;
        my $tz_offset = $7;
        my $conn_number = $8;
        my $op_number = $9;
        my $msgid_number = $10;
        my $log_payload = $11;
        #
        # Date-time constraints
        # If we don't have a date yet, take it from the log
        if ($opt_S) {
            unless ($lower_epoch) {
                verbose("Lower bound not present.  Taking from log.\n"
+);
                $lower_epoch = timelocal($lower_sec,$lower_min,$lower_
+hour,$day,$mon_abbr{$month},$year - '1900');
            }
        }
        if ($opt_E) {
            unless ($upper_epoch) {
                verbose("Upper bound not present.  Taking from log.\n"
+);
                $upper_epoch = timelocal($upper_sec,$upper_min,$upper_
+hour,$day,$mon_abbr{$month},$year - '1900');
            }
        }
        # Calculate the current log time in epoch
        $current_epoch = timelocal($second,$minute,$hour,$day,$mon_abb
+r{$month},$year - '1900');
        # Is the current line in the correct range
        if ((($opt_S) && ($current_epoch < $lower_epoch)) || (($opt_E)
+ && ($current_epoch > $upper_epoch))) {
            next;
        }
        #
        # Here we decide what to do with the operation.  We have three
+ data structures,
        # @skipping, @printing and %pending.  Elements of @skipping, @
+printing and keys 
        # to %pending are $conn_number:$op_number:$msgid_number. 
        #
        # In the case of a standard match, when a line is associated 
        # with an operation in @printing, we simply print it.  When a 
+line that is not
        # already associated with an operation being printed matches o
+ur printing 
        # criteria, we add the operation to @printing and check if it'
+s in %pending.
        # If the newly matched operation was in %pending, we print out
+ its previous lines
        # and remove it from %pending.
        #
        # In the case of a negative match, when a line is associated w
+ith an operation
        # in @skipping, we do not print it and go to the next line.  I
+f a line fails
        # to match negatively, its operation key goes into @skipping, 
+because unless
        # every line matches negatively, we will not print the operati
+on.  If the line 
        # has an entry in %pending, that entry is removed.  If a line 
+matches negatively
        # and its operation is not in @skipping, it is either added to
+ %pending as a new
        # or appended to its operation's entry if it already exists.  
+Once the final line
        # of the operation is parsed, if all lines matched negatively,
+ the operation is 
        # printed and purged from %pending. If we reach the end of the
+ log or a server
        # restart, pending operations are printed.
        #
        # Compute the operation key
        my $operation_key;
        if (($op_number eq '-1') || ($msgid_number eq '-1')) {
            # Treat all internal operations as a single operation
            $operation_key = $conn_number . ':' . 'INTERNAL';
        }
        else {
            $operation_key = $conn_number . ':' . $op_number . ':' . $
+msgid_number;
        }
        #
        # If this is a server restart, blow away old operation data
        if (($operation_key eq '0:INTERNAL') && ($log_payload =~ /conn
+ection from/)) {
            verbose("Server restart detected.  Initializing operation 
+state.\n");
            # If we are doing negative matching, print any pending ope
+rations
            if ($opt_v) {
                foreach $pending_key (keys(%pending)) {
                    print $pending{$pending_key};
                }
            }
            undef @skipping;
            undef @printing;
            undef %pending;
        }
        #
        # Set up a flag to say if this line is a result
        undef my $this_is_a_result;
        # Check if the line is a result
        if (($log_payload =~ /^RESULT |^closed\.$|^UNBIND$|^ABANDON /)
+) {
            $this_is_a_result = 'true';
            verbose("This line is a result.\n");
        }
        # Set up a flag to say if this operation is already on the ski
+p list
        undef my $skip_this;
        # Set up a flag to say if this operation is already on the pri
+nt list
        undef my $this_is_printing;
        # Set up a flag to say if this operation is already on the pen
+ding list
        undef my $this_is_pending;
        #
        # Check if the operation is already in @skipping
        if (@skipping) {
            for $i (0..$#skipping) {
                verbose("skip list slot $i - $skipping[$i]\n");
                if ($skipping[$i] eq $operation_key) {
                    $skip_this = 'true';
                    verbose("This operation is already marked for skip
+ping.\n");
                    # If this is a result, remove the operation from @
+skipping
                    if ($this_is_a_result) {
                        splice(@skipping,$i,1);
                        verbose("The operation has been removed from t
+he skipping list.\n");
                    }
                    last;
                }
            }
        }
        #
        # Check if the operation is already in @printing
        if (@printing) {
            for $i (0..$#printing) {
                verbose("print list slot $i - $printing[$i]\n");
                if ($printing[$i] eq $operation_key) {
                    $this_is_printing = 'true';
                    verbose("This operation is already marked for prin
+ting.\n");
                    # If this is a result, remove the operation from @
+printing
                    if ($this_is_a_result) {
                        splice(@printing,$i,1);
                        verbose("The operation has been removed from t
+he printing list.\n");
                    }
                    last;
                }
            }
        }
        #
        # If the operation is already on the print list, print the lin
+e and stop processing it
        if ($this_is_printing) {
            print $log_line;
            next;
        }
        # If the operation is already on the skip list, skip the line 
+and stop processing it
        if ($skip_this) {
            next;
        }
        #
        # Check if there is a pending entry for this operation
        if (%pending) {
            if (exists($pending{$operation_key})) {
                $this_is_pending = 'true';
            }
        }
        #
        # Check if the operation matches our criteria.  At this point,
+ all this means is a
        # pattern match against the current log line.
        # Set up a flag to say whether the line matched
        undef my $this_is_a_match;
        if ($opt_v) {
            # Process as a negative match
            if ($opt_i) {
                # Process case-insensitive
                if ($log_line !~ /$filt_expr/i) {
                    verbose("Line matched.\n");
                    $this_is_a_match = 'true';
                }
            }
            else {
                # Process case-sensitive
                if ($log_line !~ /$filt_expr/) {
                    verbose("Line matched.\n");
                    $this_is_a_match = 'true';
                }
            }
        }
        else {
            # Process as a positive match
            if ($opt_i) {
                # Process case-insensitive
                if ($log_line =~ /$filt_expr/i) {
                    verbose("Line matched.\n");
                    $this_is_a_match = 'true';
                }
            }
            else {
                # Process case-sensitive
                if ($log_line =~ /$filt_expr/) {
                    verbose("Line matched.\n");
                    $this_is_a_match = 'true';
                }
            }
        }
        if ($this_is_a_match) {
            # This line matches our criteria
            #
            # The -v behavior is different.  We expect the operation t
+o be returned
            # if every line in the operation matches the expression ne
+gatively, i.e.,
            # only if the expression is missing from every line in the
+ operation.
            if ($opt_v) {
                # Negative matching behavior - this line matched the e
+xpression
                # negatively.
                # In this case, we never put anything on @printing, be
+cause
                # we never print before we have collected all lines as
+sociated
                # with the operation.  Once we have all lines we can p
+rint.
                #
                # If this is a result, check for a pending entry and p
+rint the operation
                if (($this_is_a_result) && ($this_is_pending)) {
                    print $pending{$operation_key};
                    print $log_line;
                    delete $pending{$operation_key};
                }
                # If this is a result with no pending entry, just prin
+t the line.
                elsif ($this_is_a_result) {
                    print $log_line;
                }
                # If this is a not a result, add it to %pending or app
+end to its
                # existing key.
                else {
                    if ($this_is_pending) {
                        $pending{$operation_key} .= $log_line;
                    }
                    else {
                        $pending{$operation_key} = $log_line;
                    }
                }
            }
            else {
                # Standard matching behavior
                # Push the operation key onto @printing unless this is
+ a result
                unless ($this_is_a_result) {
                    verbose("Pushing $operation_key onto the printing 
+list.\n");
                    push(@printing,$operation_key);
                }
                #
                # If there is a pending entry for this operation print
+ it and remove it
                if ($this_is_pending) {
                    print $pending{$operation_key};
                    delete $pending{$operation_key};
                }
                # 
                # Print this line
                print $log_line;
            }
        }
        else {
            # This line does not match our criteria
            #
            if ($opt_v) {
                # Negative non-matching behavior
                # Push the operation key onto @skipping unless this is
+ a result
                unless ($this_is_a_result) {
                    verbose("Pushing $operation_key onto the skipping 
+list.\n");
                    push(@skipping,$operation_key);
                }
                # If this a pending operation, remove from the pending
+ hash
                if ($this_is_pending) {
                    delete $pending{$operation_key};
                }
            }
            else {
                # Positive non-matching behavior
                # If this is a result on a pending operation, remove f
+rom the pending hash
                if (($this_is_a_result) && ($this_is_pending)) {
                    delete $pending{$operation_key};
                }
                # If this is not a result, but is pending, append to t
+he pending scalar
                elsif ($this_is_pending) {
                    $pending{$operation_key} .= $log_line;
                }
                # If this is not a result, and is not pending, create 
+the pending entry
                else {
                    $pending{$operation_key} = $log_line;
                }
            }
        }
    }
    # If we are doing negative matching, print any pending operations
    if ($opt_v) {
        foreach $pending_key (keys(%pending)) {
            print $pending{$pending_key};
        }
    }
}
# Do connection output type processing
if ($output_format eq 'connection') {
    my @skipping;
    my @printing;
    my %pending;
    while (<ACCESS_LOG>) {
        my $log_line = $_;
        verbose("Processing line: $log_line");
        # Use a regular expression to parse the log line
        unless ($log_line =~ /^\[(\d+)\/([A-Za-z]+)\/(\d+)\:(\d+):(\d+
+):(\d+) ([+|-]\d+)\] conn\=([-]*\d+) op\=([-]*\d+) msgId\=([-]*\d+) \
+- (.+)$/) {
            # We didn't match the regex
            verbose("Line failed to parse:\n$log_line");
            next;
        }
        # We captured date, time, connection, and operation data
        my $day = $1;
        my $month = $2;
        my $year = $3;
        my $hour = $4;
        my $minute = $5;
        my $second = $6;
        my $tz_offset = $7;
        my $conn_number = $8;
        my $op_number = $9;
        my $msgid_number = $10;
        my $log_payload = $11;
        #
        # Date-time constraints
        # If we don't have a date yet, take it from the log
        if ($opt_S) {
            unless ($lower_epoch) {
                verbose("Lower bound not present.  Taking from log.\n"
+);
                $lower_epoch = timelocal($lower_sec,$lower_min,$lower_
+hour,$day,$mon_abbr{$month},$year - '1900');
            }
        }
        if ($opt_E) {
            unless ($upper_epoch) {
                verbose("Upper bound not present.  Taking from log.\n"
+);
                $upper_epoch = timelocal($upper_sec,$upper_min,$upper_
+hour,$day,$mon_abbr{$month},$year - '1900');
            }
        }
        # Calculate the current log time in epoch
        $current_epoch = timelocal($second,$minute,$hour,$day,$mon_abb
+r{$month},$year - '1900');
        # Is the current line in the correct range
        if ((($opt_S) && ($current_epoch < $lower_epoch)) || (($opt_E)
+ && ($current_epoch > $upper_epoch))) {
            next;
        }
        #
        # Here we decide what to do with the connection.  We have thre
+e data structures,
        # @skipping, @printing and %pending.  Elements of @skipping, @
+printing and keys
        # to %pending are $conn_number.
        #
        # In the case of a standard match, when a line is associated
        # with a connection in @printing, we simply print it.  When a 
+line that is not
        # already associated with a connection being printed matches o
+ur printing
        # criteria, we add the connection to @printing and check if it
+'s in %pending.
        # If the newly matched connection was in %pending, we print ou
+t its previous lines
        # and remove it from %pending.
        #
        # In the case of a negative match, when a line is associated w
+ith a connection
        # in @skipping, we do not print it and go to the next line.  I
+f a line fails
        # to match negatively, its connection key goes into @skipping,
+ because unless
        # every line matches negatively, we will not print the connect
+ion.  If the line
        # has an entry in %pending, that entry is removed.  If a line 
+matches negatively
        # and its connection is not in @skipping, it is either added t
+o %pending as new
        # or appended to its connection's entry if it already exists. 
+ Once the final line
        # of the connection is parsed, if all lines matched negatively
+, the connection is
        # printed and purged from %pending. If we reach the end of the
+ log or a server
        # restart, pending connections are printed.
        #
        # Compute the connection key
        my $connection_key;
        $connection_key = $conn_number;
        #
        # If this is a server restart, blow away old operation data
        if (($connection_key eq '0') && ($log_payload =~ /connection f
+rom/)) {
            verbose("Server restart detected.  Initializing connection
+ state.\n");
            # If we are doing negative matching, print any pending con
+nections 
            if ($opt_v) {
                foreach $pending_key (keys(%pending)) {
                    print $pending{$pending_key};
                }
            }
            undef @skipping;
            undef @printing;
            undef %pending;
        }
        #
        # Set up a flag to say if this line is a close
        undef my $this_is_a_close;
        # Check if the line is a close
        if (($op_number eq '-1') && ($msgid_number eq '-1') && ($log_p
+ayload eq 'closed.')) {
            $this_is_a_close = 'true';
            verbose("This line is a connection close.\n");
        }
        # Set up a flag to say if this connection is already on the sk
+ip list
        undef my $skip_this;
        # Set up a flag to say if this connection is already on the pr
+int list
        undef my $this_is_printing;
        # Set up a flag to say if this connection is already on the pe
+nding list
        undef my $this_is_pending;
        #
        # Check if the connection is already in @skipping
        if (@skipping) {
            for $i (0..$#skipping) {
                verbose("print list slot $i - $skipping[$i]\n");
                if ($skipping[$i] eq $connection_key) {
                    $skip_this = 'true';
                    verbose("This connection is already marked for ski
+pping.\n");
                    last;
                }
            }
        }
        #
        # Check if the connection is already in @printing
        if (@printing) {
            for $i (0..$#printing) {
                verbose("print list slot $i - $printing[$i]\n");
                if ($printing[$i] eq $connection_key) {
                    $this_is_printing = 'true';
                    verbose("This connection is already marked for pri
+nting.\n");
                    # If this is a close, remove the connection from @
+printing
                    if ($this_is_a_close) {
                        splice(@printing,$i,1);
                        verbose("The connection has been removed from 
+the printing list.\n");
                    }
                    last;
                }
            }
        }
        #
        # If the connection is already on the print list, print the li
+ne and stop processing it
        if ($this_is_printing) {
            print $log_line;
            next;
        }
        # If the connection is already on the skip list, skip the line
+ and stop processing it
        if ($skip_this) {
            next;
        }
        #
        # Check if there is a pending entry for this connection
        if (%pending) {
            if (exists($pending{$connection_key})) {
                $this_is_pending = 'true';
            }
        }
        #
        # Check if the connection matches our criteria.  At this point
+, all this means is a
        # pattern match against the current log line.
        # Set up a flag to say whether the line matched
        undef my $this_is_a_match;
        if ($opt_v) {
            # Process as a negative match
            if ($opt_i) {
                # Process case-insensitive
                if ($log_line !~ /$filt_expr/i) {
                    verbose("Line matched.\n");
                    $this_is_a_match = 'true';
                }
            }
            else {
                # Process case-sensitive
                if ($log_line !~ /$filt_expr/) {
                    verbose("Line matched.\n");
                    $this_is_a_match = 'true';
                }
            }
        }
        else {
            # Process as a positive match
            if ($opt_i) {
                # Process case-insensitive
                if ($log_line =~ /$filt_expr/i) {
                    verbose("Line matched.\n");
                    $this_is_a_match = 'true';
                }
            }
            else {
                # Process case-sensitive
                if ($log_line =~ /$filt_expr/) {
                    verbose("Line matched.\n");
                    $this_is_a_match = 'true';
                }
            }
        }
        if ($this_is_a_match) {
            # This line matches our criteria
            #
            # The -v behavior is different.  We expect the connection 
+to be returned
            # if every line in the connection matches the expression n
+egatively, i.e.,
            # only if the expression is missing from every line in the
+ connection.
            if ($opt_v) {
                # Negative matching behavior - this line matched the e
+xpression
                # negatively.
                # In this case, we never put anything on @printing, be
+cause
                # we never print before we have collected all lines as
+sociated
                # with the connection.  Once we have all lines we can 
+print.
                #
                # If this is a close, check for a pending entry and pr
+int the connection
                if (($this_is_a_close) && ($this_is_pending)) {
                    print $pending{$connection_key};
                    print $log_line;
                    delete $pending{$connection_key};
                }
                # If this is a close with no pending entry, just print
+ the line.
                elsif ($this_is_a_close) {
                    print $log_line;
                }
                # If this is a not a close, add it to %pending or appe
+nd to its
                # existing key.
                else {
                    if ($this_is_pending) {
                        $pending{$connection_key} .= $log_line;
                    }
                    else {
                        $pending{$connection_key} = $log_line;
                    }
                }
            }
            else {
                # Standard matching behavior
                # Push the connection key onto @printing unless this i
+s a close
                unless ($this_is_a_close) {
                    verbose("Pushing $connection_key onto the printing
+ list.\n");
                    push(@printing,$connection_key);
                }
                #
                # If there is a pending entry for this connection prin
+t it and remove it
                if ($this_is_pending) {
                    print $pending{$connection_key};
                    delete $pending{$connection_key};
                }
                #
                # Print this line
                print $log_line;
            }
        }
        else {
            # This line does not match our criteria
            #
            if ($opt_v) {
                # Negative non-matching behavior
                # Push the connection key onto @skipping unless this i
+s a close
                unless ($this_is_a_close) {
                    verbose("Pushing $connection_key onto the skipping
+ list.\n");
                    push(@skipping,$connection_key);
                }
                # If this a pending connection, remove from the pendin
+g hash
                if ($this_is_pending) {
                    delete $pending{$connection_key};
                }
            }
            else {
                # Positive non-matching behavior
                # If this is a close on a pending connection, remove f
+rom the pending hash
                if (($this_is_a_close) && ($this_is_pending)) {
                    delete $pending{$connection_key};
                }
                # If this is not a close, but is pending, append to th
+e pending scalar
                elsif ($this_is_pending) {
                    $pending{$connection_key} .= $log_line;
                }
                # If this is not a close, and is not pending, create t
+he pending entry
                else {
                    $pending{$connection_key} = $log_line;
                }
            }
        }
    }
    # If we are doing negative matching, print any pending connections
    if ($opt_v) {
        foreach $pending_key (keys(%pending)) {
            print $pending{$pending_key};
        }
    }
}
# Do sort output type processing
if ($output_format eq 'sort') {
    my %connections;
    while (<ACCESS_LOG>) {
        my $log_line = $_;
        verbose("Processing line: $log_line");
        # Use a regular expression to parse the log line
        unless ($log_line =~ /^\[(\d+)\/([A-Za-z]+)\/(\d+)\:(\d+):(\d+
+):(\d+) ([+|-]\d+)\] conn\=([-]*\d+) op\=([-]*\d+) msgId\=([-]*\d+) \
+- (.+)$/) {
            # We didn't match the regex
            verbose("Line failed to parse:\n$log_line");
            next;
        }
        # We captured date, time, connection, and operation data
        my $day = $1;
        my $month = $2;
        my $year = $3;
        my $hour = $4;
        my $minute = $5;
        my $second = $6;
        my $tz_offset = $7;
        my $conn_number = $8;
        my $op_number = $9;
        my $msgid_number = $10;
        my $log_payload = $11;
        #
        # Date-time constraints
        # If we don't have a date yet, take it from the log
        if ($opt_S) {
            unless ($lower_epoch) {
                verbose("Lower bound not present.  Taking from log.\n"
+);
                $lower_epoch = timelocal($lower_sec,$lower_min,$lower_
+hour,$day,$mon_abbr{$month},$year - '1900');
            }
        }
        if ($opt_E) {
            unless ($upper_epoch) {
                verbose("Upper bound not present.  Taking from log.\n"
+);
                $upper_epoch = timelocal($upper_sec,$upper_min,$upper_
+hour,$day,$mon_abbr{$month},$year - '1900');
            }
        }
        # Calculate the current log time in epoch
        $current_epoch = timelocal($second,$minute,$hour,$day,$mon_abb
+r{$month},$year - '1900');
        # Is the current line in the correct range
        if ((($opt_S) && ($current_epoch < $lower_epoch)) || (($opt_E)
+ && ($current_epoch > $upper_epoch))) {
            next;
        }
        #
        # Here we decide what to do with the log entry.  We have a dat
+a structure,
        # %connections, which is a hash.  $conn_number is the key to %
+connections.
        #
        # Compute the connection key
        $connection_key = $conn_number;
        #
        # Set up a flag to say if this line is a connection open
        undef my $this_is_an_open;
        #
        # Set up a flag to say if this line is a result
        undef my $this_is_a_result;
        #
        # Set up a flag to say if this connection is closing
        undef my $this_is_closing;
        #
        # Set up a flag to say if this line is a close
        undef my $this_is_a_close;
        #
        # Check if the line is a result
        if (($log_payload =~ /connection from/) && ($op_number eq '-1'
+) && ($msgid_number eq '-1')) {
            $this_is_an_open = 'true';
            verbose("This line is a connection open.\n");
            $operation_key = 'OPEN';
        }
        # Check if the connection is closing
        elsif (($msgid_number eq '-1') && ($log_payload =~ /closing/))
+ {
            $this_is_closing = 'true';
            verbose("This connection is closing.\n");
            $operation_key = 'CLOSING';
        }
        # Check if the line is a close
        elsif (($op_number eq '-1') && ($msgid_number eq '-1') && ($lo
+g_payload eq 'closed.')) {
            $this_is_a_close = 'true';
            verbose("This line is a connection close.\n");
            $operation_key = 'CLOSED';
        }
        # Check if the line is an internal operation 
        elsif (($op_number eq '-1') || ($msgid_number eq '-1')) {
            verbose("This line is an internal operation.\n");
            $operation_key = 'INTERNAL';
        }
        # Normal operation key
        else {
            $operation_key = $op_number . ':' . $msgid_number;
        }
        # Check if the line is a result
        if (($log_payload =~ /^RESULT /)) {
            $this_is_a_result = 'true';
            verbose("This line is a result.\n");
        }
        # Check for server restart
        if (($this_is_an_open) && ((exists($connections{$connection_ke
+y})) || ($connection_key eq '0'))) {
            # Server restart detected
            verbose("Server restart detected.\n");
            my @conn_nums = sort { $a <=> $b } (keys %connections);
            foreach $conn_num (@conn_nums) {
                print "***********************************************
+*********************************\n";
                print $connections{$conn_num};
                delete $connections{$conn_num};
            }
            undef %connections;
        }
        # Aggregate by connection
        $connections{$connection_key} .= $log_line;
        verbose("Adding the line to connection $connection_key\n");
    }
    my @conn_nums = sort { $a <=> $b } (keys %connections);
    foreach $conn_num (@conn_nums) {
        print "*******************************************************
+***********************************\n";
        print $connections{$conn_num};
    }
}
# Do stat output type processing
#
## Date Computations
## Convert from standard date to epoch
## $epoch_date= timelocal($l_s,$l_mi,$l_h,$l_d,$mon_abbr{$l_mo},$l_y -
+ '1900');
## Convert from epoch to standard date
## ($p_s,$p_mi,$p_h,$p_d,$p_mo,$p_y) = localtime($epoch_date);
## $p_mo = $abbr_mon[$p_mo];
## $p_y += 1900;
#
if ($output_format eq 'stat') {
    # Set up stat interval
    $STAT_INTERVAL = defined($opt_I) ? $opt_I : '60';
    # Set up variables
    my $start_epoch;
    my $end_epoch;
    my $current_epoch;
    my $open_count = 0;
    my $close_count = 0;
    my $op_count = 0;
    my $result_count = 0;
    my $bind_count =0;
    my $srch_count = 0;
    my $add_count = 0;
    my $mod_count = 0;
    my $del_count = 0;
    my $etime_max = 0;
    my $etime_tot = 0;
    my $etime_avg;
    my $unindexed_count = 0;
    my $entries_returned = 0;
    my @rpt_header = qw(lg_date_end time_end conn_opn conn_cls tot_ops
+ tot_res tot_bind tot_srch tot_add tot_mod tot_del etime_av etime_mx 
+un_idx_s nentries); 
    printf("%-12s %8s  %8s  %8s  %8s  %8s  %8s  %8s  %8s  %8s  %8s  %8
+s  %8s  %8s  %8s\n",@rpt_header);
    while (<ACCESS_LOG>) {
        my $log_line = $_;
        verbose("Processing line: $log_line");
        # Use a regular expression to parse the log line
        unless ($log_line =~ /^\[(\d+)\/([A-Za-z]+)\/(\d+)\:(\d+):(\d+
+):(\d+) ([+|-]\d+)\] conn\=([-]*\d+) op\=([-]*\d+) msgId\=([-]*\d+) \
+- (.+)$/) {
            # We didn't match the regex
            verbose("Line failed to parse:\n$log_line");
            next;
        }
        # We captured date, time, connection, and operation data
        my $l_d = $1;
        my $l_mo = $2;
        my $l_y = $3;
        my $l_h = $4;
        my $l_mi = $5;
        my $l_s = $6;
        my $tz_offset = $7;
        my $conn_number = $8;
        my $op_number = $9;
        my $msgid_number = $10;
        my $log_payload = $11;
        #
        # Here we decide what to do with the log entry.
        #
        # Calculate the current log time in epoch
        $current_epoch = timelocal($l_s,$l_mi,$l_h,$l_d,$mon_abbr{$l_m
+o},$l_y - '1900');
        # Date-time constraints
        # If we don't have a date yet, take it from the log
        if ($opt_S) {
            unless ($lower_epoch) {
                verbose("Lower bound not present.  Taking from log.\n"
+);
                $lower_epoch = timelocal($lower_sec,$lower_min,$lower_
+hour,$l_d,$mon_abbr{$l_mo},$l_y - '1900');
            }
        }
        if ($opt_E) {
            unless ($upper_epoch) {
                verbose("Upper bound not present.  Taking from log.\n"
+);
                $upper_epoch = timelocal($upper_sec,$upper_min,$upper_
+hour,$l_d,$mon_abbr{$l_mo},$l_y - '1900');
            }
        }
        # Is the current line in the correct range
        if ((($opt_S) && ($current_epoch < $lower_epoch)) || (($opt_E)
+ && ($current_epoch > $upper_epoch))) {
            next;
        }
        # If this the first line of the log get the start time in epoc
+h
        unless ($start_epoch) {
            $start_epoch = $current_epoch;
            verbose("Starting stats interval at $start_epoch seconds f
+rom epoch\n");
            $end_epoch = $start_epoch + $STAT_INTERVAL - '1';
            ($end_s,$end_mi,$end_h,$end_d,$end_mo,$end_y) = localtime(
+$end_epoch);
            verbose("Current stats interval will end at the end of the
+ second $end_epoch seconds from epoch - $end_d\/$end_mo\/$end_y\:$end
+_h\:$end_mi\:$end_s\n");
        }
        # If the date is after the ending timestamp print stats and re
+set time markers and stats counters
        if ($current_epoch > $end_epoch) {
            if ($result_count > 0) {
                $etime_avg = $etime_tot / $result_count;
            }
            else {
                $etime_avg = '0';
            }
            printf("%02d\/%3s\/%4d  %02d\:%02d\:%02d  %8d  %8d  %8d  %
+8d  %8s  %8s  %8s  %8s  %8s  %8.3f  %8.1f  %8d  %8s\n", $end_d,$abbr_
+mon[$end_mo],$end_y + '1900',$end_h,$end_mi,$end_s,$open_count,$close
+_count,$op_count,$result_count,$bind_count,$srch_count,$add_count,$mo
+d_count,$del_count,$etime_avg,$etime_max,$unindexed_count,$entries_re
+turned);
            $start_epoch = $end_epoch + '1';
            verbose("Starting stats interval at $start_epoch seconds f
+rom epoch\n");
            $end_epoch = $start_epoch + $STAT_INTERVAL - '1';
            ($end_s,$end_mi,$end_h,$end_d,$end_mo,$end_y) = localtime(
+$end_epoch);
            verbose("Current stats interval will end at the end of the
+ second $end_epoch seconds from epoch - $end_d\/$end_mo\/$end_y\:$end
+_h\:$end_mi\:$end_s\n");
            $open_count = 0;
            $close_count = 0;
            $op_count = 0;
            $result_count = 0;
            $bind_count = 0;
            $srch_count = 0;
            $add_count = 0;
            $mod_count = 0;
            $del_count = 0;
            $etime_max = 0;
            $etime_avg = 0;
            $etime_tot = 0;
            $unindexed_count = 0;
            $entries_returned = 0;
        }
        # Parse the log payload and increment counters
        if (($log_payload =~ /^RESULT (.+) nentries=(\d+) etime=(.+?) 
+(.+)$/) || ($log_payload =~ /^RESULT (.+) nentries=(\d+) etime=(.+?)$
+/)) {
            $result_count++;
            $entries_returned += $2;
            $etime_tot += $3;
            if ($3 > $etime_max) { $etime_max = $3; }
            if ($4 =~ /notes\=U/) { $unindexed_count++; }
        }
        elsif ($log_payload =~ /^closed\.$/) {
            $close_count++;
        }
        elsif ($log_payload =~ /connection from /) {
            $open_count++;
        }
        elsif ($log_payload =~ /^ADD /) {
            $add_count++;
            $op_count++;
        }
        elsif ($log_payload =~ /^MOD /) {
            $mod_count++;
            $op_count++;
        }
        elsif ($log_payload =~ /^DEL /) {
            $del_count++;
            $op_count++;
        }
        elsif ($log_payload =~ /^SRCH /) {
            $srch_count++;
            $op_count++;
        }
        elsif ($log_payload =~ /^BIND /) {
            $bind_count++;
            $op_count++;
        }
        elsif ($log_payload =~ /^EXT /) {
            $op_count++;
        }
        else {
            unless ($log_payload =~ /^UNBIND$|^ABANDON |closing/) {
                verbose("Unknown log entry: $log_payload");
            }
        }
    }
    # Print last set of stats
    if ($result_count > 0) {
        $etime_avg = $etime_tot / $result_count;
    }
    else {
        $etime_avg = '0';
    }
    printf("%02d\/%3s\/%4d  %02d\:%02d\:%02d  %8d  %8d  %8d  %8d  %8s 
+ %8s  %8s  %8s  %8s  %8.3f  %8.1f  %8s  %8s\n", $end_d,$abbr_mon[$end
+_mo],$end_y + '1900',$end_h,$end_mi,$end_s,$open_count,$close_count,$
+op_count,$result_count,$bind_count,$srch_count,$add_count,$mod_count,
+$del_count,$etime_avg,$etime_max,$unindexed_count,$entries_returned);
}

# Do echo output type processing
if ($output_format eq 'echo') {
    while (<ACCESS_LOG>) {
        my $log_line = $_;
        verbose("Processing line: $log_line");
        # Use a regular expression to parse the log line
        unless ($log_line =~ /^\[(\d+)\/([A-Za-z]+)\/(\d+)\:(\d+):(\d+
+):(\d+) ([+|-]\d+)\] conn\=([-]*\d+) op\=([-]*\d+) msgId\=([-]*\d+) \
+- (.+)$/) {
            # We didn't match the regex
            verbose("Line failed to parse:\n$log_line");
            next;
        }
        # We captured date, time, connection, and operation data
        my $l_d = $1;
        my $l_mo = $2;
        my $l_y = $3;
        my $l_h = $4;
        my $l_mi = $5;
        my $l_s = $6;
        my $tz_offset = $7;
        my $conn_number = $8;
        my $op_number = $9;
        my $msgid_number = $10;
        my $log_payload = $11;
        # Date-time constraints
        # If we don't have a date yet, take it from the log
        if ($opt_S) {
            unless ($lower_epoch) {
                verbose("Lower bound not present.  Taking from log.\n"
+);
                $lower_epoch = timelocal($lower_sec,$lower_min,$lower_
+hour,$l_d,$mon_abbr{$l_mo},$l_y - '1900');
            }
        }
        if ($opt_E) {
            unless ($upper_epoch) {
                verbose("Upper bound not present.  Taking from log.\n"
+);
                $upper_epoch = timelocal($upper_sec,$upper_min,$upper_
+hour,$l_d,$mon_abbr{$l_mo},$l_y - '1900');
            }
        }
        # Calculate the current log time in epoch
        $current_epoch = timelocal($l_s,$l_mi,$l_h,$l_d,$mon_abbr{$l_m
+o},$l_y - '1900');
        # Is the current line in the correct range
        if ((($opt_S) && ($current_epoch < $lower_epoch)) || (($opt_E)
+ && ($current_epoch > $upper_epoch))) {
            next;
        }
        # Here we decide what to do with the log entry.
        print $log_line;
    }
}
# Close Access Log
if ($in_file) {
    verbose("Closing file $in_file\n");
    close(ACCESS_LOG) or die "Failed to close $in_file\: $!";
}
# A subroutine for verbose logging
sub verbose {
    my $verbose_msg = $_[0];
    if ($opt_d) {
        print STDERR $verbose_msg;
    }
}
[download]
Thank in advance.
Comment on Main logic of this script Download Code
Replies are listed 'Best First'.
Re: Main logic of this script (summary/abstract/outline) by Anonymous Monk on Mar 09, 2015 at 07:22 UTC
Hi, I only know Perl from an administration perspective, and trying to understand the below Perl Script primarily the main logic. Though I understand the syntax, but have challenges in understanding the flow. Could someone please explain me an overview of the this script ( Just a brief one would suffice) Here is the code : Its easy to do, start a new file, and copy/paste the beginning of every "paragraph", that means every branch (if/else/unless) or loop (for/foreach/while/until) Preserve the indentation level This is the logic Its like writing a good scientific essay/report/paper, simply reading the first and last paragraph and the first and last sentence of every intervening paragraph, will revel the main ideas of the paper Are you going to try this?	[reply]
Re^2: Main logic of this script (summary/abstract/outline) by Anonymous Monk on Mar 09, 2015 at 09:21 UTC
Here is my version, 1145 lines reduced to 208 comments Read more... (14 kB) This manual outline generation wouldn't be necessary if the original was skimmable code :)	[reply] [d/l]
Re^3: Main logic of this script (summary/abstract/outline) by Anonymous Monk on Mar 09, 2015 at 09:27 UTC
from 208 to 28 and 18 options var init stuff Read more... (1334 Bytes) meat and potatoes (read the log) Read more... (887 Bytes)	[reply] [d/l] [select]
Re^4: Main logic of this script (summary/abstract/outline) by Anonymous Monk on Mar 09, 2015 at 09:39 UTC
Re: Main logic of this script by Anonymous Monk on Mar 09, 2015 at 08:35 UTC
The code appears to be commented fairly well, but even without comments, `$usage`, and closing curlies, this script still has 575 lines (with ~100 lines of simple variable assignments). A "brief overview" might be difficult to achieve - what exactly do you need it for? If you're trying to learn something, it seems counterproductive to ask someone else to do the work for you. If you're just having trouble understanding some of the bits of this script, or you need to change something, you'll probably achieve more by asking more specific questions - what "challenges in understanding the flow" do you have?	[reply] [d/l]
Re^2: Main logic of this script by johnprince1980 (Initiate) on Mar 09, 2015 at 10:32 UTC
Hi, Thank you all for your comments and suggestions. - I do not understand the epoch part and why the comparision. - Also, I am not able to understand the flow of three data structures ( @skipping, @printing and %pending ). The script scans the access log of Sun Directory Server. Format is within for your reference. https://docs.oracle.com/cd/E19850-01/816-6699-10/logfiles.html Again thanks for your support.	[reply]
Re^3: Main logic of this script by Anonymous Monk on Mar 09, 2015 at 11:16 UTC
The epoch typically refers to Unix time, "a system for describing instants in time, defined as the number of seconds that have elapsed since 00:00:00 Coordinated Universal Time (UTC), Thursday, 1 January 1970, not counting leap seconds." If you have two date/time values in different time zones, you can convert them both to Unix time and compare them easily, since they're both just a number of seconds in a common time system. As for `@skipping`, `@printing` and `%pending`, there's a big comment block describing those directly in the code, and individual comments on just about every operation on those variables, so I'm not really sure what further explanation you are looking for? Is there a specific operation you are not clear on?	[reply] [d/l] [select]
Re: Main logic of this script by ww (Archbishop) on Mar 09, 2015 at 13:56 UTC
TL;DR When you have a well-thought-out question to ask; do so. But, first, remove the irrelevant parts. You'll have better luck getting useful replies when you pose precise questions. *Questions containing the words "doesn't work" (or their moral equivalent) will usually get a downvote from me unless accompanied by:* code verbatim error and/or warning messages *a coherent explanation of what "doesn't work* actually means.**	[reply] [d/l]