I just finished a month-long project that added a suite of related scripts to our "toolbox" arsenal at work. Of course, the second I was done and put it into production, I received a thousand feature requests from co-workers. Some of these requests were easy to accommodate, while others require redesigning and won't be done for a long time to come. One of them seems deceptively simple, and I was hoping you all could help me with it.
One of the scripts monitors a directory for transient files (a race condition) containing "bad" information, moves them to another directory, and writes a log of it. Since this needed to be super duper ultra fast, I write as little to the log file as possible: one terse, space-separated line per file (see the raw sample below).
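To make the record format concrete, here is a minimal sketch of the kind of log write I mean. This is hypothetical (the real monitor script is not shown in this post); the `log_trap` name is mine, and the field order, epoch timestamp, filename, size, trap name, is inferred from the raw sample further down.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical sketch: one minimal, space-separated record per "bad" file.
# Field order matches the raw sample: epoch timestamp, filename, size, trap name.
sub log_trap {
    my ($fh, $file, $size, $name) = @_;
    print {$fh} join(' ', time(), $file, $size, $name), "\n";
}

# Demo write to a temp file standing in for the real trap log.
my ($fh, $logfile) = tempfile();
log_trap($fh, 'do15505x', 467, 'PaulRidge');
close $fh;
```

The point is that the monitor does no formatting at all at write time; everything human-readable is deferred to the companion script.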
Now, many instances of the above script are running on multiple directories, so there are multiple logs. I wrote a companion script to parse those logs and put them into a human-readable format (see the formatted sample below).
The script, shown here in full,
#!/usr/bin/perl -w
use strict;
use Getopt::Std;
use Time::Local;
use POSIX qw(strftime);

$|++;
my %Opt;
my @Conns;

&GetArgs();
&GetConns();
&GetLogs();

sub GetArgs {
    my $Usage = qq{Usage: $0 [options]
  -h : This help message.
  -c : Specific connector - default is to list all connectors.
  -d : Specific direction - default is to list all directions
  -n : Trap name - default is to list all names
  -t : Time in stamp format mm/dd/yy-hh:mm or mm/dd/yy
       +<stamp> - show entries created after specified stamp
                  If time is not given, defaults to 23:59
       -<stamp> - show entries created before specified stamp
                  If time is not given, defaults to 00:00
       =<stamp> - show entries created on specified stamp
                  If time is not given it is ignored (all day)
       <stamp>+-<stamp> - show entries created between specified stamps
                  If time is not given on first stamp, 00:00 is used
                  If time is not given on second stamp, 23:59 is used
                  Note: This includes the day(s) specified
  -s : Size of files caught in bytes
       +<size> - show entries with files larger than specified size
       -<size> - show entries with files smaller than specified size
       =<size> - show entries with files equal to specified size
       <size>+-<size> - show entries with files between specified sizes
} . "\n";

    getopts('hc:d:n:t:s:', \%Opt) or die "$Usage";
    die "$Usage" if $Opt{h};
    if ($Opt{d}) {
        $Opt{d} = lc($Opt{d});
        die "$Usage" if ($Opt{d} ne "in" && $Opt{d} ne "out" && $Opt{d} ne "both");
    }
}

sub GetConns {
    open(CONNECTORS, "/var/wt400/conf/_wtd.cfg")
        or die "\nUnable to open connector file!\n";
    while (<CONNECTORS>) {
        next unless ($_ =~ /^unit="(.*)"/);
        my $Conn = lc($1);
        next if ($Conn eq "ins" || $Conn eq "ins2" || $Conn eq "_wtd");
        push @Conns, $Conn;
    }
    close(CONNECTORS);
    if ($Opt{c}) {
        $Opt{c} = lc($Opt{c});
        if (grep /\b$Opt{c}\b/, @Conns) {
            @Conns = $Opt{c};
        }
        else {
            die "\nInvalid connector - $Opt{c} !\n";
        }
    }
}

sub GetLogs {
    my @Logs;
    foreach my $Conn (@Conns) {
        my @Directions;
        if ($Opt{d}) {
            @Directions = $Opt{d};
        }
        else {
            @Directions = (qw(in out both));
        }
        foreach my $Dir (@Directions) {
            push @Logs, "/var/spool/wt400/log/$Conn/trap_${Dir}.log"
                if (-r "/var/spool/wt400/log/$Conn/trap_${Dir}.log" && -s _);
        }
    }
    unless (@Logs) {
        die "\nUnable to find any logs!\n";
    }
    else {
        while (my $File = shift @Logs) {
            my ($mon, $day, $year, $hour, $min);
            open(LOG, $File) or die "\nUnable to open $File!\n";
            LINE: while (my $Line = <LOG>) {
                chomp $Line;
                my @Fields = split " ", $Line;
                if ($Opt{n}) {
                    next unless (lc($Opt{n}) eq lc($Fields[3]));
                }
                if ($Opt{t}) {
                    $Opt{t} =~ s/\s+//;
                    my $Stamp1;
                    my $Stamp2;
                    if ($Opt{t} =~ /^\+(.*)/) {
                        ($mon, $day, $year, $hour, $min) = split ?[-/:]?, $1;
                        ($hour, $min) = (23, 59) unless ($hour && $min);
                        $Stamp1 = timelocal(0, $min, $hour, $day, $mon - 1, $year + 100);
                        next unless ($Fields[0] > $Stamp1);
                    }
                    elsif ($Opt{t} =~ /^\-(.*)/) {
                        ($mon, $day, $year, $hour, $min) = split ?[-/:]?, $1;
                        ($hour, $min) = (00, 00) unless ($hour && $min);
                        $Stamp1 = timelocal(0, $min, $hour, $day, $mon - 1, $year + 100);
                        next unless ($Fields[0] < $Stamp1);
                    }
                    elsif ($Opt{t} =~ /^\=(.*)/) {
                        ($mon, $day, $year, $hour, $min) = split ?[-/:]?, $1;
                        ($hour, $min) = (00, 00) unless ($hour && $min);
                        $Stamp1 = timelocal(0, $min, $hour, $day, $mon - 1, $year + 100);
                        ($hour, $min) = (23, 59) unless ($hour && $min);
                        $Stamp2 = timelocal(0, $min, $hour, $day, $mon - 1, $year + 100);
                        next unless ($Fields[0] >= $Stamp1 && $Fields[0] <= $Stamp2);
                    }
                    elsif ($Opt{t} =~ /^(.*)\+\-(.*)/) {
                        ($mon, $day, $year, $hour, $min) = split ?[-/:]?, $1;
                        ($hour, $min) = (00, 00) unless ($hour && $min);
                        $Stamp1 = timelocal(0, $min, $hour, $day, $mon - 1, $year + 100);
                        ($mon, $day, $year, $hour, $min) = split ?[-/:]?, $2;
                        ($hour, $min) = (23, 59) unless ($hour && $min);
                        $Stamp2 = timelocal(0, $min, $hour, $day, $mon - 1, $year + 100);
                        next unless ($Fields[0] >= $Stamp1 && $Fields[0] <= $Stamp2);
                    }
                }
                if ($Opt{s}) {
                    $Opt{s} =~ s/\s+//;
                    if ($Opt{s} =~ /^\+(.*)/) {
                        next unless ($Fields[2] > $1);
                    }
                    elsif ($Opt{s} =~ /^\-(.*)/) {
                        next unless ($Fields[2] < $1);
                    }
                    elsif ($Opt{s} =~ /^\=(.*)/) {
                        next unless ($Fields[2] == $1);
                    }
                    elsif ($Opt{s} =~ /^(.*)\+\-(.*)/) {
                        next unless ($Fields[2] >= $1 && $Fields[2] <= $2);
                    }
                }
                if ($File =~ /^.*\/(.*)\/trap_(.*)\.log/) {
                    my $Conn = $1;
                    my $Dir  = $2;
                    my $Time = strftime("[%x-%X]", localtime($Fields[0]));
                    print "$Time $Conn $Dir $Fields[3] $Fields[1] $Fields[2]\n";
                }
            }
            close(LOG);
        }
    }
}
allows looking at just one log or, if no options are specified, all the logs at once, as well as filtering for only specific information.
PROBLEM
My co-workers would like to have the logs interleaved (sorted chronologically) when displaying all the logs at once. This seems incredibly easy, since the first column is a timestamp and the lines would naturally sort chronologically. The problem is that two columns in the output (see below) are dynamically generated from the filename/path.
RAW:

1044007259 do15505x 467 PaulRidge
1044022188 do15667s 876 Tom-Snow
1044029052 do15854j 3228 BCorcoran

FORMATTED:

[01/31/03-11:41:28] DIR1 out MarkLester doqnh6y5 10300
[01/31/03-16:28:20] DIR1 out BrianSmith doavr564 8353
[01/31/03-16:38:12] DIR1 out MarkLester doavr5g4 9663
[01/30/03-23:02:08] DIR2 out PaulRidge do15347q 2394
[01/30/03-23:02:08] DIR2 out PaulRidge do15347t 492

Note: The raw and formatted lines are samples and do not represent the same data.
Since my code currently reads each file one at a time, it is able to generate these two columns dynamically. To sort all the results chronologically, I would have to read them all in first (they could get quite large), perform the sort, and then print the output. I considered the following alternative:
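The read-everything-first approach would look roughly like this: decorate each record with its connector/direction (derived from the path) as it is read, then do one numeric sort on the timestamp. This is a sketch, not my actual script; the `%logs` hash with in-memory lines stands in for the real files, and the paths are made up.

```perl
use strict;
use warnings;

# Hypothetical stand-in for reading the real log files from disk.
my %logs = (
    'DIR1/trap_out.log' => ['1044022188 do15667s 876 Tom-Snow'],
    'DIR2/trap_out.log' => ['1044007259 do15505x 467 PaulRidge'],
);

# Decorate: keep the timestamp plus the path-derived columns with each line.
my @records;
for my $file (keys %logs) {
    my ($conn, $dir) = $file =~ m{^(.*)/trap_(.*)\.log$};
    for my $line (@{ $logs{$file} }) {
        push @records, [ (split ' ', $line)[0], $conn, $dir, $line ];
    }
}

# One numeric sort on the decorated timestamp, then print with derived columns.
my @sorted = sort { $a->[0] <=> $b->[0] } @records;
for my $rec (@sorted) {
    my ($stamp, $conn, $dir, $line) = @$rec;
    print "$conn $dir $line\n";
}
```

The drawback is exactly the one noted above: every line of every log must be held in memory before anything prints.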
What I would like to do is open (LOGS,"sort @Logs |");, but then I would lose the filename/path and wouldn't be able to generate those two columns.
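One way to keep both the filenames and a small memory footprint, since each individual log is already in timestamp order, would be a k-way merge: hold one pending line per file and repeatedly emit the one with the smallest timestamp. This is my suggestion rather than anything from the post; the `%logs` hash of in-memory arrays stands in for open filehandles, and the paths are made up.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the open log files; each is already chronological.
my %logs = (
    'DIR1/trap_out.log' => ['1044022188 do15667s 876 Tom-Snow',
                            '1044029052 do15854j 3228 BCorcoran'],
    'DIR2/trap_out.log' => ['1044007259 do15505x 467 PaulRidge'],
);

# One pending record per source: [timestamp, source file, raw line].
my %pending;
for my $file (keys %logs) {
    my $line = shift @{ $logs{$file} };
    $pending{$file} = [ (split ' ', $line)[0], $file, $line ] if defined $line;
}

my @merged;
while (%pending) {
    # Emit the source whose pending record has the smallest timestamp.
    my ($file) = sort { $pending{$a}[0] <=> $pending{$b}[0] } keys %pending;
    push @merged, $pending{$file};

    # Refill from that source, or retire it once exhausted.
    my $line = shift @{ $logs{$file} };
    if (defined $line) {
        $pending{$file} = [ (split ' ', $line)[0], $file, $line ];
    }
    else {
        delete $pending{$file};
    }
}

# The source path travels with every record, so the two derived
# columns can still be generated exactly as before.
print "$_->[1]: $_->[2]\n" for @merged;
</merge sketch>
```

Memory use is one line per open log instead of everything at once, and the output is globally chronological.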
Any advice (other than about the regular expressions I use to parse the data)?
Thanks in advance, L~R
In reply to Log parsing by timestamp dilema by Limbic~Region