Log File Parsing

TStanley has asked for the wisdom of the Perl Monks concerning the following question:

I have been asked to develop a script that parses the output of a command that reads a binary log file (and outputs it as text).
The output consists of several records, and each record is in the following format:

Packet ID (hex)     0xF040            Origin           DOS 7452
Date (mm/dd/yyyy)   09/07/2001        Qualifier        ACS
Time                15:45:45.90       Terminal ID      109
Device ID (hex)     0x2D0             Application ID   SALESAPP
Catalog Code        DOS               Function ID(hex) 0x800E
Status              253               Severity         0
Item code 772788600047 was not found in the PLU file
------------------------------------------------------------

The above format will remain the same, but the overall number of records will change on a day to day basis. What I need to do is grab the records for just a specific day, then break that down to just a few specific records based upon the information in the last line before the separator. Below is some preliminary code that I have done:

#!/usr/bin/perl -w
use strict;

#This is the command I need to run
my @Command = `ulread`;

my @stripped;

# Get rid of the separators and put everything into
# another array

foreach my $line (@Command) {
  next if $line =~ /^-+$/;   # skip separator lines
  push @stripped, $line;
}

# The end of the array will have 4 lines that are not part 
# of any record (it is actually statistical info for the
# entire display). These lines can be removed.

for(1..4){ pop @stripped; }

Does anyone here have any better ideas than this? Thanks.

TStanley
--------
There's an infinite number of monkeys outside who want to talk to us
about this script for Hamlet they've worked out
-- Douglas Adams/Hitchhiker's Guide to the Galaxy


Re: Log File Parsing by MZSanford (Curate) on Sep 11, 2001 at 19:32 UTC
Well, I am Mr. Read-Binary-Data-in-Perl-using-pack-and-unpack, but sometimes you do not have the spec, or it cannot be reverse engineered (easily). I tend to use `open(FH,"command |")` instead of backticks, so I would do something more like this (untested code ahead):

#!/usr/bin/perl -w
use strict;

my @data = ();

# Read one whole record per <PIPE> call. Double quotes are needed
# here so that \n is a real newline rather than a literal backslash-n.
$/ = "------------------------------------------------------------\n";

open(PIPE,'ulread |') || die "could not fork() : $!\n";
while (my $line = <PIPE>) {
    ## total block
    push @data, $line;
}
close(PIPE) || die "close error : $!\n";

# With records read as blocks, the four trailing statistics lines
# arrive as a single final chunk (assuming they follow the last
# separator), so only one element is dropped.
pop @data;

speling champ of tha claz uf 1997
-- MZSanford
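
The record-at-a-time reading described above can be tried without `ulread` by pointing a filehandle at a string. This is only a sketch: the sample text, the 60-dash separator, and the names `read_records` and `$output` are assumptions made for illustration, and it presumes the statistics lines come after the final separator.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Made-up text standing in for the output of ulread (assumed layout).
my $sep    = '-' x 60;
my $output = join '',
    "Status 253 Severity 0\nItem code 1 was not found\n$sep\n",
    "Status 254 Severity 1\nItem code 2 was not found\n$sep\n",
    "stat line 1\nstat line 2\nstat line 3\nstat line 4\n";

# Split the text into records using the separator line as $/.
sub read_records {
    my ($text) = @_;
    local $/ = "$sep\n";          # double quotes, so \n is a real newline
    open my $fh, '<', \$text or die "cannot open in-memory handle: $!";
    my @records;
    while (my $chunk = <$fh>) {
        chomp $chunk;             # chomp strips the current value of $/
        push @records, $chunk;
    }
    close $fh;
    pop @records;                 # the trailing statistics form the last chunk
    return @records;
}

my @records = read_records($output);
print scalar(@records), " records\n";
```

Using `local $/` keeps the changed separator from leaking into other reads in the same program. If the real output has no trailing statistics, the final `pop` would discard a genuine record, so that line depends on the layout described in the question.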
Re: Re: Log File Parsing by TStanley (Canon) on Sep 11, 2001 at 19:44 UTC
What I basically need to know is how to go about parsing the information from the records themselves, once I get the information read into the array.

An idea that came to me was to make an array of hashes, with each hash being a single record. I could then go through each hash, looking for a specific value in the date, put those records into a separate array, and print out that array, but I am not quite sure how I would go about doing that.
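
The array-of-hashes idea can be sketched as follows. The records here are written out by hand and show only two fields; the second and third messages, the key names `date` and `message`, and the helper `records_for_date` are hypothetical and chosen just for this illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical, already-parsed records: one hash reference per record.
my @records = (
    { date => '09/07/2001', message => 'Item code 772788600047 was not found in the PLU file' },
    { date => '09/06/2001', message => 'an older message' },
    { date => '09/07/2001', message => 'another message from the same day' },
);

# Return only the records whose date field equals $date.
sub records_for_date {
    my ($date, @recs) = @_;
    return grep { $_->{date} eq $date } @recs;
}

my @todays = records_for_date('09/07/2001', @records);
print "$_->{date}  $_->{message}\n" for @todays;
```

An array keeps the records in their original order, which a hash keyed by a counter does not guarantee when iterating with `keys`.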
Re: Log File Parsing by cLive ;-) (Prior) on Sep 11, 2001 at 20:44 UTC
If you want to whack them in an array, do the split as in the answer above. Then, yes, stick them into a hash. Here's a simple-to-read version (this can be stripped down further if needed).

my (%hash,$count);

for (@array) {

    # use the 'x' modifier to make it readable; under /x, literal
    # spaces and parentheses in the labels have to be escaped
    # I've made assumptions about data expected here
    # you may need to amend
    next unless m{
        Packet\ ID\ \(hex\)   \s+ (\S+) \s+ Origin               \s+ (.+?) \s*\n
        Date\ \(mm/dd/yyyy\)  \s+ (\S+) \s+ Qualifier            \s+ (\w+) \s*\n
        Time                  \s+ (\S+) \s+ Terminal\ ID         \s+ (\d+) \s*\n
        Device\ ID\ \(hex\)   \s+ (\w+) \s+ Application\ ID      \s+ (\w+) \s*\n
        Catalog\ Code         \s+ (\w+) \s+ Function\ ID\(hex\)  \s+ (\w+) \s*\n
        Status                \s+ (\d+) \s+ Severity             \s+ (\d+) \s*\n
        (.*)
    }x;

    # increase hash key
    $count++;

    $hash{$count}{'packet_id'}      = $1;
    $hash{$count}{'origin'}         = $2;
    $hash{$count}{'date'}           = $3;
    $hash{$count}{'qualifier'}      = $4;
    $hash{$count}{'time'}           = $5;
    $hash{$count}{'terminal_id'}    = $6;
    $hash{$count}{'device_id'}      = $7;
    $hash{$count}{'application_id'} = $8;
    $hash{$count}{'catalogue_code'} = $9;
    $hash{$count}{'function_id'}    = $10;
    $hash{$count}{'status'}         = $11;
    $hash{$count}{'severity'}       = $12;
    $hash{$count}{'error_message'}  = $13;
}

# then run through and take action on hash
for (keys %hash) {
    if ($hash{$_}{'error_message'} =~ /whatever/) {
        # do something ...
    }
}

HTH

cLive ;-)
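
As a variation on the numbered captures above, named captures can fill the hash in one step. This is a sketch only: it needs Perl 5.10 or later (newer than the Perl of this thread), the names `$record_re` and `parse_record` are made up here, and the pattern assumes every record looks like the sample in the question.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# The sample record from the question, as one multi-line string.
my $record = <<'END';
Packet ID (hex)     0xF040            Origin           DOS 7452
Date (mm/dd/yyyy)   09/07/2001        Qualifier        ACS
Time                15:45:45.90       Terminal ID      109
Device ID (hex)     0x2D0             Application ID   SALESAPP
Catalog Code        DOS               Function ID(hex) 0x800E
Status              253               Severity         0
Item code 772788600047 was not found in the PLU file
END

# Under /x, literal spaces and parentheses in the labels are escaped.
my $record_re = qr{
    Packet\ ID\ \(hex\)   \s+ (?<packet_id>\S+)    \s+ Origin              \s+ (?<origin>.+?)         \s*\n
    Date\ \(mm/dd/yyyy\)  \s+ (?<date>\S+)         \s+ Qualifier           \s+ (?<qualifier>\w+)      \s*\n
    Time                  \s+ (?<time>\S+)         \s+ Terminal\ ID        \s+ (?<terminal_id>\d+)    \s*\n
    Device\ ID\ \(hex\)   \s+ (?<device_id>\w+)    \s+ Application\ ID     \s+ (?<application_id>\w+) \s*\n
    Catalog\ Code         \s+ (?<catalog_code>\w+) \s+ Function\ ID\(hex\) \s+ (?<function_id>\w+)    \s*\n
    Status                \s+ (?<status>\d+)       \s+ Severity            \s+ (?<severity>\d+)       \s*\n
    (?<message>.*)
}x;

# Return a hash reference of fields, or nothing if the text does not match.
sub parse_record {
    my ($text) = @_;
    return unless $text =~ $record_re;
    return +{ %+ };    # copy the named captures into a fresh hash
}

my $rec = parse_record($record);
print "$rec->{date} $rec->{origin}: $rec->{message}\n" if $rec;
```

Because the function returns nothing on a failed match, a record that does not fit the assumed layout is skipped rather than stored with stale capture values.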
Re: Log File Parsing by TStanley (Canon) on Sep 12, 2001 at 00:31 UTC
Something I didn't make clear in the original post is that each record consists of 8 separate lines (including the separator line), so when I read the output into the array and print out a single record, it looks like this:

Element 0: Packet ID (hex)     0xF040            Origin           DOS 7452
Element 1: Date (mm/dd/yyyy)   09/07/2001        Qualifier        ACS
Element 2: Time                15:45:45.90       Terminal ID      109
Element 3: Device ID (hex)     0x2D0             Application ID   SALESAPP
Element 4: Catalog Code        DOS               Function ID(hex) 0x800E
Element 5: Status              253               Severity         0
Element 6: Item code 772788600047 was not found in the PLU file
Element 7: -------------------------------------------------

Here is what I have for code so far:

#!/usr/bin/perl -w
use strict;

my $searchdate=`date +%m\/%d\/%C%y`;
chomp $searchdate;   # drop the trailing newline from the command output
my @Command=`ulread`;
my $line;
my @stripped=();
my %messages=("Cacheman"=>"0","sopup.sh"=>"0","switch_sh.sh"=>"0",
              "128"=>"0","NetBIOS"=>"0","Manual"=>"0","Beginning"=>"0");

foreach $line(@Command){
  if($line=~/^-+$/){}else{
    push @stripped,$line;
  }
}

for(1..4){ pop @stripped; }

my (%hash,$count);

# Each element of @stripped is a single line, not a whole record
for(@stripped){
  $count++;
  m/Packet\ ID\ \(hex\)      \s+ (\S+) \s+ Origin              \s+ (.+)\n
    Date\ \(mm\/dd\/yyyy\)   \s+ (\S+) \s+ Qualifier           \s+ (\w+)\n
    Time                     \s+ (\S+) \s+ Terminal\ ID        \s+ (\d+)\n
    Device\ ID\ \(hex\)      \s+ (\w+) \s+ Application\ ID     \s+ (\w+)\n
    Catalog\ Code            \s+ (\w+) \s+ Function\ ID\(hex\) \s+ (\w+)\n
    Status                   \s+ (\d+) \s+ Severity            \s+ (\d+)\n
    (.*)/xi;
  $hash{$count}{'packet_id'}=$1;
  $hash{$count}{'origin'}=$2;
  $hash{$count}{'date'}=$3;
  $hash{$count}{'qualifier'}=$4;
  $hash{$count}{'time'}=$5;
  $hash{$count}{'terminal_id'}=$6;
  $hash{$count}{'device_id'}=$7;
  $hash{$count}{'application_id'}=$8;
  $hash{$count}{'catalog_code'}=$9;
  $hash{$count}{'function_id'}=$10;
  $hash{$count}{'status'}=$11;
  $hash{$count}{'severity'}=$12;
  $hash{$count}{'message'}=$13;
}

open (RPT,">Logview.rpt")||die"Can't open: $!\n";

foreach my $key(keys %hash){
  foreach my $msg(keys %messages){
    if($hash{$key}{'message'}=~/\Q$msg\E/ && $hash{$key}{'date'}=~/\Q$searchdate\E/){
      ## print out the entire record to the file
    }
  }
}
close RPT;

I wait with breathless anticipation for someone to help solve this interesting riddle :-)
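
One possible way around the one-line-per-element problem is to glue every seven lines back into a single record string before matching, and to keep that string so the whole record can be printed. The sketch below assumes records are always exactly seven lines once the separators are gone; the second sample record and the helpers `group_records` and `select_records` are invented for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Two sample records, one line per element with separators already removed,
# standing in for @stripped. The second record is made up.
my @stripped = split /^/, <<'END';
Packet ID (hex)     0xF040            Origin           DOS 7452
Date (mm/dd/yyyy)   09/07/2001        Qualifier        ACS
Time                15:45:45.90       Terminal ID      109
Device ID (hex)     0x2D0             Application ID   SALESAPP
Catalog Code        DOS               Function ID(hex) 0x800E
Status              253               Severity         0
Item code 772788600047 was not found in the PLU file
Packet ID (hex)     0xF041            Origin           DOS 7452
Date (mm/dd/yyyy)   09/06/2001        Qualifier        ACS
Time                08:00:00.00       Terminal ID      109
Device ID (hex)     0x2D0             Application ID   SALESAPP
Catalog Code        DOS               Function ID(hex) 0x800E
Status              253               Severity         0
NetBIOS session closed
END

# Join every $size consecutive lines into one multi-line record string.
sub group_records {
    my ($size, @lines) = @_;
    my @records;
    push @records, join('', splice(@lines, 0, $size)) while @lines >= $size;
    return @records;
}

# Keep records whose date line contains $date and whose last line
# contains one of the literal strings in @$wanted.
sub select_records {
    my ($date, $wanted, @records) = @_;
    my $alternatives = join '|', map { quotemeta } @$wanted;
    return grep {
        my @lines = split /\n/;
        $lines[1] =~ /\Q$date\E/ && $lines[-1] =~ /$alternatives/;
    } @records;
}

my @hits = select_records('09/07/2001', ['PLU file', 'NetBIOS'],
                          group_records(7, @stripped));
print $_, '-' x 60, "\n" for @hits;
```

Since each hit is the original text of the record, writing it to a report file is a plain `print` to that filehandle; no fields need to be reassembled.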