in thread Parsing logs and bookmarking last line parsed

@Marshall - thank you very much for your insight. What started out as a way to summarize this data turned into a bigger project than I anticipated, but I felt it was a good assignment for learning Perl.

You definitely nailed what I needed, which was keying off the backup set and extracting some attributes associated with it.

My goal for this output was to produce a delimited file, as you can see in my print statements. Prefixing the attributes with a numbering system seemed to help me with sorting them. The file is needed as input to an HTML table. Anyway, I'll review your pointers and code. Thanks again.

Re^3: Parsing logs and bookmarking last line parsed
by Marshall (Canon) on Aug 19, 2010 at 20:53 UTC
    Wow! You've taken on a pretty difficult "first assignment"! And you've gotten a heck of a lot further than most could have done! There are some "quirks" about this that make some of the details difficult.

    I posted some more code for you. Take a look and see "what is missing/not right".

    Update: I see why you did: my $BckupKey="5-Duration"; Don't "decorate" the hash key with a "5-" prefix like this. There are better, albeit more advanced, techniques for specifying the sort order. Concentrate on getting the data you need, and then you can get help here about how to get it to appear in the "right" order.

    Below is just one example of a special sort order. A more robust thing would take into account what happens when I haven't specified the order of some input string vs another. I am just saying that advanced sorting is one of the things that Perl is very good at.

    #!/usr/bin/perl -w
    use strict;

    my @special_order = ("x", "b", "a", "y");
    my $i =0;
    my %sort_order = map{$_ => $i++}@special_order;

    my @array = ("a", "x", "y", "b");
    @array = sort @array;
    print "Regular Sort: @array\n";

    @array = ("a", "x", "y", "b");
    @array = sort by_order @array;
    print "Special Sort: @array\n";

    sub by_order
    {
        my $a_order = $sort_order{$a};
        my $b_order = $sort_order{$b};
        $a_order <=> $b_order
    }
    __END__
    prints:
    Regular Sort: a b x y
    Special Sort: x b a y
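
One possible way to handle the "order not specified" case mentioned above is sketched here. It is only an illustration, not the only reasonable policy: the sub name by_order_with_fallback is made up, and the choice to put unlisted items after all listed ones, in alphabetical order, is an assumption.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Items listed here sort first, in exactly this order.
my @special_order = ("x", "b", "a", "y");
my $i = 0;
my %sort_order = map { $_ => $i++ } @special_order;

# Hypothetical comparator: listed items sort by their rank; unlisted
# items all get the same "last" rank and are then ordered alphabetically.
sub by_order_with_fallback
{
    my $a_rank = exists $sort_order{$a} ? $sort_order{$a} : scalar @special_order;
    my $b_rank = exists $sort_order{$b} ? $sort_order{$b} : scalar @special_order;
    return $a_rank <=> $b_rank || $a cmp $b;
}

my @array  = ("q", "a", "x", "c", "y", "b");   # "q" and "c" are not listed
my @sorted = sort by_order_with_fallback @array;
print "Fallback Sort: @sorted\n";              # prints: Fallback Sort: x b a y c q
```

Using exists avoids the "uninitialized value" warnings that a plain hash lookup would produce for unlisted items under -w.
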
      @Marshall - thanks for all your help. I just noticed the code you posted earlier in a reply. Seeing the results using Dumper was neat. The breakdown of name/value pairs for each set is exactly what I needed. I am now looking at the code you provided for sorting to see how I can fit it in with your other example. I'd like to get this working and use it as a reference of good code; seeing another way of writing it helps me learn :-) My end result would be a delimited file. I'll try it out and see if I can string it together. Thanks!

      This is the data string I was constructing with my script. I know it's screwy with the hashing and all...

      1-Server=server1.domain.com;2-Logdate=Thu Aug 19 2010;3-BackupSet=backup.set1_lvm;4-StartTime=06:00:03;5-Duration=00:56:53;6-Size=72.04 GB;7-Status=Succeeded;
      1-Server=server1.domain.com;2-Logdate=Thu Aug 19 2010;3-BackupSet=backup.set2_lvm;4-StartTime=00:00:04;5-Duration=01:56:35;6-Size=187.24 GB;7-Status=Succeeded;
      1-Server=server1.domain.com;2-Logdate=Thu Aug 19 2010;3-BackupSet=backup.set3_lvm;4-StartTime=23:00:05;8-Status=Unsuccessful;
      @Marshall - I'm looking at your parse script you provided. Can you show me in the regex, how to get only specific attributes? In other words, I'm not interested in splitting all of the data, but interested in these keys for example (backup-set, backup-date, backup-time, ERROR) with the flexibility to add or remove more if needed.

      Also, can you show me how to print the keys out in order in a delimited format, for example (name=value;name=value;)? Thank you again.

        I wouldn't fiddle around with the parse_line() subroutine. It has a job which is to "parse lines". You don't want to modify it so that it only parses "some of the lines". Try to write code so that each function has a single clear task.

        If you want a subset of the data, then either change the code immediately after the line parsing (it only saves things which are of the form p=v now), or filter what you need at the final "print it" stage, which is what I did below in some more code for you. I decided to do it that way so that when problems happen later, you have all the p=v pairs and you can use Dumper to see what is going on. There is no need here to optimize to reduce memory storage.
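
For comparison, the first option (filtering right after the line parsing) might look roughly like the sketch below. The %wanted hash, the @parsed rows and the "some-other-parm" key are made up for illustration; in the real script the triples would come from the parser rather than a hard-coded list.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Keys to keep; edit this list to add or remove attributes.
my @important_keys = qw(backup-status backup-set backup-date backup-time ERROR);
my %wanted = map { $_ => 1 } @important_keys;

# Stand-in for the (backupset, parm, value) triples a parser would
# return. These rows are invented sample data, not real log output.
my @parsed = (
    [ 'set1_lvm', 'backup-set',      'backup.set1_lvm' ],
    [ 'set1_lvm', 'backup-time',     '01:59:04'        ],
    [ 'set1_lvm', 'some-other-parm', 'ignored'         ],
);

my %backups;
foreach my $row (@parsed)
{
    my ($backupset, $parm, $value) = @$row;
    next unless $wanted{$parm};     # drop anything not in the list
    $backups{$backupset}{$parm} = $value;
}

my @kept = sort keys %{ $backups{set1_lvm} };
print "kept: @kept\n";              # prints: kept: backup-set backup-time
```

The trade-off is the one described above: anything dropped here is no longer available to Dumper when debugging.
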

        What to print is specified by an array (important_keys) and you can modify that as needed. If you want a special sort order, then modify the code I gave you earlier. BUT I would recommend against that in favor of a simple normal alphabetic sort. In my experience this usually works out best for "human readable" reports - humans scan sorted lists easily. Note: the hash important_order isn't used for a sort, but is used as an easy way to determine if some parm is in the "desired" list or not.

        You've got some work to do, but I think you are on the right path.

        As an update: to avoid sorting the final parms altogether, iterate through the important_keys array and print each key,value pair if it exists. The order in the array determines the "sort order" automatically (i.e. no sorting at all). Of course, you could just sort the important_keys array in method 2 to get the same output as method 1, but you should know how to generically print all the keys in this type of hash structure, so I left method 1 in.

        #!/usr/bin/perl -w
        use strict;
        use Data::Dumper;

        my %backups;
        my @important_keys = qw(backup-status backup-set backup-date backup-time ERROR);

        # this is how to translate the array into something that could
        # be used for a special sort order, but I don't think you will
        # need to do that.
        my $i=1;
        my %important_order = map{$_ => $i++} @important_keys;

        while (<DATA>)
        {
            next if (/^\s*$/);  #skip blank lines
            chomp;
            my ($date, $backupset , $parm , $value) = parseline($_);
            if ($value)
            {
                $backups{$backupset}{$parm} = $value;
            }
        }
        #print Dumper \%backups;

        foreach my $set (sort keys %backups)
        {
            print "\n*** backup set: $set ***\n";

            #method 1
            foreach my $parm (sort keys %{$backups{$set}})
            {
                printf " %-15s = %s \n", $parm, $backups{$set}{$parm}
                    if $important_order{$parm};
            }

            #method 2
            # foreach my $parm (@important_keys)
            # {
            #     printf " %-15s = %s \n", $parm, $backups{$set}{$parm}
            #         if $backups{$set}{$parm};
            # }
        }

        sub parseline
        {
            my $line = shift;
            my ($date, $rest) = $line =~ m/(^.*\d{4}):(.*)/;
            my ($backupset, $msg) = split(/backup:INFO:/, $rest);
            $backupset =~ s/:\s*$//;        #trimming some unwanted thing like ':' is ok
            $backupset =~ s/^\s*backup\.//; #more than one step is just fine too!
            my ($parm, $value) = $msg =~ m/\s*(.*)=\s*(.*)\s*/;
            $parm ||= $msg;  #if match doesn't happen these will be undef
            $value ||="";    #so this trick makes sure that they are defined.
            return ($date, $backupset, $parm, $value);
        }

        =method 1 prints:

        *** backup set: set1_lvm ***
         backup-date     = 20100816000003
         backup-set      = backup.set1_lvm
         backup-status   = Backup succeeded
         backup-time     = 01:59:04

        *** backup set: set2_lvm ***
         backup-date     = 20100815200003
         backup-set      = backup.set2_lvm

        *** backup set: set2_lvm_lvm ***
         backup-status   = Backup succeeded
         backup-time     = 04:33:12

        =method 2 prints:

        *** backup set: set1_lvm ***
         backup-status   = Backup succeeded
         backup-set      = backup.set1_lvm
         backup-date     = 20100816000003
         backup-time     = 01:59:04

        *** backup set: set2_lvm ***
         backup-set      = backup.set2_lvm
         backup-date     = 20100815200003

        *** backup set: set2_lvm_lvm ***
         backup-status   = Backup succeeded
         backup-time     = 04:33:12

        =cut
        The data segment follows (it's the same as in previous posts).
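
For the delimited (name=value;name=value;) format asked about earlier, method 2 could probably be adapted along the following lines. This is a sketch under assumptions: the delimited_record helper is a made-up name, and the %backups contents are hard-coded sample values in the same shape as the hash built above, not output from the real log.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# The order of this array is the order of the fields in each record.
my @important_keys = qw(backup-status backup-set backup-date backup-time ERROR);

# Hard-coded sample in the same shape as %backups (illustration only).
my %backups = (
    set1_lvm => {
        'backup-set'    => 'backup.set1_lvm',
        'backup-date'   => '20100816000003',
        'backup-time'   => '01:59:04',
        'backup-status' => 'Backup succeeded',
    },
);

# Hypothetical helper: builds one "name=value;" record for a backup set,
# in @important_keys order, skipping keys that the set does not have.
sub delimited_record
{
    my ($set_ref) = @_;
    return join '', map  { "$_=$set_ref->{$_};" }
                    grep { exists $set_ref->{$_} } @important_keys;
}

foreach my $set (sort keys %backups)
{
    print delimited_record($backups{$set}), "\n";
}
```

Because the array drives the loop, adding or removing a field only means editing @important_keys; no numeric prefixes on the hash keys are needed.
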