Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks,

I'm a newbie looking for some guidance on best practices for this specific task. I have strung together some code that works, based on help from other Perl folks. I am still trying to identify my style. I learn by example, and I'd rather be more verbose with my code so that I can understand it.

I am still trying to get my arms around the use of hashes. I have code here that uses a hash ref. I'm unclear on how to print it out in a specific format.

Purpose of my script:
1) Get the hostname. The hostname is the prefix of each server log file name (e.g. server1.log, server2.log).
2) For each date given, either manually or systematically, get the backup set name.
3) For every backup set name, grab any of these variables that appear: backup-size, backup-time, backup-status, or ERROR if one is given for that backup set.
4) Generate a datafile with these values delimited in some format. This datafile will later be used as a feed to another system.

Current issues: My current script uses hashes, again thanks to help from others. I'm still confused about hashes of hashes and their use. I didn't go the array route because I'm not sure how to set that up, and from what I've read, accessing a hash would be easier? I'm not sure. I am having problems formatting the output in an ordered fashion. So, I'm looking at a structure like this:
    server1: MyDate (today's date) --> MyBackupSet --> Backup Attribute = Backup Value
    server2: MyDate (today's date) --> MyBackupSet --> Backup Attribute = Backup Value
Current Perl code:
    use strict;
    use warnings;
    use File::Basename;
    use Data::Dumper;

    use constant debug => 0;    # assumed: the post uses "debug" but never defines it

    my %MyItems;
    my $ARGV = "/var/log/server1.log";
    my $mon  = 'Aug';
    my $day  = '06';
    my $year = '2010';

    while (my $line = <>) {
        chomp $line;
        print "Line: $line\n" if debug;
        if ($line =~ m/(.* $mon $day) \d{2}:\d{2}:\d{2} $year: ([^:]+):backup:/) {
            my $server    = basename $ARGV, '.log';
            my $BckupDate = "$1 $year";
            my $BckupSet  = $2;
            print "$BckupDate ($BckupSet): " if debug;
            $MyItems{$server}{$BckupSet}->{'MyLogdate'} = $BckupDate;
            $MyItems{$server}{$BckupSet}->{'MyDataset'} = $BckupSet;
            $MyItems{$server}{$BckupSet}->{'MyHost'}    = $server;
            #$MyItems{$server}{$BckupSet}->{'MyServer'} = $server;
            if ($line =~ m/(ERROR|backup-size|backup-time|backup-status)[:=](.+)/) {
                my $BckupKey = $1;
                my $BckupVal = $2;
                $MyItems{$server}{$BckupSet}->{$BckupKey} = $BckupVal;
                print "$BckupKey=$BckupVal\n" if debug;
            }
        }
    }
    print Dumper(%MyItems);
Output from Dumper:
    $VAR1 = 'server1';
    $VAR2 = {
        'abc1.mil.mad' => {
            'ERROR' => ' If you are sure is not running, please remove the file and restart ',
            'MyLogdate' => 'Fri Aug 06 2010',
            'MyHost' => 'server1',
            'MyDataset' => 'abc1.mil.mad'
        },
        'abc2.cfl.mil.mad' => {
            'backup-size' => '187.24 GB',
            'MyLogdate' => 'Fri Aug 06 2010',
            'MyHost' => 'server1',
            'backup-status' => 'Backup succeeded',
            'backup-time' => '01:54:27',
            'MyDataset' => 'abc2.cfl.mil.mad'
        },
        'abc3.mil.mad' => {
            'backup-size' => '46.07 GB',
            'MyLogdate' => 'Fri Aug 06 2010',
            'MyHost' => 'server1',
            'backup-status' => 'Backup succeeded',
            'backup-time' => '00:41:06',
            'MyDataset' => 'abc3.mil.mad'
        },
        'abc4.mad_lvm' => {
            'backup-size' => '422.99 GB',
            'MyLogdate' => 'Fri Aug 06 2010',
            'MyHost' => 'server1',
            'backup-status' => 'Backup succeeded',
            'backup-time' => '04:48:50',
            'MyDataset' => 'abc4.mad_lvm'
        }
    };
Sample source file (server log file):
    Fri Aug 06 00:00:04 2010: abc2.cfl.mil.mad:backup:INFO:
    Fri Aug 06 00:00:05 2010: abc2.cfl.mil.mad:backup:INFO: backup-set=abc2.cfl.mil.mad
    Fri Aug 06 00:00:05 2010: abc2.cfl.mil.mad:backup:INFO: backup-date=20100806000004
    Fri Aug 06 00:00:05 2010: abc2.cfl.mil.mad:backup:INFO:
    Fri Aug 06 00:48:54 2010: abc4.mad_lvm:backup:INFO: backup-size=422.99 GB
    0: abc4.mad_lvm:backup:INFO: flush-logs-time=00:00:00
    Fri Aug 06 00:48:54 2010: abc4.mad_lvm:backup:INFO: backup-time=04:48:50
    Fri Aug 06 00:48:54 2010: abc4.mad_lvm:backup:INFO: backup-status=Backup succeeded
    Fri Aug 06 00:48:54 2010: abc4.mad_lvm:backup:INFO: Backup succeeded
    Fri Aug 06 00:48:54 2010: abc4.mad_lvm:backup:INFO: PHASE START: Running post backup plugin
    Fri Aug 06 00:48:55 2010: abc4.mad_lvm:backup:INFO: PHASE END: Running post backup plugin
    Fri Aug 06 00:48:55 2010: abc4.mad_lvm:backup:INFO: PHASE START: Cleanup
    Fri Aug 06 00:48:55 2010: abc4.mad_lvm:backup:INFO: PHASE END: Cleanup
    Fri Aug 06 00:48:55 2010: abc4.mad_lvm:backup:INFO: END OF BACKUP
Format I would like to try and create if possible (datafile):
    MyHost=>server1;MyLogdate=>Fri Aug 06 2010;MyDataset=>abc2.cfl.mil.mad;backup-time=>Fri Aug 06 2010;backup-status=>Backup succeeded
    MyHost=>server2;MyLogdate=>Fri Aug 06 2010;MyDataset=>abc4.mad_lvm;backup-status=>Backup succeeded

Replies are listed 'Best First'.
Re: Printing out a hash in specified format
by dasgar (Priest) on Aug 08, 2010 at 05:48 UTC

    Sounds like you're having trouble understanding the concept of arrays and hashes. Without that understanding, coding with either could get confusing.

    An array is simply an ordered list of data (strings, integers, etc.). Each element in that list has an integer index. Some languages start their indexing at 1, but Perl starts at 0. When you want an ordered list and are not concerned with what each element represents, arrays work great.

    Let's say you wanted to store employee data in an array. You could use index 0 for the employee ID, index 1 for the last name, and so on. The challenge here is that you'll have to remember which index represents what. In this case, hashes work better.

    In a hash, you have a list of data. However, instead of identifying the elements by a numerical index, you use a key, which is basically a string. In the employee example above, you can use keys such as Employee_ID, Last_Name, First_Name, etc. Now in your code you're using text that clearly identifies what the element is.

    With both hashes and arrays, you're dealing with pairs: an identifier and its data. In one case the identifier is an integer (arrays), and in the other it is text (hashes).
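
The employee example above can be sketched in a few lines of Perl. This is only an illustration: the employee values (1001, Smith, Jane) are made up, and the key names are the ones suggested in the text.

```perl
use strict;
use warnings;

# Hypothetical employee record stored as an array: the position carries the meaning.
my @employee = (1001, 'Smith', 'Jane');
print "Last name (array): $employee[1]\n";

# The same record as a hash: the key names the field.
my %employee = (
    Employee_ID => 1001,
    Last_Name   => 'Smith',
    First_Name  => 'Jane',
);
print "Last name (hash): $employee{Last_Name}\n";

# Hashes have no inherent order, so sort the keys for repeatable output.
for my $field (sort keys %employee) {
    print "$field = $employee{$field}\n";
}
```

Note that @employee and %employee are two separate variables; Perl keeps arrays and hashes in different namespaces.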

    What I've described so far has been single-dimension implementations. In any array or hash, the data component of an element pair can be a new array or hash. If you have hash(es) inside of an array, that's called an array of hashes. If you have array(s) inside of a hash, that's called a hash of arrays.
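
As a rough sketch of those two nested shapes (the names @staff and %teams and all values are invented for illustration), the inner arrays and hashes are stored as references:

```perl
use strict;
use warnings;

# Array of hashes: an ordered list whose elements are hash references.
my @staff = (
    { Last_Name => 'Smith', First_Name => 'Jane' },
    { Last_Name => 'Jones', First_Name => 'Tom'  },
);
print "Second employee: $staff[1]{First_Name} $staff[1]{Last_Name}\n";

# Hash of arrays: each key maps to an array reference.
my %teams = (
    backup  => [ 'Smith', 'Jones' ],
    network => [ 'Lee' ],
);
print "First backup team member: $teams{backup}[0]\n";
print "Backup team size: ", scalar @{ $teams{backup} }, "\n";
```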

    Hopefully that helps you understand the general concepts of arrays and hashes. Once you've got the general concepts down, check out perlfaq4 for more information on how to use arrays and hashes in Perl. You can check out perllol and perldsc for more information about using more complex array and hash structures.

Re: Printing out a hash in specified format
by biohisham (Priest) on Aug 08, 2010 at 09:18 UTC
    Kudos for the neat example and the very organized code you write; keep up that practice. You've come to the right place. Here you will learn by example, and not only from textbook examples: we're talking about real Perl situations met daily by monks scattered all over the globe. I learnt a lot by hanging around the place.

    "I am still trying to get my arms around understanding the use of hashes."
    Make sure you're familiar with the various ways to create, access, and manipulate hashes and arrays before going the extra mile into references and advanced data structures. That way the concepts become easier and quicker to grasp. The Monastery is brimful of timeless resources for you to look at, and you have perldoc perlcheat from your command line to serve as a quick primer. Have a nice Perl journey...


    Excellence is an Endeavor of Persistence. A Year-Old Monk :D .
      Noted. I am taking my time sorting this out and looking at examples. I appreciate the input.
Re: Printing out a hash in specified format
by ikegami (Patriarch) on Aug 08, 2010 at 05:52 UTC
    Why build the hash at all? Just print the values instead of putting them in the hash.
      That's what I thought too. But if I'm reading line by line and need to bucketize a set of data for every backup set, then wouldn't an array or hash be useful for this?
        Hashes are good for grouping (using the SQL term), if that's what you mean. Specifically, a hash (by server) of hashes (by backup set) of arrays of status records.
        $backup_status{$server}{$backup_set}[$backup_set_idx]{$field_name}
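
A minimal sketch of how such a grouping structure might be filled and read back, assuming the log lines have already been parsed into (server, backup set, field, value) tuples. The @records list is hypothetical sample input built from values in the question:

```perl
use strict;
use warnings;

my %backup_status;

# Hypothetical parsed records: server, backup set, field name, value.
my @records = (
    [ 'server1', 'abc3.mil.mad', 'backup-size',   '46.07 GB' ],
    [ 'server1', 'abc3.mil.mad', 'backup-status', 'Backup succeeded' ],
    [ 'server1', 'abc4.mad_lvm', 'backup-size',   '422.99 GB' ],
);

for my $rec (@records) {
    my ($server, $backup_set, $field_name, $value) = @$rec;
    # Autovivification creates the inner hashes and the array on first use.
    push @{ $backup_status{$server}{$backup_set} }, { $field_name => $value };
}

# Walk the groups: one line per server and backup set.
for my $server (sort keys %backup_status) {
    for my $backup_set (sort keys %{ $backup_status{$server} }) {
        my $count = @{ $backup_status{$server}{$backup_set} };
        print "$server / $backup_set: $count status record(s)\n";
    }
}

# Direct access follows the same path as the expression above.
print $backup_status{server1}{'abc3.mil.mad'}[1]{'backup-status'}, "\n";
```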
Re: Printing out a hash in specified format
by LanX (Saint) on Aug 08, 2010 at 10:38 UTC
    hmm something like this?

    my @order = qw/MyHost MyLogdate MyDataset backup-time backup-status/;
    for my $BckupSet_hr ( values %{ $MyItems{$server} } ) {
        print "$_ => ", $BckupSet_hr->{$_}, ";\t" for (@order);
        print "\n";
    }

    That's untested code, please take it as a base if it doesn't fit.
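
One possible way to adapt the loop above to the ";"-delimited "key=>value" layout requested in the question is sketched here. The helper name format_record is hypothetical, the sample data is a trimmed copy of the Dumper output from the question, and skipping fields that a backup set lacks is an assumption about the desired behaviour:

```perl
use strict;
use warnings;

# Trimmed sample data in the same shape as %MyItems from the question.
my %MyItems = (
    server1 => {
        'abc3.mil.mad' => {
            'MyHost'        => 'server1',
            'MyLogdate'     => 'Fri Aug 06 2010',
            'MyDataset'     => 'abc3.mil.mad',
            'backup-size'   => '46.07 GB',
            'backup-time'   => '00:41:06',
            'backup-status' => 'Backup succeeded',
        },
        'abc1.mil.mad' => {    # a set with no backup-* fields recorded
            'MyHost'    => 'server1',
            'MyLogdate' => 'Fri Aug 06 2010',
            'MyDataset' => 'abc1.mil.mad',
        },
    },
);

# Fixed column order for the datafile.
my @order = qw/MyHost MyLogdate MyDataset backup-size backup-time backup-status ERROR/;

# Hypothetical helper: build one output line from one backup-set hash ref.
sub format_record {
    my ($record) = @_;
    # Keep only the fields this backup set actually has, in the fixed order.
    my @fields = grep { exists $record->{$_} } @order;
    return join ';', map { $_ . '=>' . $record->{$_} } @fields;
}

for my $server (sort keys %MyItems) {
    for my $set (sort keys %{ $MyItems{$server} }) {
        print format_record($MyItems{$server}{$set}), "\n";
    }
}
```

Sorting the server and backup set keys keeps the line order stable between runs, which plain values() or keys() does not guarantee.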

    BTW: better use print Dumper(\%MyItems); instead of print Dumper(%MyItems);!
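
A small sketch of why the reference form is preferable, using a one-entry stand-in for %MyItems. Passing the hash itself flattens it into a list of keys and values, which is where the separate $VAR1 and $VAR2 in the question's output come from:

```perl
use strict;
use warnings;
use Data::Dumper;

$Data::Dumper::Sortkeys = 1;    # stable key order between runs

# One-entry stand-in for the real %MyItems.
my %MyItems = (
    server1 => { 'abc3.mil.mad' => { 'backup-time' => '00:41:06' } },
);

# Passing the hash flattens it: Dumper sees a key and a value as separate arguments.
my @flattened = Dumper(%MyItems);
print scalar(@flattened), " separate dumps when passing the hash\n";

# Passing a reference keeps the whole structure together as a single $VAR1.
print Dumper(\%MyItems);
```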

    Cheers Rolf