dnickel has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks, I have a question about extracting data from a file. The
file is an output file from HFNETCHK that was run
on all my servers.

FILE SNIPPET


----------------------------
Server1.dom.com
----------------------------

WINDOWS 2000 SERVER SP2

WARNING MS01-022 Q296441
Patch NOT Found MS02-001 Q311401
Patch NOT Found MS02-006 Q314147
Patch NOT Found MS02-012 Q313450
Patch NOT Found MS02-013 Q300845
Patch NOT Found MS02-014 Q313829


Internet Information Services 5.0

Patch NOT Found MS02-001 Q311401
Patch NOT Found MS02-012 Q313450


Internet Explorer 5.5 SP2

Patch NOT Found MS02-005 Q316059
Patch NOT Found MS02-009 Q318089


----------------------------
Server2.dom.com
----------------------------


WINDOWS 2000 SERVER SP2
WARNING MS01-022 Q296441
Patch NOT Found MS02-001 Q311401
Patch NOT Found MS02-006 Q314147
Patch NOT Found MS02-013 Q300845
Patch NOT Found MS02-014 Q313829


Internet Explorer 5.5 SP1

Patch NOT Found MS02-005 Q316059
Patch NOT Found MS02-009 Q318089


PROBLEM


I have a file like the one above, but it covers 30 servers. I want to parse this file to make a spreadsheet of the data. I have a function call that will grab all of my server names and store them in @Servernames. I am having problems taking a server name, parsing the file to find the matching server name, and storing the security information with the associated Q# for IE, OS, etc. This needs to iterate through the whole server list.

The structure would look something like this:

Server => Win2000 => Q#
          IE      => Q#
          SQL     => Q#

Thanks guys, and any help would be much appreciated.

Replies are listed 'Best First'.
Re: Tricky File Parsing
by tachyon (Chancellor) on Apr 08, 2002 at 03:16 UTC

    Showing code is a good idea but as I was bored at the time this should get you started:

    use Data::Dumper;

    # get your datafile into a scalar variable (sample put in __DATA__)
    my $data = join '', <DATA>;

    my $split_pat = qr/-{28}\n([\w\d\.-]+)\n-{28}\n*/;
    my @data = split /$split_pat/, $data;
    shift @data;
    my %servers = @data;

    for my $server (keys %servers) {
        my %progs = split /\n{2,}/, $servers{$server};
        $servers{$server} = \%progs;
        for my $prog ( keys %{$servers{$server}} ) {
            my @data = split /\n+/, $servers{$server}->{$prog};
            $servers{$server}->{$prog} = \@data;
        }
    }

    print Dumper \%servers;

    __DATA__

    Dumper output:

    $VAR1 = {
      'Server1.dom.com' => {
        'WINDOWS 2000 SERVER SP2' => [
          'WARNING MS01-022 Q296441',
          'Patch NOT Found MS02-001 Q311401',
          'Patch NOT Found MS02-006 Q314147',
          'Patch NOT Found MS02-012 Q313450',
          'Patch NOT Found MS02-013 Q300845',
          'Patch NOT Found MS02-014 Q313829'
        ],
        'Internet Information Services 5.0' => [
          'Patch NOT Found MS02-001 Q311401',
          'Patch NOT Found MS02-012 Q313450'
        ],
        'Internet Explorer 5.5 SP2' => [
          'Patch NOT Found MS02-005 Q316059',
          'Patch NOT Found MS02-009 Q318089'
        ]
      },
      'Server2.dom.com' => {
        'WINDOWS 2000 SERVER SP2' => [
          'WARNING MS01-022 Q296441',
          'Patch NOT Found MS02-001 Q311401',
          'Patch NOT Found MS02-006 Q314147',
          'Patch NOT Found MS02-013 Q300845',
          'Patch NOT Found MS02-014 Q313829'
        ],
        'Internet Explorer 5.5 SP1' => [
          'Patch NOT Found MS02-005 Q316059',
          'Patch NOT Found MS02-009 Q318089'
        ]
      }
    };

    Effort in reply is proportional to effort exerted by seeker of wisdom.

    cheers

    tachyon

    s&&rsenoyhcatreve&&&s&n.+t&"$'$`$\"$\&"&ee&&y&srve&&d&&print

Re: Tricky File Parsing
by joealba (Hermit) on Apr 08, 2002 at 05:10 UTC
    When munging data like this to extract the stuff that you want, just keep breaking your problem up into smaller pieces (as long as you can break the pieces up reliably!). Then when you can't break things up in a perfectly reliable way, you need to alter your data (or your thinking) until you can get it into a format that will allow you to reliably break it up.

    For example, you have one big file with data about several hosts. So, use a peachy keen regexp to break the big file into chunks of data pertaining to each host (i.e. splitting the data on the dashes / hostnames / dashes parts). Then, each host's chunk of data has information about software packages. So, split that chunk into its pieces (i.e. splitting on the 2 blank lines to get each piece of software). Finally, you reach some slightly inconsistent data. Sometimes you have the software name, a blank line, then patch level information. But for one, there is no blank line separator. So, rather than splitting on the blank line, you should just pull the first non-blank line from each record to get the software package name. All that's left is your patch info.
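
    The three steps above can be sketched roughly as follows. This is only one way to do it, under the assumption that the real file looks like the snippet in the question; parse_report is a hypothetical helper name, and the header pattern (five or more dashes) is a guess that may need tightening for the full 30-server file.

```perl
use strict;
use warnings;

# parse_report() is an illustrative name, not something from the thread.
# It returns a reference to: host => { software name => [ patch lines ] }
sub parse_report {
    my ($text) = @_;
    my %servers;

    # Step 1: split on the dashes / hostname / dashes header. Because the
    # hostname is captured, split returns it as well, so the list looks
    # like: (text before first header, host, chunk, host, chunk, ...).
    my @parts = split /^-{5,}[ \t]*\n[ \t]*(\S+)[ \t]*\n-{5,}[ \t]*\n/m, $text;
    shift @parts;    # drop whatever preceded the first header

    while ( my ( $host, $chunk ) = splice @parts, 0, 2 ) {
        $chunk = '' unless defined $chunk;
        $chunk =~ s/\A\s+//;    # ignore blank lines right after the header

        # Step 2: software records are separated by two or more blank lines.
        for my $record ( split /\n(?:[ \t]*\n){2,}/, $chunk ) {

            # Step 3: the first non-blank line is the software name,
            # whether or not a blank line follows it; the rest is patch info.
            my @lines = grep { /\S/ } split /\n/, $record;
            next unless @lines;
            my $software = shift @lines;
            s/^\s+|\s+$//g for $software, @lines;
            $servers{$host}{$software} = \@lines;
        }
    }
    return \%servers;
}

# A shortened copy of the sample from the question. Note that Server2 has
# no blank line after the software name.
my $report = <<'END';
----------------------------
Server1.dom.com
----------------------------

WINDOWS 2000 SERVER SP2

WARNING MS01-022 Q296441
Patch NOT Found MS02-001 Q311401


Internet Explorer 5.5 SP2

Patch NOT Found MS02-005 Q316059


----------------------------
Server2.dom.com
----------------------------


WINDOWS 2000 SERVER SP2
WARNING MS01-022 Q296441
Patch NOT Found MS02-006 Q314147
END

my $servers = parse_report($report);
for my $host ( sort keys %$servers ) {
    for my $software ( sort keys %{ $servers->{$host} } ) {
        my $count = @{ $servers->{$host}{$software} };
        print "$host / $software: $count line(s)\n";
    }
}
```

    Taking the first non-blank line as the name, instead of splitting on the single blank line, is what lets the Server2 record parse the same way as the Server1 records.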

    If you break it down like this, data munging becomes easy, as long as you know enough about regular expressions, which you can learn from many of the good Perl books or the links stephen mentions.


    Or you can just use tachyon's code above -- and hope that every time you need something like this, someone will be just as helpful. :)
      Thank you for your post. You made the task look a lot clearer. Cheers
Re: Tricky File Parsing
by stephen (Priest) on Apr 08, 2002 at 03:02 UTC

    I'm afraid I --ed this node because it sounds like an attempt to get us to do the work for you.

    You don't give us enough information to answer questions. You say you are "having problems taking a server name and parsing the file matching the servername and storing the security information with the associated Q# for IE, OS. etc..." What sort of problems are you having? What have you done to solve the problems, and what has been the result?

    In terms of starting points, I'd look at regular expressions, references, and complex data structures, since together those will give you what you need. Also you might want to consider Parse::RecDescent.
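
    As a rough illustration of how those three pieces fit together, here is a minimal sketch that pulls the Q number out of each line with a regular expression and stores it in a nested hash through references. The %findings input, the %missing name, and the single hard-coded server are assumptions made up for the example; the patch lines are copied from the question.

```perl
use strict;
use warnings;

# Illustrative input: one server's raw lines, keyed by software name.
my %findings = (
    'WINDOWS 2000 SERVER SP2' => [
        'WARNING MS01-022 Q296441',
        'Patch NOT Found MS02-001 Q311401',
    ],
    'Internet Explorer 5.5 SP2' => [
        'Patch NOT Found MS02-005 Q316059',
    ],
);

my $server = 'Server1.dom.com';
my %missing;    # server => { software => [ Q numbers ] }

for my $software ( keys %findings ) {
    for my $line ( @{ $findings{$software} } ) {

        # Regular expression: a bulletin ID such as MS02-001, whitespace,
        # then the Q number, which is the only part captured.
        next unless $line =~ /\bMS\d{2}-\d{3}\s+(Q\d+)\b/;

        # Complex data structure: Perl creates the inner hash and array
        # references on first use (autovivification).
        push @{ $missing{$server}{$software} }, $1;
    }
}

# Walk the structure back out, one line per software package.
for my $software ( sort keys %{ $missing{$server} } ) {
    print "$server, $software, @{ $missing{$server}{$software} }\n";
}
```

    Each level is just a hash whose values are references to the next level down, so the same loops extend to all 30 servers by adding an outer loop over server names.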

    stephen

      It might have come across that way, but that is far from the truth. I am fairly new and am having problems with associative arrays. Do you think the -7 is fair for my posting? Maybe I will just look elsewhere for Perl help. Cheers