Sunnmann has asked for the wisdom of the Perl Monks concerning the following question:

I am in need of some help on changing the way I am used to parsing some data. I have no problem getting the data I want from a file, but what I want this time is for the script to tell me which systems are missing a certain string of data. Once again I am in over my head and cannot figure this one out. I have tried regular expressions inside an “if” statement, to no avail. Here is some sample data.

File one: One.log (input)
Computer1
KB893756
KB896422
KB896423
KB899588
KB899591
KB921883

Computer2
KB893756
KB896422
KB896423
KB899591
KB917159
KB921883

Computer3
KB893756
KB896422
KB899588
KB899591
KB917159
KB921883


This is the parsed data I am able to get myself from a much larger file. What I would like to do now is check each grouping against a list and have the script tell me which patches a computer does not have in its list. The number of patches on each computer is not fixed; what each machine does and does not have can and will change.

File two: Two.txt (input, patch list)
KB893756
KB893803
KB896422
KB896423
KB899588
KB899591
KB911927
KB921883


Could I get some help on how someone would go about doing this? If you do not feel like doing it for me, please get me started with some specific pointers if you are willing.

I need this written to a new file. The patch list does not have to be a file if that is easier, but other patches will be added to the list as we need them.

Replies are listed 'Best First'.
Re: Finding out what computer does NOT have certain data
by NetWallah (Canon) on Sep 01, 2006 at 04:36 UTC
    Another option is to use List::Compare :
    Untested pseudocode ..
    use List::Compare;

    my @required_patches        = qw(patch numbers that are required);
    my @current_machine_patches = ();   # fill in: the patches found on one machine

    my $lc = List::Compare->new( {
        lists       => [ \@required_patches, \@current_machine_patches ],
        accelerated => 1,
        unsorted    => 1,
    } );

    # items that appear only in the first (left) list, i.e. required but not installed
    my @missing_patches = $lc->get_Lonly;

         "For every complex problem, there is a simple answer ... and it is wrong." --H.L. Mencken

Re: Finding out what computer does NOT have certain data
by Fletch (Bishop) on Sep 01, 2006 at 04:04 UTC

    Look in perlfaq4 for "How do I compute the difference of two arrays?".
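A minimal sketch of the hash-lookup idiom that this kind of array-difference question usually leads to, using the sample data from the original question. The helper name `missing_from` is hypothetical (it is not from perlfaq4 or the thread), and the sketch assumes the patch names are plain strings with no surrounding whitespace.

```perl
use strict;
use warnings;

# Return the elements of @$required that do not appear in @$installed,
# keeping the order of @$required. (Hypothetical helper name.)
sub missing_from {
    my ( $required, $installed ) = @_;
    my %have = map { $_ => 1 } @$installed;    # lookup table of installed patches
    return grep { !$have{$_} } @$required;
}

my @required = qw(
    KB893756 KB893803 KB896422 KB896423
    KB899588 KB899591 KB911927 KB921883
);
my @computer1 = qw( KB893756 KB896422 KB896423 KB899588 KB899591 KB921883 );

my @missing = missing_from( \@required, \@computer1 );
print "Computer1 is missing: @missing\n";
```

With the sample data this should print `Computer1 is missing: KB893803 KB911927`. Building the lookup hash once per computer keeps each membership check cheap, which matters more as the patch list grows.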

Re: Finding out what computer does NOT have certain data
by graff (Chancellor) on Sep 01, 2006 at 04:17 UTC
    Let's suppose you read your "file two" first, like this:
    open( my $patch_fh, "<", "Two.txt" ) or die "Two.txt: $!";
    my @patches = sort <$patch_fh>;
    chomp @patches;
    close $patch_fh;

    my %patch_id;
    my $seq_num = 0;
    $patch_id{$_} = $seq_num++ for ( @patches );
    That assigns a numeric ID (from 0 to N-1) to each of the N patch names listed in that file, in sorted order. Now read "file one" like this:
    my %comp_patch;   # will be a hash of arrays
    my $comp_name;    # will store most recently seen computer name

    open( my $comp_fh, "<", "One.log" ) or die "One.log: $!";
    while (<$comp_fh>) {
        chomp;
        if ( /^KB\d+/ ) {   # this is a patch name
            if ( not exists( $patch_id{$_} )) {
                warn "Computer $comp_name has unknown patch: $_\n";
                next;   # (might want to do something else here)
            }
            $comp_patch{$comp_name}[$patch_id{$_}] = $_;
        }
        elsif ( /^\S+/ ) {  # not a patch name, must be a computer name
            $comp_name = $_;
            $comp_patch{$comp_name} ||= [];   # report this computer even if it has no known patches
        }
    }
    close $comp_fh;
    For each computer, you now have an array of patches, with array elements 0 through N-1. But if a given computer did not have a particular patch, the corresponding array index in the HoA structure will be undef. So that is what you check for, to see which patches are missing from each computer:
    for my $cname ( sort keys %comp_patch ) {
        for my $pnum ( 0 .. $#patches ) {
            print "$cname lacks $patches[$pnum]\n"
                unless ( defined( $comp_patch{$cname}[$pnum] ));
        }
    }
    (untested)
Re: Finding out what computer does NOT have certain data
by McDarren (Abbot) on Sep 01, 2006 at 04:56 UTC
    This is what I would do:
    • Create a HoH (hash of hashes), with the first level keys being your computer names, and the second level keys being the patches installed on each computer.
    • Assign the list of required patches to an array.
    • Then iterate through the array using a for loop, and within that check whether the current element exists as a secondary key within your hash for each computer.

    I hope that makes sense. Here is some code to demonstrate:

    #!/usr/bin/perl -w
    use strict;

    my %computers;
    my $computer;
    my @required_patches = qw(
        KB893756 KB893803 KB896422 KB896423
        KB899588 KB899591 KB911927 KB921883
    );

    while (my $line = <DATA>) {
        chomp($line);
        if ($line =~ /^KB\d{6}/) {
            $computers{$computer}{$line}++;
        }
        else {
            $computer = $line;
        }
    }

    for my $patch (@required_patches) {
        for my $computer (keys %computers) {
            if (!exists $computers{$computer}{$patch}) {
                print "$computer is missing patch $patch\n";
            }
        }
    }

    __DATA__
    Computer1
    KB893756
    KB896422
    KB896423
    KB899588
    KB899591
    KB921883

    Computer2
    KB893756
    KB896422
    KB896423
    KB899591
    KB917159
    KB921883

    Computer3
    KB893756
    KB896422
    KB899588
    KB899591
    KB917159
    KB921883
    Which prints:
    Computer3 is missing patch KB893803
    Computer2 is missing patch KB893803
    Computer1 is missing patch KB893803
    Computer3 is missing patch KB896423
    Computer2 is missing patch KB899588
    Computer3 is missing patch KB911927
    Computer2 is missing patch KB911927
    Computer1 is missing patch KB911927

    Cheers,
    Darren :)

      So I get my computer information from a script I have created that gathers the information and parses through another file to get the list I gave you guys (of course it is only a snippet of my list, because I thought putting in 150+ computers' worth of data was a bit much).

      In your code, you have the line, while (my $line = <DATA>) {

      Would the <DATA> be me opening the file beforehand, with a statement like this?

      open (DATA, "One.txt") or die "One.txt: $!";
      while (my $line = <DATA>) {
          chomp($line);
          if ($line =~ /^KB\d{6}/) {
              $computers{$computer}{$line}++;
          }
          else {
              $computer = $line;
          }
      }
        I have it and it works now. The end result is an Excel sheet that outputs the information in alphabetical order. SWEET!

        Though I still want to learn more, so I am trying the rest of the suggestions here. Have to see what I can and cannot understand.
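On the <DATA> question in this sub-thread: a minimal sketch, assuming the same "computer name, then one patch per line" layout, of reading from an ordinary file through a lexical filehandle and writing the report to a new file. The helper names `parse_inventory` and `write_report`, and the idea of passing the two file names on the command line, are assumptions made for illustration; they do not come from the thread.

```perl
use strict;
use warnings;

# Parse "computer name, then one patch per line" from any filehandle.
# (Hypothetical helper; works the same for DATA or an opened file.)
sub parse_inventory {
    my ($fh) = @_;
    my ( %computers, $computer );
    while ( my $line = <$fh> ) {
        chomp $line;
        next unless length $line;    # skip blank separator lines
        if ( $line =~ /^KB\d{6}/ ) {
            $computers{$computer}{$line}++ if defined $computer;
        }
        else {
            $computer = $line;       # anything else is taken as a computer name
        }
    }
    return \%computers;
}

# Print one "X is missing patch Y" line per gap to the given filehandle.
sub write_report {
    my ( $out, $computers, @required ) = @_;
    for my $computer ( sort keys %$computers ) {
        for my $patch (@required) {
            print {$out} "$computer is missing patch $patch\n"
                unless exists $computers->{$computer}{$patch};
        }
    }
}

my @required_patches = qw(
    KB893756 KB893803 KB896422 KB896423
    KB899588 KB899591 KB911927 KB921883
);

# Assumed usage: perl script.pl One.log missing.txt
if ( @ARGV == 2 ) {
    my ( $in_name, $out_name ) = @ARGV;
    open my $in,  '<', $in_name  or die "$in_name: $!";
    open my $out, '>', $out_name or die "$out_name: $!";
    write_report( $out, parse_inventory($in), @required_patches );
    close $out or die "$out_name: $!";
}
```

Taking a filehandle rather than a file name keeps the parsing code indifferent to where the data comes from, so the same routine can be fed `\*DATA` while experimenting and a real file later.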
Re: Finding out what computer does NOT have certain data
by Old_Gray_Bear (Bishop) on Sep 01, 2006 at 04:48 UTC
    For a small number of machines (<= 10):
    1. Assign IDs to the machines by increasing powers of 2 (machine1 =>1; machine2=>2; machine3=>4;etc)
    2. Build a hash keyed by the patch-id, whose value is the sum of the machine-ids of the machines that have that patch applied (for example: $patchs{$patch_id} += $machine_id;).
    3. Sort the hash by value, in descending order, and display in the medium of your choice.
    4. The patches at the top of the list are on more machines than those lower down.
    5. Sum the machine-ids you assigned in step one (for n machines, S = 1 + 2 + 4 + ... + 2^(n-1) = 2^n - 1).
    6. For each patch-id in the hash whose value is less than S, there is a unique combination of powers of 2 that add up to the difference. Those machines do not have this patch.
    7. Apply the patches
    8. Repeat this process until there are no patches left to apply. (This is to account for the case which __never__ happens in real life -- PatchA requires Patches B through F, and PatchC wasn't on your original list...)
    For more than ten machines, I'd split them up into groups of fewer than ten each, and run each group separately. Once I had each group sorted out and patched, I'd make a final run against a group created by taking one machine from each of my original groups.
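Steps 1, 2, 5 and 6 above can be sketched roughly as follows, using the sample data from the original question. This is an illustration under assumptions, not the poster's own code: the `%installed` layout and the helper name `machines_missing` are made up for the example, and it assumes each patch is listed at most once per machine (otherwise the sums would be wrong).

```perl
use strict;
use warnings;

# Sample data from the original question (assumed layout: computer => patch list).
my %installed = (
    Computer1 => [qw( KB893756 KB896422 KB896423 KB899588 KB899591 KB921883 )],
    Computer2 => [qw( KB893756 KB896422 KB896423 KB899591 KB917159 KB921883 )],
    Computer3 => [qw( KB893756 KB896422 KB899588 KB899591 KB917159 KB921883 )],
);

# Step 1: give each machine its own power of two (1, 2, 4, ...).
my @machines = sort keys %installed;
my %machine_id;
$machine_id{ $machines[$_] } = 1 << $_ for 0 .. $#machines;

# Step 2: for each patch, sum the IDs of the machines that have it.
my %patchs;
for my $machine (@machines) {
    $patchs{$_} += $machine_id{$machine} for @{ $installed{$machine} };
}

# Step 5: S is the sum of all machine IDs.
my $S = 0;
$S += $_ for values %machine_id;

# Step 6: the bits set in (S - value) identify the machines lacking the patch.
# (Hypothetical helper; a patch seen on no machine counts as value 0.)
sub machines_missing {
    my ($patch) = @_;
    my $gap = $S - ( $patchs{$patch} || 0 );
    return grep { $gap & $machine_id{$_} } @machines;
}

for my $patch ( sort keys %patchs ) {
    my @lacking = machines_missing($patch);
    print "$patch is missing from: @lacking\n" if @lacking;
}
```

With the sample data this should report KB896423 missing from Computer3, KB899588 from Computer2, and KB917159 from Computer1. Note that the summed value does not itself count machines (a patch present only on the machine with ID 4 scores higher than one present on machines 1 and 2), so a true "most widely installed" ranking would need to count the set bits instead.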

    This is not going to be a simple process; and if you have a lot of machines, it will be basically a non-starter. In that case, I'd throw up my hands, build a "golden machine" with all of the patches applied, and then clone its system disk to all the other machines. Come to think of it, that might be the best policy from the get-go.

    Good Luck & Good Hunting

    ----
    I Go Back to Sleep, Now.

    OGB

      @ Gray Bear

      Oh how I wish it could be that easy. Our problem is that we have over 150 machines that need to be checked. We do have images for each type of system (oh ya, we have 4 different computer types, so that means 4 different images), but things get convoluted when a system goes down after we have patched it and before we have taken a new image.

      So the best way I see right now is a system such as this, where I check each system against a list and see which systems need which patch.
Re: Finding out what computer does NOT have certain data
by GrandFather (Saint) on Sep 01, 2006 at 07:25 UTC

    A solution that doesn't use the second file, but builds the property table while reading the first file. This assumes that all required properties are present somewhere in the first file.

    use strict;
    use warnings;

    my %properties;
    my @computers;

    while (<DATA>) {
        chomp;
        next if ! length;
        push @computers, $_;
        while (<DATA>) {
            chomp;
            last if ! length;
            push @{$properties{$_}}, $computers[-1];
        }
    }

    for my $computer (@computers) {
        my @missing = grep {! grep {$_ eq $computer} @{$properties{$_}}} keys %properties;
        $computer = [$computer, [@missing]];
    }

    print join "\n", map {"$_->[0]: @{$_->[1]}"} @computers;

    __DATA__
    Computer1
    KB893756
    KB896423
    KB899588
    KB899591
    KB921883

    Computer2
    KB893756
    KB896422
    KB896423
    KB899591
    KB917159
    KB921883

    Computer3
    KB893756
    KB896422
    KB899588
    KB899591
    KB917159
    KB921883

    Prints:

    Computer1: KB896422 KB917159
    Computer2: KB899588
    Computer3: KB896423

    Note: computer one's data was edited slightly from OP's data so that it misses two properties.


    DWIM is Perl's answer to Gödel
Re: Finding out what computer does NOT have certain data
by radiantmatrix (Parson) on Sep 01, 2006 at 15:13 UTC

    Hm, I'd simply store the patch list in a hash and check on the fly.

    #!/usr/bin/perl
    use strict;
    use warnings;

    my $patchlist_file = shift @ARGV;
    my %patchlist;

    # read patchlist
    open my $PATCHLIST, '<', $patchlist_file
        or die "Can't read $patchlist_file: $!";
    while (<$PATCHLIST>) {
        s/^\s+|\s+$//g;       # trim head/tail whitespace incl. newline
        next unless length;   # ignore blank lines
        $patchlist{$_} = 1;
    }
    close $PATCHLIST;

    # now check files against the patchlist
    foreach my $file (@ARGV) {
        open my $FILE, '<', $file or do {
            warn "Unable to read $file: $!";
            next;
        };
        local $/ = "\n\n";    # records separated by double-newline
        while (<$FILE>) {
            # first line is the computer name, every following line is a patch
            # (no split limit, so each patch becomes its own list element)
            my ($computer, @patches) = split /\n/, $_;
            my %patch = map { $_ => 1 } @patches;
            my @missing;

            # see if each patch is there
            for (sort keys %patchlist) { push @missing, $_ unless $patch{$_} }

            # report
            printf '%s is missing %d patches.', $computer, scalar @missing;
            # outer parentheses keep "\n" inside print's argument list
            print( (@missing ? ' They are: ' . join(',', @missing) : ''), "\n" );
        }
        close $FILE;
    }
    <radiant.matrix>
    A collection of thoughts and links from the minds of geeks
    The Code that can be seen is not the true Code
    I haven't found a problem yet that can't be solved by a well-placed trebuchet