Need to compare 2 hash values

carlo1973 has asked for the wisdom of the Perl Monks concerning the following question:

Good day everyone!

I'm new to perl - when I say new - I mean novice to the nth extreme. If I had a towel I would put it over my head right about now so that the perl code I have been staring at for the past week couldn't hurt me any more. I do have programing experience but to be honest I haven't programed since something like 1993. Possibly even earlier than that.

I just finnished an IT course that was quite extensive although it was at a very fast pace. It lacked anything dealing with programing and only covered basic bash and batch scripting. I am now in my final week at a company for my internship, and since starting they have been getting me to work in of all things - PERL and MYSQL. Now the mysql I think I'm doing okay in but the perl I'm having more difficulty with.

The last project that I have been assigned as such is to work on a program that takes entries from a specific database, evaluates it for null or missing information and then displays it as a webpage. It also needs to compare the database to that of a csv file looking for information that may be missing from the database, again displaying the information in a webpage.

UPDATE: As discussed - the criteria has changed somewhat. no longer am I needed to evaluate for null values in the DB. I only need to compare the db to a csv file. Any entries that are in the csv but missing from the db should be reported. I've updated the code that I have here. I am not worried about displaying to a webpage yet. For now I will just be running it from the terminal for testing purposes.

Here is my code so far (Edited 05-27-09):

#!/usr/bin/perl
use strict;
#The Setup
use warnings;
use Text::CSV_XS;
use DBI;
my %conn_attrs = (RaiseError =>1, PrintError =>0, AutoCommit =>1);
my $dbh = DBI->connect('DBI:mysql:pstsize;host=localhost');
my $sth = $dbh->prepare("SELECT DISTINCT nbname,ipaddr from sizehist o
+rder by nbname ");
#Read the DB into a hash array
$sth->execute();
my $dbData = {};
while (my $db_ref = $sth->fetchrow_hashref ())
{
        $dbData->{$db_ref->{nbname}}->{$db_ref->{ipaddr}} = 0;
}
#Read CSV file into hash array using CSV_XS module
open my $io, "<", "use_me.txt" or die "use_me.txt: $!";
my $csv = Text::CSV_XS->new ({binary => 1, eol => "\n"});
while (my $row = $csv->getline ($io))
{
        if ( exists($dbData->{$row->[0]}->{$row->[1]}) )
        {
#                Mark the db rows that match.
        $dbData->{$row->[0]}->{$row->[1]} = 1;
        }
}
close $io;
#                Build output arrays.
my @match = ();
my @dontmatch = ();
for my $nbname (sort(keys(%$dbData)))
{
        for my $ipaddr (sort(keys(%{$dbData->{$nbname}})))
        {
        for my $nbname (sort(keys(%{$dbData->{$ipaddr}})))
                {
                if ( $dbData->{$nbname}->{$ipaddr} == 1 )
                {
                push (@match, [$nbname, $ipaddr]);
                } else {
                push (@dontmatch, [$nbname, $ipaddr]);
                        }
                }
        }
}
# Output:
print "These db rows match:\n";
map { print "$_->[0],$_->[1]\n" } @match;
print "\n";
print "These db rows don't match:\n";
map { print "$_->[0],$_->[1]\n" } @dontmatch;
exit;
[download]

I really hope I can get help with this. I'm really struggling with it. I ended up purchasing two books with the hopes that I would have this done by now (MySQL Developer's Library book - fourth edition, and 5th edition of Learning Perl).

Thank you to all the Monks, guru's, genies, and higher powers for the perl world in advance. This lowely newb is unworthy but hopes to get some assistance anyways :D

Carlo

Comment on Need to compare 2 hash values Download Code

Replies are listed 'Best First'.
Re: Need to search through a hashref array to find empty, null, or otherwise translucent key values. by Rodster001 (Pilgrim) on May 26, 2009 at 22:41 UTC
One thing I noticed at first glance is you have the "use strict" line commented out. If you are new to perl one of the first things you want to do is get in the good habit of using "strict" and "warnings". If you are getting errors and commenting those lines out to get around them, don't. Find out why you are getting the errors and fix those. It's a good road to be on to solve most of your problems (may or may not apply to this particular one, I'm just talking good habit here).	[reply]
Re: Need to search through a hashref array to find empty, null, or otherwise translucent key values. by GrandFather (Saint) on May 27, 2009 at 00:29 UTC
The bits you have commented out may be more important than many of the bits you left in. Most important (as already mentioned) is that you have commented out the 'use strict;' line. Why? `#my file_ref = ();` is an interesting commented line because it doesn't compile uncommented. However later in the code file_ref is used as a hash, although the name suggests a reference to something. Later `#my %file_ref;` occurs which looks like it's related and compiles correctly uncommented - why is it commented out? Uncommenting the %file_ref declaration fixes the compile error that the use strict line was probably commented out for. The code you present is not novice code. On the other hand there are large swathes of code missing. Where did the original code come from and how did it get into its current state? What are the explicit problems that you are having? True laziness is hard work	[reply] [d/l] [select]
Re^2: Need to search through a hashref array to find empty, null, or otherwise translucent key values. by carlo1973 (Initiate) on May 27, 2009 at 18:20 UTC
Sorry for the confusion of the coments in my code. What I was doing was researching things on the net - trying my best to solve the issue myself before bothering anyone. What I would would be, if I found something that worked "sort-of" then I would leave it in the code but coment it out, till I found something that performed better. This way I wouldn't loose anything and I could go back and fix things. It was for my own reference as to what I've changed. Unfortunatly no one else can read my train of thought lol I should have removed those lines before posting them ^^	[reply]
Re: Need to search through a hashref array to find empty, null, or otherwise translucent key values. by toolic (Bishop) on May 26, 2009 at 23:33 UTC
You did not disclose any specific details regarding your problem. Do you get error or warning messages? Do your variables contain unexpected data? Since we can not run your code, let us know. Since you are new to Perl, perhaps the Basic debugging checklist can help. By the way, the name of the language is Perl (not PERL).	[reply]
Re: Need to search through a hashref array to find empty, null, or otherwise translucent key values. by whakka (Hermit) on May 27, 2009 at 04:59 UTC
This looks less like a Perl problem than a clarification of what you're trying to do (although I would heed advice re un-commenting useful code and perhaps deleting some confusing code). You're printing a .csv file to STDOUT of the tuples you're selecting but you say you want to report a) missing users and names and b) see if that missing info is located in a .csv file. I would re-write it by loading in the .csv data first with two hashes: one for users and the other for names; both use ipaddress as the key. Then when you loop through the results of your SELECT statement, if you have missing values on either field you can simply do a lookup in the appropriate hash on the ipaddress. To report, use a subroutine that spits out html and call it when you find missing data. If it were up to me I would have just one report with additional fields indicating the (perhaps) non-missing .csv data. Narrowing the problem down though is the first thing, then asking more specific questions as they arise.	[reply]
Re: Need to search through a hashref array to find empty, null, or otherwise translucent key values. by scorpio17 (Canon) on May 27, 2009 at 15:49 UTC
Here's how I would do it: use strict; use Text::CSV_XS; use DBI; my $csv_inputfile = "input.csv"; # change me my %data; # hash for storing csv data my $csv = Text::CSV_XS->new; open my $fh, '<', $csv_inputfile or die "can't open $csv_inputfile: $! + \n"; while (my $line = <$fh>) { $line =~ s/^\s+//; # strip leading whitespace next unless $line; # ignore blank lines # assume CSV file looks like: ip, user, nbname if ( $csv->parse($line) ) { my ($ip, $user, $nbname) = $csv->fields(); next unless $ip; $data{$ip} = { USER => $user \|\| '', NBNAME => $nbname \|\| '', }; } else { my $error = $csv->error_diag(); print "Parse error: line $. : $error.\n"; } } close $fh; my $user = "db_user"; # change me my $pass = "db_pass"; # change me my $dbh = DBI->connect( "dbi:mysql:pstsize", $user, $pass, {RaiseError => 1, AutoCommit => 1}, ); unless ($dbh) { die "Could not get database handle."; } my $sth = $dbh->prepare( select_data() ); eval { $sth->execute(); }; if ($@) { die "Database error: ", $dbh->errstr; } print "<html><body><pre>\n"; while ( my ($ip, $user, $nbname) = $sth->fetchrow ) { unless ($user) { print "DB missing USER for $ip."; if ($data{$ip}{'USER'}) { print ".. but CSV file has value: $data{$ip}{'USER'}."; } print "\n"; } unless ($nbname) { print "DB missing NBNAME for $ip."; if ($data{$ip}{'NBNAME'}) { print ".. but CSV file has value: $data{$ip}{'NBNAME'}."; } print "\n"; } } print "</pre></body></html>"; $sth->finish; $dbh->disconnect; sub select_data { return <<" END_SQL"; SELECT DISTINCT ipaddr, user, nbname FROM sizehist ORDER BY ipaddr END_SQL } [download] Notes: Read the CSV file first, storing the data into a hash. The IP address is used as the hash key. Then each value in the hash is a reference to another hash, containing the data fields for a given record. Move the SQL stuff into its own subroutine. As you pull data from the DB, check the values, then compare against the data in your hash (from the CSV file). I put simple HTML tags around the output. There's a lot of room here to make it more pretty. I'd suggest maybe using HTML::Template (that way all the HTML is outside of your script). I added lots of error checking - that stuff's important, never skimp on it! I made some assumptions about your file formats, etc. So you may need to make some minor changes, but hopefully this is closer to what you need. Good luck!	[reply] [d/l]
Re^2: Need to search through a hashref array to find empty, null, or otherwise translucent key values. by carlo1973 (Initiate) on May 27, 2009 at 17:17 UTC
Thank you so much! I never expected such an overwelming response to my querry. :) It was asked about the problem I'm trying to solve - well in a nutshell we have a system that utilizes ldap and a homebrew solution that collects data on outlook pst files for all the users in the company. The homebrew program was supposed to be checking user name, ip address, netbios name, pst size and location, and a few other categories and storing them into a database (pstsize). We recently noticed that this database has been missing information lately. Some pc's basically have not reported back with the information so it's not getting recorded. In order to determine who's systems are not reporting, I've been asked to come up with a solution as part of my internship. I've made a bash file that would make a stripped down version of the master host file (use_me.txt) in csv format. We chose the master host file for the reason that everything is logged that goes through the main server. This unfortunatly includes our network printers and camera's but I managed to filter those out. The idea is to compare the entries in the database to that of use_me.txt. Those entries that are in use_me.txt but not in the database will be reported. Anything else can be ignored. Originally we also noticed a lot of NULL's in the username field of the DB. We were going to filter against it till we realised that it was because of new systems just being initialised and no username was provided for them. We removed those from the database all together and are not worried about those any longer. Unfortunatly we didn't realise it till after closing, and this is the first chance I've had to update my querry to everyone. There are a few variables that we haven't taken into account until today. One being something as benign (but possible issue in perl) of case sensitivity from the data supplied back from the database vs that of the csv file. Another possible varriable to contend with is that some of the entries in the csv do not follow nbname, ip address schema. For example: carlo,192.168.1.210 docindex2,docindex.com. However I don't think we're going to worry to much about the reports showing items similar to docindex2 as an item not found in the db. Its something that can be addressed later, perhaps with more filtering. But for now we'll leave them in Thanks again to everyone who posted back! It's much appreciated :) Oh as for why I commented out #strict - well yeah I was trying to cheat <slapping fingers with ruller> I will fix that right away :)	[reply]