Perl is ideal for comparing strings. That being said, I would either do the comparisons on strings, or I'd do all of the work in the database, if I wanted to deal with it as records.
We assume we have a function get_records( $sql ) that returns a single scalar for each record, given the sql string, you'd use something like the following general logic: (you may have to adjust to deal with your database's concatenation operator, and select a deliminator that doesn't occur in your fields)
my $fields = join "||'|'||", qw( field1, field2, field3); my %set1 = map { $_ => 1 } get_records ( "SELECT $fields FROM table WHERE data_set = 'set1'"); my %set2 = map { $_ => 1 } get_records ( "SELECT $fields FROM table WHERE data_set = 'set2'"); my @set2not1 = grep { ! exists($set1{$_}) } keys %set2; my @set1not2 = grep { ! exists($set2{$_}) } keys %set1;
You can then split the records back out into their individual fields. But there's no reason to go and load all of the records into memory, when you already have them in the database, which is made specifically to handle dealing with records. If you're using a database that can take subqueries, you can do the relationship in one statement. If you're not, you need some temporary tables or views --
CREATE VIEW set1 AS SELECT field1,field2,field3 FROM table WHERE data_set = 'set1'; CREATE VIEW set2 AS SELECT field1,field2,field3 FROM table WHERE data_set = 'set2';
You can then use a non-equijoin in Oracle, or a right join in mySQL, or whatever you have in your particular database to get a list of records that don't have a correlation.
Oh -- and you might want to look over the documentation for DBI, because fetchrow_array() returns a list for each element, which you're then evaluating in scalar context, and would get the count of the fields for each item. You probably wanted fetchrow_arrayref()
In reply to Re: Compare records from different data sets
by jhourcle
in thread Compare records from different data sets
by Anonymous Monk
| For: | Use: | ||
| & | & | ||
| < | < | ||
| > | > | ||
| [ | [ | ||
| ] | ] |