Comparing/Completing Hashes

jjw92 has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
Re: Comparing/Completing Hashes by BrowserUk (Patriarch) on Aug 14, 2010 at 00:16 UTC
Geez! What a lot of non-answers for a simple question.

```
#! perl -slw
use strict;
use Data::Dump qw[ pp ];

my @data = (
    {"name"=>"joe","age"=>21,"weight"=>150,"height"=>"","sex"=>""},
    {"name"=>"joe","age"=>"","weight"=>"","height"=>"6'0","sex"=>""},
    {"name"=>"joe","age"=>"21","weight"=>"","height"=>"","sex"=>"male"},
);

my %full;
for my $p ( @data ) {
    $full{ $_ } ||= $p->{ $_ } for keys %$p;
}

pp \%full;

__END__
c:\test>junk17
{ age => 21, height => "6'0", name => "joe", sex => "male", weight => 150 }
```

Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
RIP an inspiration; A true Folk's Guy
Re: Comparing/Completing Hashes by roboticus (Chancellor) on Aug 14, 2010 at 00:24 UTC
jjw92:

Just a minor quibble: I wouldn't use empty strings to signify missing values. I use the default undef value for that (and in a hash, I'll often not insert the key for a missing value, either). In the general case, you may find that an empty string is a legitimate value, and then you wouldn't be able to tell the difference between a missing value and an intentional empty string.

In the case you've presented, it appears that it wouldn't matter, as an empty string wouldn't normally be a valid value for age, height, weight or sex. But if you had a "nickname" field, for instance, a person with no nickname might be represented with an empty string. So if one of your data sources doesn't provide a nickname, stuffing the nickname with empty strings would imply to your database that those people have no nickname.

Why would it matter? Think of three questions:

(1) How many people in your database have nicknames?
(2) How many people don't have nicknames?
(3) How many people have we asked?

Your method would allow you to answer question 1--mostly. But you couldn't legitimately answer questions 2 or 3. Using undef instead of an empty string would let you answer questions 2 and 3, as well as give you a better answer for question 1 (out of X people that we've asked, Y have a nickname).

Another way it could matter: In the problem you're solving now, you're merging information from multiple records. If one record is missing a data item, you can overlay the missing field from the other record. But what do you do if there's a conflict between two data items? You'll need to figure out how to resolve those differences. Throw in the fact that a nickname may be explicitly blank, and your code wouldn't be able to tell that the person no longer uses a nickname. Instead, they'll be stuck with their old nickname, since you wouldn't be able to overlay it with an empty string: it would be treated as a missing value and ignored...

Sorry for the long, rambling node. I haven't the time to make it short and concise.

...roboticus
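To illustrate the undef-versus-empty-string distinction, here is a minimal sketch, not anything roboticus posted. The records and the nickname field are made up for the example, and it assumes Perl 5.10 or later for the defined-or assignment operator `//=`. Missing data is stored as undef (or the key is left out), so an empty string can be kept as a real value.

```perl
use strict;
use warnings;
use 5.010;    # the // and //= operators need Perl 5.10 or later

# Hypothetical records: undef (or an absent key) means "not known",
# while '' is a real value meaning "explicitly blank".
my @records = (
    { name => 'joe', age => 21,    nickname => undef  },
    { name => 'joe', age => undef, nickname => ''     },
    { name => 'joe',               nickname => 'joey' },
);

my %merged;
for my $rec (@records) {
    for my $key ( keys %$rec ) {
        # //= assigns only when the left side is undef, so the first
        # defined value wins, and '' counts as a defined value.
        $merged{$key} //= $rec->{$key};
    }
}

for my $key ( sort keys %merged ) {
    my $value = $merged{$key};
    printf "%-8s => %s\n", $key, defined $value ? "'$value'" : 'undef';
}
```

With these records the merged nickname ends up as the empty string, because the explicit blank in the second record is treated as data rather than as a gap. "First defined value wins" is only one possible policy; it does nothing to detect or resolve the conflicts mentioned above.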
Re^2: Comparing/Completing Hashes by jjw92 (Novice) on Aug 14, 2010 at 00:42 UTC
I see the merit in this, but it isn't really applicable in this case. I just made up these hashes/keys/values from what I could think of on the spot. What I am actually dealing with is an array of hashes returned by slurping a .csv file. That means every field with no data will be an empty string, "", and every piece of data will correspond to the correct header. I like the thought though, and it certainly would apply in many other cases. Thanks for the response.
Re: Comparing/Completing Hashes by choroba (Cardinal) on Aug 13, 2010 at 23:26 UTC
Almost a one-liner: `my %h; @h{keys %$_} = values %$_ for @data;` But, I'd rather not use it, because it does not check for a missing name, several different values for a value etc. But I will keep the sophisticated version secret to let you devise it yourself.
Re^2: Comparing/Completing Hashes by AnomalousMonk (Archbishop) on Aug 13, 2010 at 23:55 UTC
"[I]t does not check for ... several different values for a [key] ..."

... or for identical values for the same key in different anonymous hashes when those values are the empty string, as most values are in the example data.
Re^3: Comparing/Completing Hashes by choroba (Cardinal) on Aug 13, 2010 at 23:59 UTC
Covered by "etc" ;)
Re^2: Comparing/Completing Hashes by jjw92 (Novice) on Aug 14, 2010 at 00:02 UTC
When I try that, it only stores the last hash. Maybe it is just a formatting issue? If I knew more about what it was doing, I could tell. I'm not sure.
Re^3: Comparing/Completing Hashes by choroba (Cardinal) on Aug 14, 2010 at 00:34 UTC
As AnomalousMonk noted, you would have to avoid setting the empty values. But as I noted, you should rather not use this approach.
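A minimal sketch of what "avoid setting the empty values" could look like with the hash-slice idea. This is an illustration, not choroba's withheld "sophisticated version": it still does not check for a missing name or for conflicting values, and a later non-empty value silently overwrites an earlier one.

```perl
use strict;
use warnings;

my @data = (
    { name => 'joe', age => 21,   weight => 150, height => '',    sex => ''     },
    { name => 'joe', age => '',   weight => '',  height => "6'0", sex => ''     },
    { name => 'joe', age => '21', weight => '',  height => '',    sex => 'male' },
);

my %h;
for my $rec (@data) {
    # Keep only the keys whose value in this record is non-empty.
    my @filled = grep { defined $rec->{$_} && $rec->{$_} ne '' } keys %$rec;

    # Hash slice: copy just those keys, so '' never overwrites real data.
    @h{@filled} = @{$rec}{@filled};
}

print "$_ => $h{$_}\n" for sort keys %h;
```

The unfiltered one-liner keeps only the last hash because every record carries all five keys, so each pass overwrites the whole of %h, empty strings included.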
Re: Comparing/Completing Hashes by Anonymous Monk on Aug 13, 2010 at 23:10 UTC
"Any input would be greatly appreciated. Thanks"

Great, you have a goal, now simply lay out the steps to accomplish that goal, translate that to code, and you're finished :)

See How (Not) To Ask A Question.
Re: Comparing/Completing Hashes by AnomalousMonk (Archbishop) on Aug 13, 2010 at 23:21 UTC
Sorry for all the backslashed double-quotes:

```
>perl -wMstrict -le
"my @data = (
   { \"name\" => \"joe\", \"age\" => 21,     \"weight\" => 150,  \"height\" => \"\",      \"sex\" => \"\",       },
   { \"name\" => \"joe\", \"age\" => \"\",     \"weight\" => \"\", \"height\" => \"6'0\",   \"sex\" => \"\",       },
   { \"name\" => \"joe\", \"age\" => \"21\",   \"weight\" => \"\", \"height\" => \"\",      \"sex\" => \"male\",   },
   );
 my %hash =
   map {
     my $hr = $_;
     map { $_, $hr->{$_} }
     grep $hr->{$_} ne '',
     keys %$hr
     }
   @data
   ;
 use Data::Dumper;
 print Dumper \%hash;
"
$VAR1 = {
          'name' => 'joe',
          'weight' => 150,
          'sex' => 'male',
          'height' => '6\'0',
          'age' => '21'
        };
```
Re^2: Comparing/Completing Hashes by jjw92 (Novice) on Aug 13, 2010 at 23:32 UTC
I am still in the process of learning Perl. Could you please explain to me what is actually happening with the map{} process?
Re^3: Comparing/Completing Hashes by planetscape (Chancellor) on Aug 14, 2010 at 06:13 UTC
See also: Map: The Basics

HTH,

planetscape
Re^3: Comparing/Completing Hashes by AnomalousMonk (Archbishop) on Aug 13, 2010 at 23:59 UTC
See map.
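For readers with the same question, the nested map/grep in AnomalousMonk's reply can be unrolled into explicit loops. The following is a sketch of an equivalent, not code from the thread; the comments name the piece of the original expression each line stands in for.

```perl
use strict;
use warnings;

my @data = (
    { name => 'joe', age => 21,   weight => 150, height => '',    sex => ''     },
    { name => 'joe', age => '',   weight => '',  height => "6'0", sex => ''     },
    { name => 'joe', age => '21', weight => '',  height => '',    sex => 'male' },
);

my %hash;
for my $hr (@data) {                  # outer map: visit each hash reference
    for my $key ( keys %$hr ) {       # keys %$hr: every field of this record
        next if $hr->{$key} eq '';    # grep: drop fields whose value is empty
        $hash{$key} = $hr->{$key};    # inner map: emit a (key, value) pair
    }
}

print "$_ => $hash{$_}\n" for sort keys %hash;
```

In the original, the outer map returns one flat list of key/value pairs from all three records and the assignment to %hash turns that list into a hash. When a key appears more than once in the list, the last pair wins, which is what the plain assignment in the loop reproduces.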