find unique values from a file by comparing two columns

sreeragtk86 has asked for the wisdom of the Perl Monks concerning the following question:

Hi,

I am trying to get the unique values of column 2 for each value of column 3 (fields are delimited by ':').

For example, here is a sample file:

D1:111:92111
D2:112:92111
D3:111:92111
D4:112:92111
D5:111:90222
D6:112:90222
D7:111:90222
D8:112:90222

The output should be

92111 has unique values 111,112

90222 has unique values 111,112

Regards

Replies are listed 'Best First'.
Re: find unique values from a file by comparing two columns by AnomalousMonk (Archbishop) on Sep 26, 2014 at 11:39 UTC
One way:

    c:\@Work\Perl\monks>perl -wMstrict -MData::Dump -le
    "my @recs = qw(
       D1:111:92111  D2:112:92111  D3:111:92111  D4:133:92111
       D5:111:90222  D6:112:90222  D7:111:90222  D8:112:90222
       );
     ;;
     my %uniques;
     for my $rec (@recs) {
         my (undef, $field1, $field2) = split /:/, $rec;
         ++$uniques{$field2}{$field1};
     }
     dd \%uniques;
     ;;
     for my $field2 (reverse sort keys %uniques) {
         my $field1s = join ',', sort keys %{ $uniques{$field2} };
         print qq{'$field2' has unique values '$field1s'};
     }
    "
    {
      90222 => { 111 => 2, 112 => 2 },
      92111 => { 111 => 2, 112 => 1, 133 => 1 },
    }
    '92111' has unique values '111,112,133'
    '90222' has unique values '111,112'
Re^2: find unique values from a file by comparing two columns by sreeragtk86 (Initiate) on Sep 26, 2014 at 12:13 UTC
Thanks a ton AnomalousMonk! This works fine :)
Re: find unique values from a file by comparing two columns by pme (Monsignor) on Sep 26, 2014 at 16:02 UTC
Hi sreeragtk86,

If you store the values as hash keys, the hash itself enforces uniqueness.

    #!/usr/bin/perl -w
    use strict;
    use Data::Dumper;

    my %hash;
    while (<DATA>) {
        chomp;
        my @row = split(':');
        # column 3 => { column 2 => 1 }; repeated pairs overwrite the same key
        $hash{$row[2]}->{$row[1]} = 1;
    }
    print Dumper( \%hash ) . "\n";

    __DATA__
    D1:111:92111
    D2:112:92111
    D3:111:92111
    D4:112:92111
    D5:111:90222
    D6:112:90222
    D7:111:90222
    D8:112:90222

Regards
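
The same hash-of-hashes idea can also produce the exact report format asked for in the question. What follows is only a minimal sketch: the helper names `group_unique` and `report` are made up for illustration, and it assumes every line has three ':'-separated fields and that groups should be printed in the order they first appear in the input.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: collect the distinct column-2 values under each
# column-3 value. Returns the column-3 keys in first-seen order and a
# hash of { col3 => { col2 => count } }.
sub group_unique {
    my @lines = @_;
    my (%seen, @order);
    for my $line (@lines) {
        chomp $line;
        next unless length $line;                 # skip blank lines
        my (undef, $col2, $col3) = split /:/, $line;
        next unless defined $col3;                # skip malformed lines
        push @order, $col3 unless exists $seen{$col3};
        $seen{$col3}{$col2}++;
    }
    return (\@order, \%seen);
}

# Hypothetical helper: build one report line per column-3 value.
sub report {
    my ($order, $seen) = group_unique(@_);
    return map {
        my $values = join ',', sort keys %{ $seen->{$_} };
        "$_ has unique values $values";
    } @$order;
}

my @sample = qw(
    D1:111:92111 D2:112:92111 D3:111:92111 D4:112:92111
    D5:111:90222 D6:112:90222 D7:111:90222 D8:112:90222
);
print "$_\n" for report(@sample);
```

With the sample data from the question this prints "92111 has unique values 111,112" followed by "90222 has unique values 111,112". To run it against a real file, the lines read from an opened filehandle can be passed in instead, for example `report(<$fh>)`, where `$fh` is whatever handle you opened on your input.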
Re: find unique values from a file by comparing two columns by Laurent_R (Canon) on Sep 26, 2014 at 18:43 UTC
Some good answers have already been provided, but I would just like to add: in general, when you think unique, you should very often think hash.
Re^2: find unique values from a file by comparing two columns by Anonymous Monk on Sep 26, 2014 at 23:03 UTC
Interesting :) `[perldoc://unique]` unique -> perlfaq4#How can I remove duplicate elements from a list or array?

    Use a hash. When you think the words "unique" or "duplicated", think "hash keys".
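
For a flat list, the "think hash keys" advice usually comes down to a `grep` with a `%seen` hash. A small sketch of that idiom follows; the sub name `uniq_in_order` is just an illustrative choice, not something from the FAQ.

```perl
use strict;
use warnings;

# Hypothetical helper: return the distinct elements of a list,
# keeping the order in which each value first appears.
sub uniq_in_order {
    my %seen;
    # $seen{$_}++ is false (0) the first time a value is met, true afterwards
    return grep { !$seen{$_}++ } @_;
}

my @col2 = qw(111 112 111 112 111);
print join(',', uniq_in_order(@col2)), "\n";   # prints 111,112
```

Recent versions of the core List::Util module also provide a `uniq` function that does the same job, if you would rather not write the idiom by hand.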