in reply to finding duplicate data


You could also do it in a compact command-line manner.
To find only the duplicate entries:
perl -ne 'print if $h{$_}++' filename
To find unique data:
perl -ne 'print unless $h{$_}++' filename
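Both one-liners turn on the same idiom: the post-increment $h{$_}++ returns the count seen so far, which is 0 (false) the first time a line appears and true on every repeat. Expanded, the duplicates one-liner is roughly equivalent to this sketch (the while loop is what perl's -n switch supplies; %h is just an arbitrary hash name):

my %h;                              # counts occurrences of each line
while (defined(my $line = <>)) {    # -n wraps the code in this loop
    # $h{$line}++ yields the previous count: 0 the first time a line
    # is seen (false, so nothing prints), 1 or more on every repeat.
    print $line if $h{$line}++;     # swap "if" for "unless" to get
}                                   # the unique-lines version instead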
Of course, you could simply do cat filename | uniq at the shell prompt for the second case.
HTH,
chimni

Re^2: finding duplicate data
by Roy Johnson (Monsignor) on Jan 21, 2004 at 15:39 UTC
    Useless use of cat, and a misunderstanding of uniq (it only detects duplicates on adjacent lines). Instead, use
    sort -u filename
    to find unique data (though the perl solution would be doing less work, and would not scramble the line order).
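
    A quick illustration of the order point, on a hypothetical three-line input:

    $ printf 'b\na\nb\n' | sort -u
    a
    b
    $ printf 'b\na\nb\n' | perl -ne 'print unless $h{$_}++'
    b
    a

    sort -u must sort to deduplicate, so its output comes back in sorted order; the perl filter keeps each line's first appearance in place.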

    To list the duplicate entries only once,

    perl -ne 'print if $h{$_}++ == 1' filename
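
    The == 1 test fires only on a line's second occurrence, so each duplicated line prints exactly once however many times it recurs. For instance, with a hypothetical five-line sample file:

    $ printf 'a\nb\na\nc\na\n' > dup.txt
    $ perl -ne 'print if $h{$_}++ == 1' dup.txt
    a

    Here "a" occurs three times but is reported once, on the pass where its prior count is exactly 1.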

    The PerlMonk tr/// Advocate