in reply to finding duplicate data


You could also do it in a compact command-line manner.
To find only the duplicate entries:
perl -ne 'print if $h{$_}++' filename
To find unique data:
perl -ne 'print unless $h{$_}++' filename
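Both one-liners turn on the same idiom: the post-increment $h{$_}++ returns the count seen so far, which is 0 (false) the first time a line appears and true on every repeat. Expanded, the duplicates one-liner is roughly equivalent to this sketch (the while loop is what perl's -n switch supplies; %h is just an arbitrary hash name):

my %h;                              # counts occurrences of each line
while (defined(my $line = <>)) {    # -n wraps the code in this loop
    # $h{$line}++ yields the previous count: 0 the first time a line
    # is seen (false, so nothing prints), 1 or more on every repeat.
    print $line if $h{$line}++;     # swap "if" for "unless" to get
}                                   # the unique-lines version instead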
Of course, you could simply do cat filename | uniq at the shell prompt for the second case.
HTH,
chimni

Re^2: finding duplicate data
by Roy Johnson (Monsignor) on Jan 21, 2004 at 15:39 UTC
    Useless use of cat, and a misunderstanding of uniq (it only detects duplicates on adjacent lines). Instead, use
    sort -u filename
    to find unique data (though the perl solution would be doing less work, and would not scramble the line order).
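
    A quick illustration of the order point, on a hypothetical three-line input:

    $ printf 'b\na\nb\n' | sort -u
    a
    b
    $ printf 'b\na\nb\n' | perl -ne 'print unless $h{$_}++'
    b
    a

    sort -u must sort to deduplicate, so its output comes back in sorted order; the perl filter keeps each line's first appearance in place.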

    To list the duplicate entries only once,

    perl -ne 'print if $h{$_}++ == 1' filename
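
    The == 1 test fires only on a line's second occurrence, so each duplicated line prints exactly once however many times it recurs. For instance, with a hypothetical five-line sample file:

    $ printf 'a\nb\na\nc\na\n' > dup.txt
    $ perl -ne 'print if $h{$_}++ == 1' dup.txt
    a

    Here "a" occurs three times but is reported once, on the pass where its prior count is exactly 1.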

    The PerlMonk tr/// Advocate