in reply to Find duplicate lines from the file and write it into new file.

#!/usr/bin/perl
use strict;
use warnings;

my %duplicates;    # line => number of occurrences

while (<>) {
    chomp;
    $duplicates{$_}++;
}

# Print each line that appeared more than once.
foreach my $key (keys %duplicates) {
    print "$key\n" if $duplicates{$key} > 1;
}
Expects to be run as: script < inputfile > outputfile
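If you want the script itself to write the duplicates into a new file instead of relying on shell redirection, a minimal sketch might look like the following (input.txt and duplicates.txt are placeholder names -- substitute your own):

#!/usr/bin/perl
use strict;
use warnings;

# Placeholder filenames.
open my $in,  '<', 'input.txt'      or die "Can't open input.txt: $!";
open my $out, '>', 'duplicates.txt' or die "Can't open duplicates.txt: $!";

my %count;
while (<$in>) {
    chomp;
    $count{$_}++;
}

# Write each line that occurred more than once to the output file.
print {$out} "$_\n" for grep { $count{$_} > 1 } keys %count;

close $out;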

Re^2: Find duplicate lines from the file and write it into new file.
by roboticus (Chancellor) on Jan 04, 2007 at 13:56 UTC
    Slightly more efficient....

    #!/usr/bin/perl
    use strict;
    use warnings;

    my %duplicates;
    while (<DATA>) {
        print if !defined $duplicates{$_};
        $duplicates{$_}++;
    }

    __DATA__
    The quick red fox jumped over the lazy brown dog.
    Time flies like an arrow, fruit flies like a banana.
    Time flies like an arrow, fruit flies like a banana.
    Now is the time for all good men to come to the aid of their party.
    The quick red fox jumped over the lazy brown dog.
    The quick red fox jumped over the lazy brown dog.
    Time flies like an arrow, fruit flies like a banana.
    Now is the time for all good men to come to the aid of their party.
    Now is the time for all good men to come to the aid of their party.

    --roboticus

      I used the script to find duplicate rows in my log file and it worked well. Thanks! However, it would be great if this script could be updated to display how many times the rows have been repeated as well. Please help!
        Pretty easy: the counts are already stored in the hash. Just remove the print and add the following at the end of the script:
        print "$duplicates{$_}\t$_" for grep $duplicates{$_} > 1,keys %duplica +tes;