in reply to Find duplicate lines from the file and write it into new file.

#!/usr/bin/perl
use strict;
use warnings;

my %duplicates;    # line => number of occurrences

while (<>) {
    chomp;
    $duplicates{$_}++;
}

# Print each line that appeared more than once.
foreach my $key (keys %duplicates) {
    print "$key\n" if $duplicates{$key} > 1;
}
Expects to be run as: script < inputfile > outputfile
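If you want the script itself to write the duplicates into a new file instead of relying on shell redirection, a minimal sketch might look like the following (input.txt and duplicates.txt are placeholder names -- substitute your own):

#!/usr/bin/perl
use strict;
use warnings;

# Placeholder filenames.
open my $in,  '<', 'input.txt'      or die "Can't open input.txt: $!";
open my $out, '>', 'duplicates.txt' or die "Can't open duplicates.txt: $!";

my %count;
while (<$in>) {
    chomp;
    $count{$_}++;
}

# Write each line that occurred more than once to the output file.
print {$out} "$_\n" for grep { $count{$_} > 1 } keys %count;

close $out;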

Re^2: Find duplicate lines from the file and write it into new file.
by roboticus (Chancellor) on Jan 04, 2007 at 13:56 UTC
    Slightly more efficient....

    #!/usr/bin/perl
    use strict;
    use warnings;

    my %duplicates;
    while (<DATA>) {
        print if !defined $duplicates{$_};
        $duplicates{$_}++;
    }

    __DATA__
    The quick red fox jumped over the lazy brown dog.
    Time flies like an arrow, fruit flies like a banana.
    Time flies like an arrow, fruit flies like a banana.
    Now is the time for all good men to come to the aid of their party.
    The quick red fox jumped over the lazy brown dog.
    The quick red fox jumped over the lazy brown dog.
    Time flies like an arrow, fruit flies like a banana.
    Now is the time for all good men to come to the aid of their party.
    Now is the time for all good men to come to the aid of their party.

    --roboticus

      I used the script to find duplicate rows in my log file and it worked well. Thanks! However, it would be great if this script could be updated to display how many times the rows have been repeated as well. Please help!
        Pretty easy: the counts are already stored in the hash. Just remove the print and add the following at the end of the script:
        print "$duplicates{$_}\t$_" for grep $duplicates{$_} > 1,keys %duplica +tes;