Re: Getting rid of duplicates

The following one-liner does what you want, with two advantages: it handles the multiple ways a number can be written, and it preserves the order of the input lines. I expect both matter to you, because otherwise you could have just used sort -nu to solve your problem:

perl -lne 'print unless $counts{0+$_}++' input.txt > output.txt

We use the -lne command-line switches to cause Perl to read each line of input, strip off the line break (-l also puts it back when we print), and then execute the following code on the result:

    print unless $counts{0+$_}++

The code prints the current line only if we have not seen its value before. We use the hash %counts to keep track of how many times each value has appeared; the post-increment returns the old count, which is zero (false) the first time around, so the line gets printed. Note the 0+ inside the hash subscript. It ensures that the input lines are interpreted as numbers so that, for example, "1" and "1.0" are considered to be the same for the sake of duplicate removal.
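If you would rather have this in a script than on the command line, here is a minimal sketch of the same idea. The function name dedupe_numeric and the sample data are made up for illustration; only the $counts{0+$_}++ test comes from the one-liner above.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the one-liner): returns its arguments
# with numeric duplicates removed, keeping the first occurrence of each
# value and preserving the original order.
sub dedupe_numeric {
    my %counts;
    # 0 + $_ forces numeric interpretation, so "1" and "1.0" share a key.
    # The post-increment yields the old count: 0 (false) on first sight.
    return grep { !$counts{ 0 + $_ }++ } @_;
}

# Sample data, invented for this example.
my @input  = ("1", "2", "1.0", "3", "2e0", "03");
my @output = dedupe_numeric(@input);

print "$_\n" for @output;    # prints 1, 2 and 3, one per line
```

Note that the surviving lines keep their original spelling: if "1.0" had come before "1" in the input, "1.0" is what would be printed.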

Cheers,
Tom

Tom Moertel : Blog / Talks / CPAN / LectroTest / PXSL / Coffee / Movie Rating Decoder
