Remove duplicate lines from a file while preserving order and leaving the last instance of the duplicate in place.
perl -ne 'push @a, $_; $h{$_}++; END{print grep {not --$h{$_}} @a}' file

For the following file:

cabbage
apple
banana
grape
pear
banana
carrot
apple
apple
banana
grape
pear
banana
apple

The output would be:

cabbage
carrot
grape
pear
banana
apple

So, for example, the last instance of apple is printed instead of the first. For comparison, try the following, which leaves the first instance in place:

perl -ne 'print unless $h{$_}++' file

If preserving order isn't important then you could use other system commands (if supported):

sort -u file
sort file | uniq
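
As a quick illustration of why these only suit the case where order doesn't matter, the sketch below runs both on a tiny made-up file (unsorted.txt is a hypothetical name). Both forms sort first and then drop adjacent duplicates, so the original order is lost.

```shell
# unsorted.txt is a hypothetical sample file with duplicates in non-sorted order.
printf '%s\n' pear apple banana apple pear > unsorted.txt

# Both forms sort the lines and then drop the duplicates.
with_flag=$(sort -u unsorted.txt)
with_uniq=$(sort unsorted.txt | uniq)

# Print each result on one line; the original pear/apple/banana order is gone.
echo "sort -u:     $(printf '%s\n' "$with_flag" | paste -sd ' ' -)"
echo "sort | uniq: $(printf '%s\n' "$with_uniq" | paste -sd ' ' -)"
```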

Note: this snippet potentially stores 2 copies of each line in memory (one in the array @a and one as a key of %h), so it is inefficient for large files.

Re: Remove duplicate lines maintaining last-in order (alt)
by tye (Sage) on Sep 23, 2003 at 16:06 UTC

    Cool. Here is a somewhat more memory-efficient alternative:

    perl -ne '$a[$h{$_}]= ""; push @a, $_; $h{$_}= $#a; END{print for @a}' file

    Or, if you want to cut even that memory foot-print (or does it?) roughly in half (at the cost of more CPU):

    perl -ne '$h{$_}= $.; END{print for sort {$h{$a} <=> $h{$b}} keys %h}' test.txt

                    - tye