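Another way to remove duplicates is to use the command-line sort. It does not need the whole file to be memory resident (it spills to temporary files as needed), so it can sort a HUGE file. Once the file is sorted, identical lines sit next to each other, so you cycle through the sorted output and skip any line that matches the immediately preceding line. That second step is what uniq does. Note that the original line order is lost.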
knoppix@Microknoppix:~$ cat rubbish
cat
fish
dog
apple
cat
bird
knoppix@Microknoppix:~$ sort rubbish | uniq
apple
bird
cat
dog
fish
knoppix@Microknoppix:~$
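
If you want to see the "skip a line when it matches the previous one" step spelled out rather than handed to uniq, here is a rough sketch in plain shell. The function name dedupe_sorted is made up for illustration, and the file rubbish is recreated with the same contents as in the transcript above. A shell read loop like this is slow on really big files, so treat it as an explanation of the idea rather than a replacement for uniq.

```shell
# Recreate the sample file from the transcript (assumed contents).
printf '%s\n' cat fish dog apple cat bird > rubbish

# dedupe_sorted is a hypothetical helper name, not a standard command.
# It sorts the file, then prints a line only if it differs from the
# line immediately before it.
dedupe_sorted() {
    sort "$1" | {
        first=1
        prev=
        while IFS= read -r line; do
            if [ "$first" -eq 1 ] || [ "$line" != "$prev" ]; then
                printf '%s\n' "$line"
            fi
            first=0
            prev=$line
        done
    }
}

dedupe_sorted rubbish
```

On the sample file this prints the same five lines as sort rubbish | uniq. If all you need is the result, sort -u rubbish does both steps in one command.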