in reply to extracting duplicates from a list

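Here is code that parses your data format and flags the sequences that occur only once in your data. The %count hash can also be used to process the duplicates. Note that %ann keeps only the last annotation seen for each sequence, which is all that is needed for the unique ones.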

#!/usr/bin/perl -w
use strict;

# %count: sequence => number of times seen
# %ann:   sequence => annotation from the most recent record with that sequence
my (%count, %ann);

while (my $line = <DATA>) {
    chomp $line;
    # Each record is one line: ">atc:<annotation> <sequence>"
    if ($line =~ /^>atc:(\S+)\s+(\w+)$/) {
        $count{$2}++;
        $ann{$2} = $1;
    }
}

# Report only the sequences that were seen exactly once
foreach my $seq (keys %count) {
    print "Unique seq $seq has ann $ann{$seq}\n" if $count{$seq} == 1;
}

__DATA__
>atc:AGR_pTi_39_1-45_FD cctttcaagtcatagaacaccggggcatgtacaacttggggaagg
>atc:AGR_pTi_47_1-45_FD ccttacaggtcattgagcacagaggaatgttcaatttagggaaac
>atc:AGR_pTi_39_1-45_F cctttcaagtcatagaacaccggggcatgtacaacttggggaagg
>atc:AGR_pTi_47_1-45_F ccttacaggtcattgagcacagaggaatgttcaatttagggaaac
>atc:AGR_pTi_39_1-45_RD cctttcaagtcatagaacaccggggcatgtacaacttggggaagg
>atc:AGR_pTi_47_1-45_RD ccttacaggtcattgagcacagaggaatgttcaatttagggaaac
>atc:AGR_pTi_39_1-45_R cctttcaagtcatagaacaccggggcatgtacaacttggggaagg
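
If you want to go after the duplicates instead, one possible approach is sketched below. It is only a sketch, and it makes a change to the script above: instead of %count it keeps a hash of arrays, so that every annotation for a sequence is retained and the count is simply the length of the array. The helper name group_by_sequence and the @sample array are my own illustrative names, and I am assuming the same one-record-per-line format; in a real script you would feed it <DATA> or a filehandle instead.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of the script above): collect every
# annotation seen for each sequence, in input order.
sub group_by_sequence {
    my @lines = @_;
    my %anns;    # sequence => [ annotation, annotation, ... ]
    for my $line (@lines) {
        chomp $line;
        # Same record pattern as before: ">atc:<annotation> <sequence>"
        if ($line =~ /^>atc:(\S+)\s+(\w+)$/) {
            push @{ $anns{$2} }, $1;
        }
    }
    return \%anns;
}

# Small sample taken from the data in this thread (assumed one record per line).
my @sample = (
    ">atc:AGR_pTi_39_1-45_FD cctttcaagtcatagaacaccggggcatgtacaacttggggaagg\n",
    ">atc:AGR_pTi_47_1-45_FD ccttacaggtcattgagcacagaggaatgttcaatttagggaaac\n",
    ">atc:AGR_pTi_39_1-45_F cctttcaagtcatagaacaccggggcatgtacaacttggggaagg\n",
);

my $anns = group_by_sequence(@sample);

foreach my $seq (sort keys %$anns) {
    my @names = @{ $anns->{$seq} };
    next if @names == 1;    # skip sequences that occur only once
    printf "Duplicate seq %s seen %d times: %s\n",
        $seq, scalar @names, join(', ', @names);
}
```

With that sample, only the 39_1-45 sequence is reported, with its FD and F annotations. The cost compared to %count is memory, since every annotation is kept rather than just a counter.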

-Mark