in thread How to match duplicate lines in a text file and extract only one of those lines to a new file

Hiya, thank you so much for your help. I tried running your code just to see how it works. Looking at the output, I noticed that the first column doesn't seem to get transformed, and some duplicates seem to have been missed.

Like so:

1 Brahui A C/C A/A A/G T/T

100 Hazara A C G A T C C

100 Hazara G C A A T C T

102 Hazara A C/C G/G A/G

Re^3: How to match duplicate lines in a text file and extract only one of those lines to a new file
by aaron_baugher (Curate) on Apr 04, 2012 at 14:37 UTC

    In your original sample data, every line began with two integers and then a text string. Now you seem to be running it on lines that begin with a single integer and a text string, so his code is picking up the first allele as part of the duplicated section.
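To illustrate the effect, here is a minimal sketch in Python (not the Perl code under discussion, which is not shown here). It assumes the duplicate check keys on a fixed number of leading whitespace-separated fields; the function name `dedupe_by_prefix` and the `key_fields` parameter are hypothetical, chosen only for this example.

```python
def dedupe_by_prefix(lines, key_fields):
    """Keep the first line seen for each distinct prefix of `key_fields` columns."""
    seen = set()
    kept = []
    for line in lines:
        fields = line.split()
        # The key is the first `key_fields` columns (assumed behaviour,
        # not taken from the original code).
        key = tuple(fields[:key_fields])
        if key not in seen:
            seen.add(key)
            kept.append(line)
    return kept


# Two lines from the output above: one integer, one name, then alleles.
sample = [
    "100 Hazara A C G A T C C",
    "100 Hazara G C A A T C T",
]

# With a three-field key the first allele ("A" vs "G") becomes part of the key,
# so the two lines look different and both are kept.
three_field_result = dedupe_by_prefix(sample, 3)

# With a two-field key (integer + name) the second line counts as a duplicate.
two_field_result = dedupe_by_prefix(sample, 2)
```

With the three-field key both Hazara lines survive, which matches the "missed" duplicates in the output; with the two-field key only the first one is kept.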

    Aaron B.
    My Woefully Neglected Blog, where I occasionally mention Perl.

      Oh yes, of course! Thank you for pointing out such an obvious mistake!