in reply to How to match duplicate lines in a text file and extract only one of those lines to a new file
This does the job:
use strict;
use warnings;

my %data;
while (<DATA>) {
    chomp;
    # split ' ' splits on any run of whitespace and ignores leading whitespace
    my ($firstnum, $secondnum, $thingy, @bits) = split ' ';
    # Join the three identifying fields with NUL so the key can be split apart again
    my $key = sprintf("%s\x00%s\x00%s", $firstnum, $secondnum, $thingy);
    for my $i (0 .. $#bits) {
        # One array per column, collecting that column's value from every matching line
        $data{$key}[$i] = [] unless exists $data{$key}[$i];
        push @{ $data{$key}[$i] }, $bits[$i];
    }
}

foreach my $key (sort keys %data) {
    print join q[ ], split "\x00", $key;
    print q[ ];
    print join q[ ], map { join '/', @$_ } @{ $data{$key} };
    print "\n";
}

__DATA__
1 51 Brahui A C A A T
1 51 Brahui A C A G T
3 51 Brahui A C A G C
3 51 Brahui A C G A T
5 51 Brahui A C G A T
5 51 Brahui A C G G C
7 51 Brahui A C G A T
7 51 Brahui A C G G T
9 51 Brahui A C G G T
9 51 Brahui A C G G T
But don't just copy that as-is. Try to understand how it works. What you want to look at is:
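To see the data structure on its own, here is a minimal sketch of just the grouping step: a hash keyed on the first three fields, holding one array per remaining column. The `group_lines` helper and the `%groups` name are made up for this illustration and are not part of the code above. It also relies on autovivification instead of the explicit `= [] unless exists` check, which is my assumption about what you would want, not a requirement.

```perl
use strict;
use warnings;

# Hypothetical helper (illustration only): group whitespace-separated
# records by their first three fields and collect the remaining columns.
sub group_lines {
    my @lines = @_;
    my %groups;
    for my $line (@lines) {
        my ($first, $second, $name, @bits) = split ' ', $line;
        # NUL-joined composite key, same idea as the sprintf in the full script
        my $key = join "\x00", $first, $second, $name;
        # $groups{$key}[$i] is created on the first push (autovivification)
        push @{ $groups{$key}[$_] }, $bits[$_] for 0 .. $#bits;
    }
    return \%groups;
}

my $groups = group_lines('1 51 Brahui A C A A T', '1 51 Brahui A C A G T');
for my $key (sort keys %$groups) {
    # Each column becomes "value/value" across the lines that shared the key
    my @columns = map { join '/', @$_ } @{ $groups->{$key} };
    print join(' ', split(/\x00/, $key), @columns), "\n";
}
```

With those two input lines this prints `1 51 Brahui A/A C/C A/A A/G T/T`. Dumping `$groups` with Data::Dumper is a quick way to see the nesting for yourself.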