in reply to How to extract lines starting with new names/words

You can keep these first words in a hash and check if they have already been stored:
#!/usr/bin/perl
use strict;
use warnings;

my %seen_words;
while (<DATA>){
    if (!m/^(\S+)/){
        die "Invalid line: $_";
    }
    my $first_word = $1;
    if (!$seen_words{$first_word}){
        print;
        $seen_words{$first_word} = 1;
    }
}

__DATA__
MA01001A1A03.f1 760 5640111 ad1
MA01001A1A03.f1 760 42572233 ubq
MA01001A1A04.f1 300 15232924 ubq
MA01001A1A04.f1 300 145334669 DNA
MA01001A1B22.f1 580 77745475 ra
MA01001A1B22.f1 580 30409730 ra

This can be written a little more compactly:

while (<DATA>){
    if (!m/^(\S+)/){
        die "Invalid line: $_";
    }
    print unless $seen_words{$1}++;
}

But the first one is easier to read for the beginner ;-)
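
For anyone wondering why the compact form works: post-increment returns the value from before the increment, so `$seen_words{$1}++` is 0 (false) the first time a word turns up and 1 or more (true) on every later occurrence. Here is a minimal sketch of the same idea wrapped in a sub so it can be tried on any list of lines. The name `first_occurrences` and the `@sample` array are only illustrations, not part of the code above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (illustration only): returns the lines whose first
# whitespace-delimited word has not appeared on an earlier line.
sub first_occurrences {
    my @lines = @_;
    my %seen_words;
    my @kept;
    for my $line (@lines) {
        $line =~ m/^(\S+)/ or die "Invalid line: $line";
        # Post-increment yields the old count: 0 on the first sighting,
        # so the line is kept; 1 or more afterwards, so it is skipped.
        push @kept, $line unless $seen_words{$1}++;
    }
    return @kept;
}

# Sample lines taken from the __DATA__ section above.
my @sample = (
    "MA01001A1A03.f1 760 5640111 ad1\n",
    "MA01001A1A03.f1 760 42572233 ubq\n",
    "MA01001A1A04.f1 300 15232924 ubq\n",
    "MA01001A1A04.f1 300 145334669 DNA\n",
);

# Prints the first and third sample lines.
print first_occurrences(@sample);
```

As a side effect, the hash ends up holding a count of how often each first word occurred, which can be handy if you later want to report the duplicates.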

Re^2: How to extract lines starting with new names/words
by sm2004 (Acolyte) on Mar 13, 2008 at 23:28 UTC
    Thanks so much! That was perfect. Exactly what I wanted it to do. I spent several days trying to do this. Just learned Perl two weeks ago. Thanks again.