neo1491 has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks (apologies for the long title),

My quest for perl enlightenment has once again brought me to the Monastery seeking wisdom.
I'm searching for a way to prevent pushing duplicate information into an array... here's an example:

I'm reading a book looking for three words on the same line... say "perl", "monks", and "rule".
I come across the line "I think Perl Monks Rule". So I use the regular expression below to push that entire line into an array for later use:

my @Book = ();
my @MatchFound = ();
foreach (@Book) {
    if ( $_ =~ /\bPerl\b/i && /\bMonks\b/i && /\bRule\b/i ) {
        push @MatchFound, $_;
    }
}

On the next page of the book, I run across the same line "I think Perl Monks rule",
but I don't want to push it into the array @MatchFound because it's already there.

How do I scan the array @MatchFound prior to pushing information to it, to determine if that information is already there?

Or, would it be better to wait until I finish the book, then delete all duplicate lines from @MatchFound (which I have no idea how to do either)?

I know there has to be a term for this type of search... something really abstract like unique array identifier or something. Any ideas?

Re: Preventing duplicate info from being pushed into an array
by FunkyMonk (Bishop) on Apr 01, 2008 at 19:07 UTC
    You could use something like
    my @Book = ();
    my @MatchFound = ();
    my %seen;
    foreach (@Book) {
        if ( /\bPerl\b/i && /\bMonks\b/i && /\bRule\b/i ) {
            push @MatchFound, $_ unless $seen{$_}++;
        }
    }

    Most often, when you're after a unique list in Perl, you want to use a hash.
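
    The same %seen idiom also covers the other option raised in the question: collecting everything first and removing duplicates afterwards. Below is a minimal sketch of that, not a definitive implementation. The helper name unique_lines is hypothetical, and comparing lines case-insensitively with lc is an assumption, made because the two example lines in the question differ only in the capitalization of "Rule".

```perl
use strict;
use warnings;

# Hypothetical helper: return the given lines with duplicates removed,
# keeping the first occurrence of each and preserving the original order.
# Assumption: lines are compared case-insensitively (via lc), so that
# "...Rule" and "...rule" count as the same line.
sub unique_lines {
    my @lines = @_;
    my %seen;
    return grep { !$seen{ lc($_) }++ } @lines;
}

# Sample data, made up for illustration.
my @MatchFound = (
    "I think Perl Monks Rule",
    "Perl monks rule the world",
    "I think Perl Monks rule",
);

my @unique = unique_lines(@MatchFound);
print "$_\n" for @unique;
```

    To treat lines that differ in case as distinct, drop the lc() call and key the hash on $_ directly, as in the loop above.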

Re: Preventing duplicate info from being pushed into an array
by apl (Monsignor) on Apr 01, 2008 at 19:39 UTC
    To slightly simplify what FunkyMonk said, if you're not concerned about preserving the order the matches were found:

    my @Book = ();
    my %MatchFound;
    foreach (@Book) {
        $MatchFound{$_}++ if ( /\bPerl\b/i && /\bMonks\b/i && /\bRule\b/i );
    }

    or even

    my @Book = ();
    my %MatchFound = map { $_ => 1 }
                     grep { /\bPerl\b/i && /\bMonks\b/i && /\bRule\b/i } @Book;

    Both, alas, are untested...
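
    For completeness, here is a small runnable sketch of the hash approach, showing how the matched lines come back out with keys. The contents of @Book are made up for illustration. Because keys returns the lines in no particular order, the sketch sorts them to get stable output. With the ++ form, the hash values also record how many times each line was seen.

```perl
use strict;
use warnings;

# Sample data standing in for the lines of the book (made up for illustration).
my @Book = (
    "I think Perl Monks Rule",
    "Nothing to see here",
    "As a rule, Perl monks are helpful",
    "I think Perl Monks Rule",
);

# Each matching line becomes a hash key; the value counts its occurrences.
my %MatchFound;
foreach (@Book) {
    $MatchFound{$_}++ if /\bPerl\b/i && /\bMonks\b/i && /\bRule\b/i;
}

# keys gives each distinct line once; sort only to make the order predictable.
foreach my $line ( sort keys %MatchFound ) {
    print "$MatchFound{$line} x $line\n";
}
```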

Re: Preventing duplicate info from being pushed into an array
by wade (Pilgrim) on Apr 01, 2008 at 20:24 UTC
    Isn't this why God invented hashes? Instead of:
    push @MatchFound, $_;
    You can:
    $MatchFound{$_} = 1;
    You know, with the appropriate declaration and algorithm changes...
    --
    Wade