
Well no, I'm not sure that I got it. What is clear is that you did not satisfy GrandFather's request in the first reply, as I hoped you would.

So let me make another guess at what you really want. How about this:

my @arr = ();
while (<PIR>) {
    chomp;
    if ( /^ENTRY/ ) {
        $entry = $_;
    }
    elsif ( /^(TITLE)\s+(\S.*)/ ) {
        $title = "$1\n\t $2";
    }
    elsif ( /^(ORGANISM)\s+(\S.*)/ ) {
        $org = "$1\n\t $2";
    }
    elsif ( /^ACCESSIONS/ ) {
        $acc = $_;
    }
    else {
        push @arr, $_;
    }
}
print "@arr\n";

Now, I would assume there should be more code than that, if you really need to do things with $acc, $entry, $org and $title. If you really just want to output an array with those long strings as the elements of the array, the code could be a lot simpler.

If there's a chance that one of those long strings might appear more than once in the data file, use those long strings as hash keys instead of array values:

# simplified version: ignore header stuff:
my %hash;
while (<PIR>) {
    chomp;
    $hash{$_} = undef unless /^(?:ENTRY|TITLE|ORGANISM|ACCESSIONS)\s/;
}
# parenthesize join's arguments so the newline is not joined in as a list element
print join( " ", keys %hash ), "\n";

Using a hash like that might be a good idea for other reasons: maybe you would want the header values to be associated with each long string. (Hint: some people refer to hashes as "associative arrays".) If so, assign the header strings as the hash value.
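
To make that hint concrete, here is a minimal sketch of one way it could look. It is only an illustration, not necessarily what you need: the helper name collect_sequences, the decision to keep the headers from the first occurrence of a repeated string, and the sample data are all assumptions of mine. It reads from an in-memory filehandle so that it runs on its own; in your script you would pass it the handle for your data file instead.

```perl
use strict;
use warnings;

# Hypothetical helper: reads PIR-style lines from a filehandle and returns a
# hash reference that maps each non-header line to the header values that
# were in effect when that line was first seen.
sub collect_sequences {
    my ($fh) = @_;
    my %header;    # most recently seen ENTRY/TITLE/ORGANISM/ACCESSIONS values
    my %seen;      # long string => copy of the headers seen before it
    while ( my $line = <$fh> ) {
        chomp $line;
        if ( $line =~ /^(ENTRY|TITLE|ORGANISM|ACCESSIONS)\s+(\S.*)/ ) {
            $header{$1} = $2;
        }
        elsif ( $line =~ /\S/ ) {
            # Hash keys are unique, so duplicates collapse automatically.
            # Assumption: the first occurrence keeps its headers.
            $seen{$line} = {%header} unless exists $seen{$line};
        }
    }
    return \%seen;
}

# Made-up sample data, for illustration only.
my $sample = <<'END';
ENTRY      ABC123
TITLE      sample protein
ORGANISM   Homo sapiens
ACCESSIONS A00001
MKVLAAGIV
MKVLAAGIV
END

open my $fh, '<', \$sample or die "Cannot open in-memory file: $!";
my $seqs = collect_sequences($fh);
close $fh;

for my $seq ( sort keys %$seqs ) {
    print "$seq => $seqs->{$seq}{ENTRY}\n";
}
```

Storing a copy of %header (the {%header} part) matters: if you stored a reference to %header itself, every long string would end up pointing at the same, last-seen set of headers.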

Re^4: how to remove duplicate strings?
by heidi (Sexton) on Oct 30, 2006 at 09:08 UTC
    Hey graff, thank you, you got my problem right. I tried writing the code the way you said, and I got the answer. The problem I am facing now is that I have to save each element of that array into a new array and split it into characters. To make it clear, the program now looks like this:
    open (PIR,'/home/guest/sampir.txt');
    my @arr = ();
    while (<PIR>) {
        chomp;
        if ( /^ENTRY/ ) {
            $entry = $_;
        }
        elsif ( /^(TITLE)\s+(\S.*)/ ) {
            $title = "$1\n\t $2";
        }
        elsif ( /^(ORGANISM)\s+(\S.*)/ ) {
            $org = "$1\n\t $2";
        }
        elsif ( /^ACCESSIONS/ ) {
            $acc = $_;
        }
        else {
            push @se, $_;
        }
    }
    And I tried splitting it up like this:
    foreach $r (@se) {
        @y = split(//, $r);
    }
    But I am not getting the answer. How should I go about it?