sanku has asked for the wisdom of the Perl Monks concerning the following question:

Hi monks, can anyone help me? If I use the commented-out @array1 list, the code works fine, but if I read the values from samplefile.txt instead, it does not. samplefile.txt contains values like this:
N01A0000.f BG1_c22
N01A000X.f BG1_c5
N01A000X.r BG1_c5
N01A002B.f BG1_c38
N01A002B.r BG1_c38
N01A0082.r BG1_c12
N01A00AS.f BG1_c52
N01A00B9.f BG1_c45
N01A00B9.r BG1_c45
N01A00DK.f BG1_c5
N01A00F0.f BG1_c22
N01A00F0.r BG1_c22
N01A00F3.f BG1_c14
N01A00FX.f BG1_c7
N01A00FX.r BG1_c7
From that file I have to completely remove the duplicated elements.
open(FILE,"/Downloads/samplefile.txt") or die $!;
while(<FILE>){
    $_=~s/\s+//g;
    push(@array1,$_);
}
close(FILE);
#@array1= qw(hi bye hi see u later);
my %s = ();
$s{$_}++ for @array1;
for my $value ( keys %s ) {
    print "\n".$s{$value}."\n";
    if ($s{$value} > 1){
        $v=$value;
    }
    if($value eq $v){
        delete $s{$value};
    }
    else{
        print "$value";
    }
}
Thanks in advance

Replies are listed 'Best First'.
Re: doubt in perl
by graff (Chancellor) on Dec 08, 2008 at 04:52 UTC
    What does "not working fine" mean? (For that matter, what does "working fine" mean?)

    When you do this part:

    while(<FILE>){
        $_=~s/\s+//g;
        push(@array1,$_);
    }
    my %s = ();
    $s{$_}++ for @array1;
    Is it really your intention to create a hash whose keys look like this:
    N01A0000.fBG1_c22 N01A000X.fBG1_c5 N01A000X.rBG1_c5 ...
    The list you showed us has 15 lines. When the lines are converted to hash keys using your method, they are all unique. What were you expecting to get as a result?

    Update: By the way, why do you use an array and a hash? You could have just done this:

    my %s;
    while(<FILE>) {
        s/\s+//g;
        $s{$_}++;
    }
    (Of course, for the data you've shown, that will still end up with all lines being unique. So, what is it about the data that makes two or more lines duplicates?)
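    To see this concretely, here is a minimal sketch that runs the same whitespace-stripping and counting over the 15 sample lines from the question. The helper name count_keys is hypothetical (it is not in the original post), and the data is embedded in a heredoc so that the script runs without samplefile.txt.

```perl
use strict;
use warnings;

# Sample data copied from the question, embedded so no file is needed.
my @lines = split /\n/, <<'END';
N01A0000.f BG1_c22
N01A000X.f BG1_c5
N01A000X.r BG1_c5
N01A002B.f BG1_c38
N01A002B.r BG1_c38
N01A0082.r BG1_c12
N01A00AS.f BG1_c52
N01A00B9.f BG1_c45
N01A00B9.r BG1_c45
N01A00DK.f BG1_c5
N01A00F0.f BG1_c22
N01A00F0.r BG1_c22
N01A00F3.f BG1_c14
N01A00FX.f BG1_c7
N01A00FX.r BG1_c7
END

# count_keys (hypothetical helper): strip all whitespace from each line,
# as the original loop does, and count how often each resulting key occurs.
sub count_keys {
    my @input = @_;
    my %seen;
    for my $line (@input) {
        (my $key = $line) =~ s/\s+//g;   # copy, then strip whitespace from the copy
        $seen{$key}++;
    }
    return \%seen;
}

my $seen = count_keys(@lines);
printf "%d lines, %d distinct keys\n", scalar @lines, scalar keys %$seen;
```

    For this data the two numbers come out equal, which is the point made above: with whole lines as keys, nothing counts as a duplicate.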
Re: doubt in perl
by moritz (Cardinal) on Dec 08, 2008 at 06:38 UTC
Re: doubt in perl
by gone2015 (Deacon) on Dec 08, 2008 at 09:09 UTC

    Having read your file into @array1, your code proceeds:

     1: my %s = ();
     2: $s{$_}++ for @array1;
     3: for my $value ( keys %s ) {
     4:   print "\n".$s{$value}."\n";
     5:   if ($s{$value} > 1){
     6:     $v=$value;
     7:   }
     8:   if ($value eq $v){
     9:     delete $s{$value};
    10:   }
    11:   else {
    12:     print "$value";
    13:   }
    14: }
    so after line 2, you have a hash whose keys are the input lines and whose values are the count of times each appeared. So far so good.

    You then loop through the keys, and presumably want to print out the lines with count == 1. I'm not sure why you chose this particular method for identifying those lines, but I can see that when the count == 1 then $value ne $v, because $v is only set (lines 5-7) when the count > 1; otherwise it is either undefined or the last $value which had a count > 1 (and that definitely won't be the same as the current $value). However, I'd be happier if $v had been initialised.

    The delete on line 9 appears redundant. What's worse, you are removing entries from the hash in the middle of the loop which is working its way through the hash; I cannot tell you whether that works or not. I suggest that lines 5..13 can be simplified!
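    One possible simplification, as a sketch only: test the count directly and drop both $v and the delete. The helper name unique_only is hypothetical, and the sort is an addition of mine to make the output order repeatable; the test list is the commented-out one from the original post.

```perl
use strict;
use warnings;

# unique_only (hypothetical helper): return the elements that occur
# exactly once in the input list, sorted for repeatable output.
sub unique_only {
    my @items = @_;
    my %count;
    $count{$_}++ for @items;

    # Keep only keys seen once; no temporary $v and no delete needed.
    my @once = grep { $count{$_} == 1 } keys %count;
    return sort @once;
}

print "$_\n" for unique_only(qw(hi bye hi see u later));
```

    With that list, "hi" occurs twice and is dropped entirely, leaving bye, later, see and u.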

    I wonder if you are disappointed about the order in which the unique lines are printed out? If so, I suggest a read of the documentation for keys, which may clear up the mystery. But do not despair: consider what you have in @array1...
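    A sketch of that hint, assuming the goal is to print the once-only entries in the order they were read: count with the hash, but walk the array rather than the hash keys. The helper name unique_in_order is hypothetical.

```perl
use strict;
use warnings;

# unique_in_order (hypothetical helper): return the elements that occur
# exactly once, in the order in which they appear in the input.
sub unique_in_order {
    my @items = @_;
    my %count;
    $count{$_}++ for @items;

    # Filtering the original list (not keys %count) keeps the input order.
    return grep { $count{$_} == 1 } @items;
}

print "$_\n" for unique_in_order(qw(hi bye hi see u later));
```

    Here the hash is used only for counting, so its internal key order never affects what is printed.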

    For completeness: make friends with use strict and use warnings -- you won't regret it.
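    As an illustration, here is a sketch of the reading loop written so that it passes under both pragmas: every variable is declared with my, and the filehandle is lexical with a three-argument open. The helper name read_keys is hypothetical, skipping empty lines is my own addition, and the in-memory string stands in for samplefile.txt so the sketch runs on its own.

```perl
use strict;
use warnings;

# read_keys (hypothetical helper): read lines from an open filehandle,
# strip all whitespace as the original code does, and return the results.
sub read_keys {
    my ($fh) = @_;
    my @keys;
    while (my $line = <$fh>) {
        $line =~ s/\s+//g;                    # also removes the trailing newline
        push @keys, $line if length $line;    # assumption: ignore blank lines
    }
    return @keys;
}

# In-memory stand-in for the real file; two lines taken from the question.
my $data = "N01A0000.f BG1_c22\nN01A000X.f BG1_c5\n";
open(my $fh, '<', \$data) or die "Cannot open in-memory data: $!";
my @array1 = read_keys($fh);
close($fh);

print "$_\n" for @array1;
```

    Under use strict the original script would refuse to compile until @array1 and $v were declared, which is exactly the kind of early warning these pragmas are for.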