in reply to Re: Re: finding duplicate data
in thread finding duplicate data

Which part has you puzzled? grep? for () implicitly setting $_? (If the former, you should be able to run "perldoc -f grep" to get a description of what grep does. If for some reason you have a broken perl that doesn't include perldoc, try here.)

Re: Re: Re: Re: finding duplicate data
by harry34 (Sexton) on Jan 21, 2004 at 11:55 UTC
    What if I have an array called (@hydrogen_split).
    Could I not write your code like this?

    foreach $i(0..@hydrogen_split){
        chomp;
        $hydrogen_split{$key}++;
    }
    for ( sort grep { $hydrogen_split{$key} !=1 } keys %hydrogen_split){
        print "$key\n";
    }
      I didn't post any code. I think the code to which you refer was by borisz. Anyway, you have it a little wrong:
      # loop over the array values, not indexes, making $key
      # refer to each one in turn
      foreach my $key (@hydrogen_split) {
          # chomp is probably not needed if your data is already
          # in an array. chomp removes the end-of-line character
          # from a line of input

          # use the %hydrogen_split hash to keep track of which
          # keys were seen how many times
          $hydrogen_split{$key}++;
      }

      # print in sorted order each key that was encountered more than once
      # (inside the grep block the key being tested is in $_, not $key)
      foreach my $key ( sort grep { $hydrogen_split{$_} > 1 } keys %hydrogen_split ) {
          print "$key\n";
      }
      Note that I changed his != 1 to > 1. Do you understand why?
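The counting idiom above can also be packaged as a small function. This is only a minimal sketch: the name `duplicates` and the sample values are made up for illustration and are not from anyone's actual script.

```perl
use strict;
use warnings;

# Return, in sorted order, every value that occurs more than once
# in the argument list. (Hypothetical helper, for illustration only.)
sub duplicates {
    my %count;
    $count{$_}++ for @_;                                # tally each value
    my @dups = sort grep { $count{$_} > 1 } keys %count; # keep repeated ones
    return @dups;
}

# Made-up sample data
my @hydrogen_split = ('H(15)', 'H(15)', 'H(16)');
print "$_\n" for duplicates(@hydrogen_split);
```

With this sample data only H(15) is printed, because H(16) occurs just once.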
        I appreciate your help, but I still can't get it to do what I want.
        I have 30 separate files which are opened individually, and information is extracted from each file, e.g. for files 1 and 2:
        file 1: H(15) H(16)
        file 2: H(15) H(15) H(16)
        Note that all the information from all 30 files is stored in @hydrogen_split.

        So iterating over the information from file 1 should not output anything, but for file 2 H(15) should be displayed, as it is repeated.
        With the following code, H(15) H(16) is output. Is there something wrong?
        foreach my $key (@hydrogen_split) {
            $hydrogen_split{$key}++;
        }
        foreach my $key ( sort grep { $hydrogen_split{$_} > 1 } keys %hydrogen_split ) {
            print "$key\n";
        }
        Thanks, a very confused Harry
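One possible reading of the output described above: because the values from every file end up in the same array and the same hash, the counts are totals across all files, so H(16) (once in file 1, once in file 2) also reaches a count above 1. A minimal sketch of counting per file instead, assuming the values can be kept grouped by file; the `%values_in_file` structure and the `repeated_in` helper are hypothetical and not part of the posted script:

```perl
use strict;
use warnings;

# Hypothetical stand-in for the extracted data: file name => values from that file
my %values_in_file = (
    'file 1' => [ 'H(15)', 'H(16)' ],
    'file 2' => [ 'H(15)', 'H(15)', 'H(16)' ],
);

# Return the values repeated within one file. A fresh %seen is created
# on every call, so counts never carry over from one file to the next.
sub repeated_in {
    my @values = @_;
    my %seen;
    $seen{$_}++ for @values;
    my @repeated = sort grep { $seen{$_} > 1 } keys %seen;
    return @repeated;
}

for my $file (sort keys %values_in_file) {
    my @repeated = repeated_in(@{ $values_in_file{$file} });
    print "$file: @repeated\n" if @repeated;
}
```

For the two sample files this prints only "file 2: H(15)".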