Re: Re: Re: duplicate lines in array

try this:

#!/usr/bin/perl
use strict;
use warnings;

my (%count, $occ);
my @files = ("file1", "file2", "file3");

# read all files sequentially
foreach my $file (@files) {
    open (IN, "<", $file) || die "could not open $file: $!\n";
    while (<IN>) {
        chomp;
        # if the line contains the pattern, add it to the array for this
        # file and matched string
        push @{$count{$file}{$1}}, $_ if /(H\(\d+\))/;
    }
    close (IN);
}

# loop over files
foreach my $file (sort keys %count) {
    print "$file\n";
    # loop over found strings
    foreach my $found (sort keys %{$count{$file}}) {
        # count occurrences
        $occ = scalar (@{$count{$file}{$found}});
        # print the found lines if the string occurred more than once
        if ($occ > 1) {
            foreach (@{$count{$file}{$found}}) {
                print "$_\n";
            }
        }
    }
    print "\n";
}

I had 3 files to test with:

file1:
N(8) -- H(15) .. O(9)
N(8) -- H(16) .. N(8)
N(8) -- H(16) .. O(9)

file2:
N(8) -- H(15) .. O(9)
N(8) -- H(15) .. N(8)
N(8) -- H(16) .. O(9)

file3:
N(8) -- H(15) .. O(9)
N(8) -- H(15) .. N(8)
N(8) -- H(16) .. O(9)

and the output looks like:

file1
N(8) -- H(16) .. N(8)
N(8) -- H(16) .. O(9)

file2
N(8) -- H(15) .. O(9)
N(8) -- H(15) .. N(8)

file3
N(8) -- H(15) .. O(9)
N(8) -- H(15) .. N(8)

Imre


Re: Re: Re: Re: duplicate lines in array by harry34 (Sexton) on Jan 27, 2004 at 12:39 UTC
That is exactly what I'm trying to achieve, although I already have all the lines of data stored in @contact_type. All I need to do is iterate over @contact_type, check for the defined pattern, and print any lines that are repeated. Can the code you have provided be changed to do that? Cheers, harry
(Re:)* duplicate lines in array by pelagic (Priest) on Jan 27, 2004 at 13:00 UTC
Harry, when you already have all the lines in @contact_type, you no longer have any information about the source of each line (i.e. which file it came from). Do you care about that? If yes, I'd suggest changing the way you read the files to something similar to what I did. If no, tell me ... Imre
Re: (Re:)* duplicate lines in array by harry34 (Sexton) on Jan 27, 2004 at 13:45 UTC
All the files have been opened and the lines of data have been stored in @contact_type. All I am looking to do is something like this: `foreach (@contact_type) { ...check for repeated data in @contact_type; ...print it out; }`
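A minimal sketch of that loop, assuming the same H(\d+) pattern as in the script above and that the source file of each line does not matter. The sub name repeated_lines and the sample contents of @contact_type are made up for illustration; the real array would already be filled from the files.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return every line whose H(n) token occurs on more than one line,
# keeping the original order of the input lines.
# (repeated_lines is a hypothetical helper name, not from the thread.)
sub repeated_lines {
    my @lines = @_;
    my %seen;

    # first pass: count how often each H(n) token occurs
    for my $line (@lines) {
        $seen{$1}++ if $line =~ /(H\(\d+\))/;
    }

    # second pass: keep only lines whose token was counted more than once
    return grep { /(H\(\d+\))/ && $seen{$1} > 1 } @lines;
}

# hypothetical sample data standing in for the real @contact_type
my @contact_type = (
    "N(8) -- H(15) .. O(9)",
    "N(8) -- H(16) .. N(8)",
    "N(8) -- H(16) .. O(9)",
);

print "$_\n" for repeated_lines(@contact_type);
```

Two passes are used so that the first line of a repeated group is printed as well; a single pass with a %seen hash would only catch the second and later occurrences.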
Re: Re: (Re:)* duplicate lines in array by pelagic (Priest) on Jan 27, 2004 at 14:08 UTC