Smith has asked for the wisdom of the Perl Monks concerning the following question:

Still pretty weak on using hashes, so I am hoping someone can help. I'm opening a file with domains in it and pulling out only the root domain from each line. Then I am trying to use a hash and grep to remove any duplicates, but all I get back are blank lines.

#!/usr/bin/perl
$upload = "/var/tmp/work/upload";
$work = "/var/tmp/work/";
$input3 = "$upload/domain.csv";
system ("dos2unix $input3");
open (IN,"$input3");
open (OUT,">>$work/local.rules");
while (<IN>) {
    chomp();
    if ($_ =~ /^.+\.([A-Za-z0-9-_]+\.[A-Za-z]{2,})$/){
        $domain = $1;
        %seen = ();
        @unique = grep { ! $seen{ $domain }++ } @array;
        print "@unique\n";
    }
}

Replies are listed 'Best First'.
Re: remove duplicates with hash and grep
by LanX (Saint) on Dec 22, 2014 at 21:29 UTC
    @array is empty because it is never populated.

    warnings and strict would have told you...

    Cheers Rolf

    (addicted to the Perl Programming Language and ☆☆☆☆ :)
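
    To see what strict reports for the original snippet, here is a minimal sketch. It compiles a cut-down version of the grep line inside a string eval (my own framing, not part of the original script) so the compile error can be printed instead of aborting the run.

```perl
use strict;
use warnings;

# A cut-down copy of the grep line from the question, compiled via string
# eval so the strict failure lands in $@ rather than stopping this script.
my $snippet = q{
    use strict;
    my %seen;
    my @unique = grep { !$seen{$_}++ } @array;   # @array was never declared
    1;
};

my $ok    = eval $snippet;
my $error = $@;
print "strict says: $error" unless $ok;
```

    The message names @array as an undeclared global, which points straight at the variable that was never filled.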

      Changed it to this and I am now getting the list, but the duplicates have not been removed. Still lost.

      #!/usr/bin/perl
      use warnings;
      use strict;
      my $upload = "/var/tmp/work/upload";
      my $work = "/var/tmp/work/STAP-domain_clean_project";
      my $input3 = "$upload/domain.csv";
      system ("dos2unix $input3");
      open (IN,"$input3");
      open (OUT,">>$work/local.rules");
      while (<IN>) {
          chomp();
          if ($_ =~ /^.+\.([A-Za-z0-9-_]+\.[A-Za-z]{2,})$/){
              my @array = $1;
              my %seen = ();
              my @unique = grep { ! $seen{ @array }++ } @array;
              print "@unique\n";
          }
      }

        Your code redefines %seen and @array each time you read a line from your file. You get duplicates because your hash is always empty when you test it for a value.

        The following will produce a list of the unique values from IN:

        my (%seen,@unique);
        while (<IN>) {
            chomp;
            $seen{$1}++ if ($_ =~ /^.+\.([A-Za-z0-9-_]+\.[A-Za-z]{2,})$/);
        }
        @unique = keys %seen;
        printf "%s, ",$_ for @unique;
        print "\n";

        Updated for readability and coherence and typos (is it Monday already?)

        1 Peter 4:10
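
        One thing to keep in mind with the keys %seen approach: keys returns the domains in no guaranteed order, and the hash values already hold how often each domain appeared. A small sketch of sorted output with counts, using made-up domains in place of the real file contents:

```perl
use strict;
use warnings;

# Hypothetical root domains, as they might come out of the capture group.
my @domains = qw(example.com example.org example.com example.net example.org example.com);

# Count every occurrence; the hash keys are the unique domains.
my %seen;
$seen{$_}++ for @domains;

# sort makes the output order stable; keys alone gives no guaranteed order.
my @report = map { "$_ ($seen{$_})" } sort keys %seen;
print "$_\n" for @report;
```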
        ... now getting the list but it has not removed the duplicates.
        ...
            my @array = $1;

        On every iteration through the while-loop, this statement creates a new array and initializes it with a single element: the string that was captured to $1. This string is, of course, unique!
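
        If the goal is to stay with the grep idiom from the question, one way is to collect every captured domain first and run grep once, with %seen declared outside the loop. This is only a sketch: the @lines list is invented sample data standing in for domain.csv, and the hyphen in the character class is moved to the end, which should match the same characters as the pattern in the thread.

```perl
use strict;
use warnings;

# Hypothetical sample input standing in for the lines of domain.csv.
my @lines = (
    "www.example.com",
    "mail.example.com",
    "ftp.example.org",
    "www.example.org",
);

# Collect every captured root domain; the array lives outside the loop,
# so it keeps growing instead of being recreated for each line.
my @domains;
for my $line (@lines) {
    push @domains, $1 if $line =~ /^.+\.([A-Za-z0-9_-]+\.[A-Za-z]{2,})$/;
}

# One hash, one pass: grep keeps an element only the first time it is seen,
# so the original order of first appearance is preserved.
my %seen;
my @unique = grep { !$seen{$_}++ } @domains;
print "$_\n" for @unique;
```

        Note that the grep block tests $_, the current element, rather than a fixed variable such as $domain or @array.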

Re: remove duplicates with hash and grep
by toolic (Bishop) on Dec 22, 2014 at 21:29 UTC
Re: remove duplicates with hash and grep
by Smith (Initiate) on Dec 22, 2014 at 21:30 UTC

    Also if I print after I declare $domain I do get the list of domains, so I know the data is there.