hiyall has asked for the wisdom of the Perl Monks concerning the following question:

Newbie difficulty with converting input of file of non-unique triples into a list of unique triples

Input file: logins.txt

server1,user1,%
server1,user1,db1
server1,user2,%
server1,user3,%
server1,user1,%
server1,user2,%
server1,user2,db2
server1,user3,db3
server1,user3,%
server2,user1,%
server2,user1,db1
server2,user2,%
server2,user3,%
server2,user1,%
server2,user2,%
server2,user2,db2
server2,user3,db3
server2,user3,%
server3,user1,%
server3,user1,db1
server3,user2,%
server3,user3,%
server3,user1,%
server3,user2,%
server3,user2,db2
server3,user3,db3
server3,user3,%

Attempt at script which is not correct

#!/usr/bin/perl -w
use strict;
use Data::Dumper qw(Dumper);

my (@nulogins, @logins);
my %foo=();
my $fn = 'logins.txt';
open(my $fh, '<', $fn) or die "Could not open file '$fn' $!";
while (my $line = <$fh> ) {
    chomp($line);
    my @login = split /,/,$line;
    push @nulogins, @login;
    #print Dumper \@login;
}
print Dumper \@nulogins;
for (@nulogins) { $foo{$_}++ };
@logins = (keys %foo);
print Dumper \@logins;
exit;

Desired result for contents of @logins

@logins = (
    (server1,user1,%),
    (server1,user1,db1),
    (server1,user2,%),
    (server1,user2,db2),
    (server1,user3,%),
    (server1,user3,db3),
    (server2,user1,%),
    (server2,user1,db1),
    (server2,user2,%),
    (server2,user2,db2),
    (server2,user3,%),
    (server2,user3,db3),
    (server3,user1,%),
    (server3,user1,db1),
    (server3,user2,%),
    (server3,user2,db2),
    (server3,user3,%),
    (server3,user3,db3)
) ;

such that e.g. $logins[0] = (server1,user1,%) and $logins[0][0] = server1

Re: getting unique AOH from nonunique AOH ... or hash if it is better approach
by MidLifeXis (Monsignor) on Feb 05, 2015 at 13:58 UTC

    Your line push @nulogins, @login; is pushing each element of @login onto @nulogins. I think you mean to push a reference to the entire @login array onto @nulogins: push @nulogins, \@login; See perlref.
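
The difference between the two push forms can be seen in a small sketch. The names @line_fields, @flat and @nested are made up for illustration and do not appear in the original script. The sketch copies the fields into a new anonymous array with [ ... ]; pushing \@login as suggested above should behave the same in the original loop, because @login is declared with my inside the loop body.

```perl
use strict;
use warnings;

# Hypothetical stand-in for what split /,/ returns for one input line
my @line_fields = ('server1', 'user1', '%');

my (@flat, @nested);
push @flat,   @line_fields;        # list is flattened: three scalars are added
push @nested, [ @line_fields ];    # one element is added: a reference to a new array

print scalar(@flat),   "\n";       # number of elements in @flat
print scalar(@nested), "\n";       # number of elements in @nested
print $nested[0][0],   "\n";       # first field of the first triple
```

With the nested form, $nested[0] is the whole triple and $nested[0][0] is its first field, which is the access pattern asked for in the question.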

    --MidLifeXis

Re: getting unique AOH from nonunique AOH ... or hash if it is better approach
by gpapkala (Acolyte) on Feb 05, 2015 at 14:13 UTC
    You can use a hash to get unique values, like this. And, as suggested by MidLifeXis, use an array ref for each entry.

    #!/usr/bin/perl -w
    use strict;
    use Data::Dumper qw(Dumper);

    my (@logins);
    my %foo=();
    my $fn = 'logins.txt';
    open(my $fh, '<', $fn) or die "Could not open file '$fn' $!";
    while (my $line = <$fh> ) {
        chomp($line);
        # the whole line is the hash key, so duplicate lines overwrite each other
        $foo{$line} = [ split /,/, $line ];
    }
    @logins = values(%foo);
    print Dumper @logins;
    print $logins[0][0];
    exit;

      Totally awesome! Thank you ... now on to study array refs

Re: getting unique AOH from nonunique AOH ... or hash if it is better approach
by johngg (Canon) on Feb 06, 2015 at 12:04 UTC

    If you need to retain the same order in your unique array as in the original, a different approach is required, as hashes are inherently unordered. In the code below, the grep uses the %seen hash to pass through only data elements that it hasn't already seen, thus removing duplicates from the stream. The first (lower) map takes each line read from the input file and removes any trailing white space, including line terminators (I do this instead of chomping in case there are differing numbers of trailing spaces in the data). The grep removes duplicates as already described, and the second (upper) map creates an anonymous array of data split on commas.

    $ perl -Mstrict -Mwarnings -MData::Dumper -E '
    open my $inFH, q{<}, \ <<EOF or die $!;
    server1,user1,%
    server1,user1,db1
    server1,user2,%
    server1,user3,%
    server1,user1,%
    server1,user2,%
    server1,user2,db2
    server1,user3,db3
    server1,user3,%
    server2,user1,%
    server2,user1,db1
    server2,user2,%
    server2,user3,%
    server2,user1,%
    server2,user2,%
    server2,user2,db2
    server2,user3,db3
    server2,user3,%
    server3,user1,%
    server3,user1,db1
    server3,user2,%
    server3,user3,%
    server3,user1,%
    server3,user2,%
    server3,user2,db2
    server3,user3,db3
    server3,user3,%
    EOF

    my @logins = do {
        my %seen;
        map  { [ split m{,} ] }
        grep { ! $seen{ $_ } ++ }
        map  { s{\s*$}{}; $_ }
        <$inFH>;
    };

    print Data::Dumper->Dumpxs( [ \ @logins ], [ qw{ *logins } ] );'
    @logins = (
      [ 'server1', 'user1', '%' ],
      [ 'server1', 'user1', 'db1' ],
      [ 'server1', 'user2', '%' ],
      [ 'server1', 'user3', '%' ],
      [ 'server1', 'user2', 'db2' ],
      [ 'server1', 'user3', 'db3' ],
      [ 'server2', 'user1', '%' ],
      [ 'server2', 'user1', 'db1' ],
      [ 'server2', 'user2', '%' ],
      [ 'server2', 'user3', '%' ],
      [ 'server2', 'user2', 'db2' ],
      [ 'server2', 'user3', 'db3' ],
      [ 'server3', 'user1', '%' ],
      [ 'server3', 'user1', 'db1' ],
      [ 'server3', 'user2', '%' ],
      [ 'server3', 'user3', '%' ],
      [ 'server3', 'user2', 'db2' ],
      [ 'server3', 'user3', 'db3' ]
    );
    $
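
For readers who find the map/grep/map chain dense, the same order-preserving idea can be written as an explicit loop. This is only a sketch: the subroutine name unique_triples is hypothetical, the blank-line check is an addition that the one-liner above does not have, and the in-memory filehandle with four sample lines stands in for logins.txt.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the unique comma-separated records read from
# a filehandle, in first-seen order, each as an array reference.
sub unique_triples {
    my ($fh) = @_;
    my (%seen, @unique);
    while (my $line = <$fh>) {
        $line =~ s/\s+\z//;                  # strip trailing white space and newline
        next if $line eq '';                 # skip blank lines (an added assumption)
        next if $seen{$line}++;              # skip lines that were already seen
        push @unique, [ split /,/, $line ];  # store the fields as an array ref
    }
    return @unique;
}

# In-memory filehandle standing in for logins.txt; the last line has
# trailing spaces to exercise the white space stripping.
my $data = "server1,user1,%\nserver1,user1,db1\nserver1,user1,%\nserver1,user2,%  \n";
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";
my @logins = unique_triples($fh);
close $fh;

print join(',', @$_), "\n" for @logins;
```

To use it on the real file, open logins.txt for reading and pass that filehandle instead of the in-memory one.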

    I hope this is of interest.

    Cheers,

    JohnGG