in reply to Arrays merges and redundancies

Here is one way, using List::MoreUtils qw(each_array uniq). It uses a hash of arrays (HoA) to gather up the values associated with each ID. That hash is then read back in the same order as the ID array, and the values are de-duplicated and re-formatted.

Update: changed the code so that it works with non-numeric IDs.
Instead of printing, you can push a reference to @new (that is, \@new) onto a new @Compressed array.
each_array() makes an iterator that pulls pairs of values, walking left to right through the ID array and the Values row that we are working on. uniq() removes duplicates (unique values only, order is preserved). See List::MoreUtils.

#!/usr/bin/perl -w
use strict;
use List::MoreUtils qw(each_array uniq);
use Data::Dumper;

my @ID = qw(1 1 2 3 4 4 6 6);
my @Values = ([qw(a a a a a a a a )],
              [qw(b b b b b b b b )],
              [qw(c d c c c d c d )]);

print join(" ",uniq @ID), "\n"; #compressed ID's

foreach my $row_ref (@Values)
{
    # make hash to gather up values for each ID
    # eg: 1 => [c, d]
    my %id2values;
    my $ea = each_array(@ID, @$row_ref);
    while ( my ($id, $value) = $ea->() )
    {
        push @{$id2values{$id}}, $value;
    }
    # compress out the dupe values then make a "c/d" string
    # if there is more than one value
    my @new = map{join ("/", uniq @{$id2values{$_}})} uniq @ID;
    print "@new\n"; #compressed values
}
__END__
1 2 3 4 6
a a a a a
b b b b b
c/d c c c/d c/d
Update: also tested with the other test case. The above code produces the correct result of:

Apple Grape Banana
5 2 3/4
10/15 3 4

for:

my @ID = qw(Apple Apple Grape Banana Banana);
my @Values = ([qw(5 5 2 3 4 )],
              [qw(10 15 3 4 4 )],
             );
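
The suggestion above about pushing \@new onto @Compressed instead of printing could look roughly like this. It is only a sketch: uniq_in_order() is a hypothetical helper that stands in for List::MoreUtils::uniq so the example runs with core Perl alone, and a plain index loop replaces each_array() for the same reason.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the post): order-preserving
# de-duplication, standing in for List::MoreUtils::uniq.
sub uniq_in_order {
    my %seen;
    return grep { !$seen{$_}++ } @_;
}

my @ID     = qw(Apple Apple Grape Banana Banana);
my @Values = ([qw(5 5 2 3 4)], [qw(10 15 3 4 4)]);

my @uniq_ids = uniq_in_order(@ID);
my @Compressed;    # one array reference per input row

foreach my $row_ref (@Values) {
    # gather the values seen for each ID, e.g. Banana => [3, 4]
    my %id2values;
    for my $i (0 .. $#ID) {
        push @{ $id2values{ $ID[$i] } }, $row_ref->[$i];
    }

    my @new = map { join "/", uniq_in_order(@{ $id2values{$_} }) } @uniq_ids;

    # @new is a fresh lexical on every pass, so storing a
    # reference to it is safe
    push @Compressed, \@new;
}

print "@uniq_ids\n";
print "@$_\n" for @Compressed;
```

Because @new is declared inside the loop, each pass stores a reference to a different array; declaring it outside the loop would make every element of @Compressed point at the same data.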

Re^2: Arrays merges and redundancies
by remiah (Hermit) on Mar 31, 2012 at 02:54 UTC

    I tried.

    use strict;
    use warnings;
    use Data::Dumper;

    my @IDs    = qw(Apple Apple Grape Banana Banana);
    my @Price  = qw(5 5 2 3 4 );
    my @Amount = qw(10 15 3 4 4 );

    sub uniqjoin {
        my %seen;
        my $sep=shift;
        return join($sep, grep { !$seen{$_}++ } sort @_);
    }

    my ($idx, $id, %h);

    #to hash of array
    while( ($idx,$id)= each(@IDs) ){
        push @{$h{$id}->{price}} , $Price[$idx];
        push @{$h{$id}->{amount}}, $Amount[$idx];
    }

    #concatenate arrays with '/'
    foreach my $id (keys %h){
        $h{$id}->{price}  = uniqjoin( '/', @{$h{$id}->{price}} );
        $h{$id}->{amount} = uniqjoin( '/', @{$h{$id}->{amount}} );
    }

    #print
    print join(' ', sort keys(%h)) . "\n";
    print join(' ' , map{ $h{$_}->{price} } sort keys(%h) ) . "\n";
    print join(' ' , map{ $h{$_}->{amount} } sort keys(%h) ) . "\n";

    I sometimes see List::Util and List::MoreUtils mentioned here at the Monastery. I should look into them.

      You were close. I fixed your code. Don't sort the IDs if you want to preserve the order of the data! Also, it's a fluke that the data appeared in the right order, due to the sort in uniqjoin().

      List::Util is a core module (no installation required); you will have to install List::MoreUtils from CPAN if you want to use it. Besides being very handy, the functions are fast because they are implemented in C.

      use strict;
      use warnings;
      use Data::Dumper;

      my @IDs    = qw(Apple Apple Grape Banana Banana);
      my @Price  = qw(5 5 2 3 4 );
      my @Amount = qw(10 15 3 4 4 );

      sub uniqjoin {
          my $sep=shift;
          my %seen;
          return join($sep, grep { !$seen{$_}++ } @_); #NO SORT
      }

      my %h;

      #to hash of array
      my $idx =0;
      foreach my $id (@IDs)
      {
          push @{$h{$id}->{price}} , $Price[$idx];
          push @{$h{$id}->{amount}}, $Amount[$idx];
          $idx++;
      }

      #concatenate arrays with '/'
      foreach my $id (keys %h){
          $h{$id}->{price}  = uniqjoin( '/', @{$h{$id}->{price} } );
          $h{$id}->{amount} = uniqjoin( '/', @{$h{$id}->{amount}} );
      }

      my %seen;
      my @uniqIDs = grep{!$seen{$_}++}@IDs;

      #print
      print join(' ', @uniqIDs) . "\n";
      print join(' ' , map{ $h{$_}->{price} } @uniqIDs ) . "\n";
      print join(' ' , map{ $h{$_}->{amount} } @uniqIDs ) . "\n";
      __END__
      Apple Grape Banana
      5 2 3/4
      10/15 3 4

      Update: notice that keys %h does not come out in any particular order. Make @uniqIDs and use that array to enforce the original ordering of the data when you print it out. This will also wind up being faster than doing multiple "sort keys %h" calls anyway.

      There is some redundancy in the code. You could make your own uniq() function like the one in List::MoreUtils and use it inside join(), but why bother? Use the module and get the advantage of well-debugged, fast code (it should run faster than a pure Perl implementation). Of course there are some scalability and re-usability issues too, due to the hard coding of price and amount, but if this meets your needs, go for it!
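
One way to get around the hard-coded column names is to keep the columns in a hash and loop over them. This is only a sketch: %columns and @column_order are hypothetical names introduced for the example, and uniqjoin() is the same pure-Perl helper as in the fixed code.

```perl
use strict;
use warnings;

my @IDs = qw(Apple Apple Grape Banana Banana);

# Hypothetical layout: each column is a named array, so adding a
# column only means adding one entry here and in @column_order.
my %columns = (
    price  => [qw(5 5 2 3 4)],
    amount => [qw(10 15 3 4 4)],
);
my @column_order = qw(price amount);    # order in which rows are printed

sub uniqjoin {
    my $sep = shift;
    my %seen;
    return join $sep, grep { !$seen{$_}++ } @_;    # no sort: keep input order
}

# unique IDs in their original order
my %seen_id;
my @uniqIDs = grep { !$seen_id{$_}++ } @IDs;

# hash of hash of arrays: $h{ID}{column} = [values...]
my %h;
for my $idx (0 .. $#IDs) {
    for my $col (@column_order) {
        push @{ $h{ $IDs[$idx] }{$col} }, $columns{$col}[$idx];
    }
}

# first line is the IDs, then one line per column
my @lines = ( join ' ', @uniqIDs );
for my $col (@column_order) {
    push @lines, join ' ', map { uniqjoin('/', @{ $h{$_}{$col} }) } @uniqIDs;
}
print "$_\n" for @lines;
```

For the sample data this prints the same three lines as the fixed code, and the two near-identical push and uniqjoin statements collapse into one of each.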

      If you use a Perl 5.12 feature like the each() function for arrays, I would put a "use 5.012;" statement in the code. Many people like me are still at 5.10.1.
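
If the code has to run on older perls as well, an index loop gives the same index/value pairs without each() on an array. A minimal sketch (the @pairs array exists only to show the result):

```perl
use strict;
use warnings;

my @IDs = qw(Apple Apple Grape Banana Banana);

# Portable replacement for:
#     while ( my ($idx, $id) = each @IDs ) { ... }
# which needs perl 5.12 or later.
my @pairs;
for my $idx (0 .. $#IDs) {
    my $id = $IDs[$idx];
    push @pairs, "$idx=$id";
}

print "@pairs\n";
```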

        Thanks a lot for the fix and the comments.

        I was careless about the order, as you pointed out.

        my @uniqIDs = grep{!$seen{$_}++}@IDs;
        With this, I can keep the original order. And with List::MoreUtils, I can use uniq (or its alias, distinct), which also keeps the order.

        There seem to be a lot of interesting functions in List::MoreUtils: mesh, natatime, each_array... It is very interesting, and I have seen moritz make use of them often.

        I have one perl 5.8 box, and I would have been burned if I hadn't learned that "each can return the index for an array" is a 5.12 feature.

        regards

        Thanks, guys, for all your input; it's very helpful.

        Thanks for help everyone, really good advice.