Arrays merges and redundancies

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
Re: Arrays merges and redundancies by kennethk (Abbot) on Mar 30, 2012 at 18:34 UTC
The most natural way to generalize the code IMHO is to change to a loop over array references rather than treating each array individually - see perlreftut. You could put this all in a complex data structure (perldsc) rather than how you have it loaded now, but you could generalize what you have as something like: `if (($Entry > 0) && ($ID[$Entry] eq $ID[$Entry - 1])) { for my $ref (\@Value_1, \@Value_2, \@Value_3, \@Value_4, \@Value_5, \@Value_6, \@Value_7, ) { if ($ref->[$Entry] =~ m/($ref->[$Entry - 1])/) { $Placeholder = join(' / ', $ref->[$Entry], $ref->[$Entry - 1] +); delete $ref->[$Entry - 1]; push(@Value, $Placeholder); } } }` [download] Of course, the above is buggy since your posted code was already buggy -- you've got index problems on your counter as soon as you start deleting elements, and that push isn't helping. Also note the added comma in your join. But the loop should answer your question. #11929 First ask yourself `How would I do this without a computer?' Then have the computer do it the same way.	[reply] [d/l]
Re: Arrays merges and redundancies by Marshall (Canon) on Mar 30, 2012 at 20:19 UTC
Here is one way...using use List::MoreUtils qw(each_array uniq);...Uses a hash of array (HoA) to gather up the values associated with the ID, then that hash is accessed in the same order as the ID array and values are compressed, re-formatted. Update: changed code so that it works with non-numeric ID's instead of printing, you can push a reference to \@new onto a new @Compressed array... each_array() makes an iterator that pulls pairs of numbers walking left to right from the ID array and the Value array row that we are working on. uniq() removes duplicates (unique values only - order is preserved). see List::MoreUtils #!/usr/bin/perl -w use strict; use List::MoreUtils qw(each_array uniq); use Data::Dumper; my @ID = qw(1 1 2 3 4 4 6 6); my @Values = ([qw(a a a a a a a a )], [qw(b b b b b b b b )], [qw(c d c c c d c d )]); print join(" ",uniq @ID), "\n"; #compressed ID's foreach my $row_ref (@Values) { # make hash to gather up values for each ID # eg: 1 => [c, d] my %id2values; my $ea = each_array(@ID, @$row_ref); while ( my ($id, $value) = $ea->() ) { push @{$id2values{$id}}, $value; } # compress out the dupe values then make a "c/d" string # if there is more than one value my @new = map{join ("/", uniq @{$id2values{$_}})} uniq @ID; print "@new\n"; #compressed values } __END__ 1 2 3 4 6 a a a a a b b b b b c/d c c c/d c/d [download] Update Also tested with the other test case, the above code produces the correct result of: `Apple Grape Banana 5 2 3/4 10/15 3 4 for: my @ID = qw(Apple Apple Grape Banana Banana); my @Values = ([qw(5 5 2 3 4 )], [qw(10 15 3 4 4 )], );` [download]	[reply] [d/l] [select]
Re^2: Arrays merges and redundancies by remiah (Hermit) on Mar 31, 2012 at 02:54 UTC
I tried. use strict; use warnings; use Data::Dumper; my @IDs = qw(Apple Apple Grape Banana Banana); my @Price = qw(5 5 2 3 4 ); my @Amount = qw(10 15 3 4 4 ); sub uniqjoin { my %seen; my $sep=shift; return join($sep, grep { !$seen{$_}++ } sort @_); } my ($idx, $id, %h); #to hash of array while( ($idx,$id)= each(@IDs) ){ push @{$h{$id}->{price}} , $Price[$idx]; push @{$h{$id}->{amount}}, $Amount[$idx]; } #concatenate arrays with '/' foreach my $id (keys %h){ $h{$id}->{price} = uniqjoin( '/', @{$h{$id}->{price}} ); $h{$id}->{amount} = uniqjoin( '/', @{$h{$id}->{amount}} ); } #print print join(' ', sort keys(%h)) . "\n"; print join(' ' , map{ $h{$_}->{price} } sort keys(%h) ) . "\n"; print join(' ' , map{ $h{$_}->{amount} } sort keys(%h) ) . "\n"; [download] Sometimes I see List::Util and List::MoreUtil at Monk. I should see them.	[reply] [d/l]
Re^3: Arrays merges and redundancies by Marshall (Canon) on Mar 31, 2012 at 04:24 UTC
You were close.. I fixed your code? ...Don't sort the IDs - if you want to preserve the order of the data! Also, its a fluke that the data appeared in the right order due to sort in uniqjoin(). List::Util is a core module (no installation required) - you will have to install List::MoreUtil if you want to use it. Besides being very handy functions, they are fast because they are implemented in C. use strict; use warnings; use Data::Dumper; my @IDs = qw(Apple Apple Grape Banana Banana); my @Price = qw(5 5 2 3 4 ); my @Amount = qw(10 15 3 4 4 ); sub uniqjoin { my $sep=shift; my %seen; return join($sep, grep { !$seen{$_}++ } @_); #NO SORT } my %h; #to hash of array my $idx =0; foreach my $id (@IDs) { push @{$h{$id}->{price}} , $Price[$idx]; push @{$h{$id}->{amount}}, $Amount[$idx]; $idx++; } #concatenate arrays with '/' foreach my $id (keys %h){ $h{$id}->{price} = uniqjoin( '/', @{$h{$id}->{price} } ); $h{$id}->{amount} = uniqjoin( '/', @{$h{$id}->{amount}} ); } my %seen; my @uniqIDs = grep{!$seen{$_}++}@IDs; #print print join(' ', @uniqIDs) . "\n"; print join(' ' , map{ $h{$_}->{price} } @uniqIDs ) . "\n"; print join(' ' , map{ $h{$_}->{amount} } @uniqIDs ) . "\n"; __END__ Apple Grape Banana 5 2 3/4 10/15 3 4 [download] update: notice that keys %h does not come out in any particular order - make @uniqIDs and use that array to enforce the ordering of the data to be like in the original when you print it out - and this will wind up being faster than doing multiple "sort keys %h" anyway There is some redundancy in the code, you could make your own uniq() function like the one in List:MoreUtil and use in join() - but why bother? Use the module and get the advantage of well debugged, fast code (should run faster than a pure Perl implementation). Of course there are some scalability/re-usability issues too - due to hard coding of price and value - but if this meets your needs - go for it! If you use a Perl 5.12 feature like the each() function for arrays, I would put a "use 5.012;" statement in the code. Many people like me are still at 5.10.1.	[reply] [d/l]
Re^4: Arrays merges and redundancies by remiah (Hermit) on Mar 31, 2012 at 13:45 UTC
Re^4: Arrays merges and redundancies by Anonymous Monk on Mar 31, 2012 at 10:27 UTC
Re^4: Arrays merges and redundancies by Anonymous Monk on Mar 31, 2012 at 10:49 UTC
Re: Arrays merges and redundancies by Riales (Hermit) on Mar 30, 2012 at 18:35 UTC
I'm a little confused by your explanation, but I think you're using arrays when you probably want hashes? Maybe something like... `my %hash = ( 1 => [qw/a a a a a a a a/], 2 => [qw/b b b b b b b b/], 3 => [qw/c d c c c d c d/], );` [download] But I'm still not very clear on what you're trying to do...	[reply] [d/l]
Re^2: Arrays merges and redundancies by Anonymous Monk on Mar 30, 2012 at 19:51 UTC
From the following studpi example; `@IDs = qw(Apple Apple Grape Banana Banana) @Price = qw(5 5 2 3 4 ) @Amount = qw(10 15 3 4 4 ) I want; @IDs = qw(Apple Grape Banana) @Price = qw(5 2 3/4) @Amount = qw(10/15 3 4)` [download] Ithought about using hashes but for that I need unique key and IDs are not unique. Sorry for vague explanations and thanks for help.	[reply] [d/l]