Beefy Boxes and Bandwidth Generously Provided by pair Networks
more useful options
 
PerlMonks  

How to Group List Items?

by Thelonious (Scribe)
on Jul 26, 2007 at 19:38 UTC ( [id://628986]=perlquestion: print w/replies, xml ) Need Help??

Thelonious has asked for the wisdom of the Perl Monks concerning the following question:

Oh Wise Ones,

I have lists, and I need to group them with items of the same value stored in the same group, and each group having as close to the same number of items as possible, with a maximum of X groups (assuming 6 here)...

my $best_score = 9999; my $best_combo; my @list = qw(1 1 1 2 3 4 4 4 5 5 6 6 6 7 7 8 8 8); while (my $combo = get_combo(\@list)) { # hard to do well?? $score = get_score($combo); # easy if ($score < $best_score) { $best_score = $score; $best_combo = $combo; } } print $best_score.': '.Dumper($best_combo);
That would group these items like this:
$VAR1 = [ [qw(1 1 1 2)], [qw(3 4 4 4)], [qw(5 5 )], [qw(6 6 6)], [qw(7 7)], [qw(8 8 8 9)], ];
I know that there must be some algorithm out there to get this done, but it's perplexing me... any thoughts?

Many Thanks!

Replies are listed 'Best First'.
Re: How to Group List Items?
by BrowserUk (Patriarch) on Jul 27, 2007 at 03:14 UTC

    Looking at the other responses, I think I may have misunderstood you. You say that " items of the same value stored in the same group", but nowhere do you say that only contiguous ranges of values can be stored in a group.

    If the latter is not a requirement, then this may be of interest. It does an optimal job (by my understanding) on the supplied dataset. The output order coudl be sorted (by say the first value in each group) to bring it back more in line with the starting set:

    #! perl -slw use strict; use Data::Dump qw[ dump ]; $Data::Dump::MAX_WIDTH = 40; use List::Util qw[ reduce ]; sub group { my $nGroups = shift; my $ave = @_ / $nGroups; my %groups; push @{ $groups{ $_ } }, $_ for @_; while( keys %groups > $nGroups ) { my %sizes; push @{ $sizes{ @{ $groups{ $_ } } } }, $_ for keys + %groups; last if keys %sizes < 3; my @bySize = sort{$b<=>$a} keys %sizes; my %bySize = map{ $_ => 1 } @bySize; my $changed = 0; SIZE: for my $size ( @bySize ) { next if $size >= $ave; my $wanted = 1; ##int( $ave - $size + 0.5 ); { if( exists $bySize{ $wanted } ) { my $iToMove = shift @{ $sizes{ $size } }; my $iToAddto = shift @{ $sizes{ $wanted } }; push @{ $groups{ $iToAddto } }, @{ $groups{ $iToMo +ve } }; delete $groups{ $iToMove }; $changed++; last SIZE; } else { last if ++$wanted >= $size; redo; } } } unless( $changed ) { my $iToMove = shift @{ $sizes{ $bySize[ 0 ] } }; my $iToAddto = shift @{ $sizes{ $bySize[ 1 ] } }; push @{ $groups{ $iToAddto } }, @{ $groups{ $iToMove } }; delete $groups{ $iToMove }; } } return values %groups; } my @list = qw(1 1 1 2 3 4 4 4 5 5 6 6 6 7 7 8 8 8); my @lol = group( 6, @list ); print dump \@lol; __END__ C:\test>628986 [ [6, 6, 6], [3, 7, 7], [2, 5, 5], [8, 8, 8], [1, 1, 1], [4, 4, 4], ]

    It does not (yet) fare so well on all randomly generated sets, and I'm sure there is some fat that can be trimmed out/simplified, but I'm reluctant to spend more time on it if I've misunderstood your requirements?


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
      Thanks so much for your efforts! I'm sorry to have been too vague - I think that my lack of clarity was part of why I could not seem to come to it on my own.

      I needed things to be in order, because the point is to create ranges of options, where the user will be able to select "1-2", or "3-4", etc, in order.

Re: How to Group List Items?
by Joost (Canon) on Jul 26, 2007 at 19:58 UTC
    Your definition is incomplete, you must assign priorities.

    For instance, you want each group to have the same number of items and you only want max 6 groups. If you only have 1 group containing all items, that would satisfy the requirements, but you probably don't want that :-)

    You also don't specify if you want to allow the same number to be in more than one group, though it looks like you don't, from your example.

    Update: you also in other words: you don't specify how you want to score the returned groups. For instance, why don't you prefer (pseudocode):

    [ [1 1 1 2 3], [4 4 4 5 5], [6 6 6 7 7], [8 8 8] ]
      Sorry - I'm looking to have 6 groups, no fewer.

      I'd like the groups scored by how many items they contain different from the ideal (the lower, the better):

      my $ideal_avg_per_group = @list / 6; # your solution my $solution = [ [1 1 1 2 3], [4 4 4 5 5], [6 6 6 7 7], [8 8 8] ]; my $this_score = 0; for (@$solution) { my $num_items = @$_; $this_score += abs($ideal_avg_per_group - $num_items); }
      Also, it's fine to assume that there are more than 6 unique values in the list. Otherwise, it's obviously very easy.

      Thanks!

        I still do not see where you are defining the concept of $ideal_avg_per_group. You can look at Algorithm::Bucketizer to see if that gets you anywhere -- but without a better definition of the problem it is anyones guess what a solution really is.


        -Waswas
Re: How to Group List Items?
by GrandFather (Saint) on Jul 27, 2007 at 02:16 UTC

    Don't know about best but:

    use strict; use warnings; use constant kMaxGroups => 6; my @list = qw(1 1 1 2 3 4 4 4 5 5 6 6 6 7 7 8 8 8); my $best_score = @list; my $best_combo; my $avgMembers = @list / kMaxGroups; my $nominalMembers = $avgMembers; while (my $combo = get_combo ($nominalMembers++, \@list)) { my $score = get_score ($avgMembers, $combo); # easy if ($score < $best_score) { $best_score = $score; $best_combo = $combo; } } print "Best combo score is $best_score\n"; print "@$_\n" for @$best_combo; sub get_combo { my ($target, $list) = @_; return undef if $target > @$list; my %values; $values{$_}++ for @$list; my @groups = map {[$values{$_}, [$values{$_}, $_]]} sort keys %val +ues; my $lastGroups = 0; while (@groups != $lastGroups) { $lastGroups = @groups; for my $index (0 .. $#groups - 1) { my $grp0members = $groups[$index][0]; my $grp1members = $groups[$index + 1][0]; next if $grp0members + $grp1members > $target; # Combine adjacent groups my @newGroup = ( $grp0members + $grp1members, [@{$groups[$index][1]}, @{$groups[$index + 1][1]}], ); splice @groups, $index, 2, \@newGroup; last; } } for my $group (@groups) { my @members; push @members, ($group->[1][1 + 2 * $_]) x $group->[1][2 * $_] for 0 .. @{$group->[1]} / 2 - 1; $group = \@members; } return \@groups; } sub get_score { my ($target, $solution) = @_; my $this_score = 0; $this_score += abs ($target - @$_) for @$solution; return $this_score; }

    Prints:

    Best combo score is 3 1 1 1 2 3 4 4 4 5 5 6 6 6 7 7 8 8 8

    DWIM is Perl's answer to Gödel
      It's going to take a while for me to figure out, but it seems to work like a charm. You rock. Yahoo!
Re: How to Group List Items?
by Ouato (Novice) on Jul 27, 2007 at 08:18 UTC
    Your problem looks like 'classification' problem, but as someone said, details about how you want to classify your items are missing.

    However you try looking at Wikipedia - Nearest neighbor search
    and take a good look at 'Space partitioning'
    Take care of items with the sames values.. maybe you shoud 'group' them by default when partitionning your space. If that's not exactly what you wanted try searching 'Classification algorithm' in Google or Wikipedia.

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlquestion [id://628986]
Approved by almut
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others lurking in the Monastery: (4)
As of 2024-04-16 20:46 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found