ctfishman has asked for the wisdom of the Perl Monks concerning the following question:

I am attempting to use Perl to sort a group of sports teams using a number of criteria. The criteria are as follows:

1) Winning percentage (Highest to Lowest)
2) Number of Wins (Highest to Lowest)
3) Head to Head (if team A beat team B, then team A should be ahead of team B)
4) Strength of schedule rating (Highest to Lowest)
5) Number of wins of some specific various types
6) Tiebreaker Number (highest to lowest, a set value assigned to each team at the beginning of the program, is different for each team)

Using the built in sort allows me to do everything I want except deal with criteria 3 correctly. It is possible to have more then two teams with the same criteria 1 and 2. Those teams may or may not have played each other, so I could have:

teamA beat teamB
teamB beat teamC
teamC beat teamA

In this scenario I'd want to skip criteria 3, since no team has a clear advantage over all the others.

I could also have:

teamA beat teamB and teamC
teamB beat teamC

In that scenario criteria 3 needs to rank the teams as A, B, C

The most confusing possibility is:

teamA beat teamB and teamC
teamB beat teamC
teamD did not play teamA, teamB or teamC.

In that scenario I'd want the teams ranked by criteria 4, 5 and 6, but with the caveat that teamA has to be ahead of (but not necesarily directly next to) teamB and teamC, and teamB has to be ahead of teamC. So depending on criteria 4, 5 and 6, any of the following would be valid results:

teamA, teamB, teamC, teamD
teamA, teamD, teamB, teamC
teamA, teamB, teamD, teamC
teamD, teamA, teamB, teamC

So basically what I need to do is define an undertimined number of special rules that say "this team has to be ahead of (but not necessarily directly ahead of) this other team", where a specific "this team" or "this other team" could appear in more then one such rule.

Following is the current sort subroutine I'm using which doesn't handle all these scenarios:
sub rank_teams { if($wp{$a} < $wp{$b}) { return 1; } if($wp{$a} > $wp{$b}) { return -1; } if($wins{$a} > $wins{$b}) { $tiebreaker{$a} .= "Most Wins v. $b; "; re +turn -1; } if($wins{$a} < $wins{$b}) { $tiebreaker{$b} .= "Most Wins v. $a; "; re +turn 1; } if($wv{"a*b"} > $lv{"a*b"}) { $tiebreaker{$a} .= "Head-to-Head ($wv-$l +v) v. $b; "; return -1; } if($wv{"a*b"} < $lv{"a*b"}) { $tiebreaker{$b} .= "Head-to-Head ($lv-$w +v) v. $a; "; return 1; } if($sp{$a} > $sp{$b}) { $tiebreaker{$a} .= "Pts. v. $b; "; return -1; +} if($sp{$a} < $sp{$b}) { $tiebreaker{$b} .= "Pts. v. $a; "; return 1; } if($ll_wins{$a} > $ll_wins{$b}) { $tiebreaker{$a} .= "LL Wins ($ll +_wins{$a}-$ll_wins{$b}) v. $b; "; return -1; } if($ll_wins{$a} < $ll_wins{$b}) { $tiebreaker{$b} .= "LL Wins ($ll +_wins{$b}-$ll_wins{$a}) v. $a; "; return 1; } if($bbbDivision{$a} ne "LL") { if($l_wins{$a} > $l_wins{$b}) { $tiebreaker{$a} .= "L Wins ($l_win +s{$a}-$l_wins{$b}) v. $b; "; return -1; } if($l_wins{$a} < $l_wins{$b}) { $tiebreaker{$b} .= "L Wins ($l_win +s{$b}-$l_wins{$a}) v. $a; "; return 1; } } if($bbbDivision{$a} ne "LL"&& $bbbDivision{$a} ne "L") { if($m_wins{$a} > $m_wins{$b}) { $tiebreaker{$a} .= "M Wins ($m_win +s{$a}-$m_wins{$b}) v. $b; "; return -1; } if($m_wins{$a} < $m_wins{$b}) { $tiebreaker{$b} .= "M Wins ($m_win +s{$b}-$m_wins{$a}) v. $a; "; return 1; } } if($bbbDivision{$a} eq "S") { if($s_wins{$a} > $s_wins{$b}) { $tiebreaker{$a} .= "S Wins ($s_win +s{$a}-$s_wins{$b}) v. $b; "; return -1; } if($s_wins{$a} < $s_wins{$b}) { $tiebreaker{$b} .= "S Wins ($s_win +s{$b}-$s_wins{$a}) v. $a; "; return 1; } } if($tb_num{$a} > $tb_num{$b}) { $tiebreaker{$a} .= "By Lot v. $b; "; r +eturn -1; } if($tb_num{$a} < $tb_num{$b}) { $tiebreaker{$b} .= "By Lot v. $a; "; r +eturn 1; } return 1; }

Replies are listed 'Best First'.
Re: Adding Special Rules to Sort
by tilly (Archbishop) on Feb 28, 2009 at 17:38 UTC
    The only reasonable way to evaluate your head to head algorithm is to precompute which head to head comparisons are ambiguous, and which are not.

    The way to do that is to calculate the transitive closure of your head to head relation. Suppose you have a hash called %play_result whose keys are the winning teams and whose values are a hash of other teams played and the outcome of the game. Then you would compute the transitive closure as follows:

    # This will be our transitive closure. my %is_winning_chain_over; my @new_relations; # First populate with the winners for my $team (keys %play_result) { while (my ($other_team, $won) = each %{$play_result{$team}}) { if ($won) { $is_winning_chain_over{$team}{$other_team} = 1; push @new_relations, [$team, $other_team]; } } } # Now compute the closure while (@new_relations) { my $relation = shift @new_relations; my ($winner, $loser) = @$relation; my $winner_is_over = $is_winning_chain_over{$winner}; my $loser_is_over = $is_winning_chain_over{$loser}; for my $third_team (keys %$loser_is_over) { if (not $winner_is_over->{$third_team}) { $winner_is_over{$third_team} = 1; push @new_relations, [$winner, $third_team]; } } }
    And now in your comparison function you can check $play_result{$a}{$b} and not $is_winning_chain_over{$b}{$a} to see if $a should be ahead of $b by the head to head rule, and $play_result{$b}{$a} and not $is_winning_chain_over{$a}{$b} to see if $b should be ahead of $a. (And if neither is true, then go to conditions which are farther down.)

    Incidentally for many sports I am fond of the ELO rating system which is much simpler to calculate and takes into account things like whether you were playing hard teams.

Re: Adding Special Rules to Sort
by ELISHEVA (Prior) on Feb 28, 2009 at 17:00 UTC
    This isn't really a sorting problem, but a graph analysis problem. It may help to better understand this problem if you draw a diagram with circles representing the teams and arrows between the circles indicating who won (e.g. team to which arrow points is the winner).

    When you do this you will probably see a big spagetti mess, so try to rearrange the circles and lines so that any circle (team) that has only outgoing arrows (e.g. is always the looser) is on the edge of the graph.

    My guess is that you will quickly see that what you have is a network - that is a graph where there are nodes with multiple paths from the edge. You may also see circles within the graph - nodes where A beats C, C beats D, and D beats A.

    Sorting is basically an attempt to turn this mess - circles and all into a single straight line. There is no simple way to do it unless you set up some arbitrary rules about how you will deal with nodes that have multiple paths to the edge and nodes that are part of circles. The algorithm you come up with will also need to be especially careful to search for circles - otherwise you will find yourself in infinite loops trying to untangle this graph and convert it into a line.

    Writing the algorithm isn't terribly hard, but it will require either a recursive routine or a manually managed stack to keep track of where you are in the graph and what path you are currently following. If you have never written such an animal before be prepared to spend a lot of time debugging. Setting up some test cases and getting acquainted with Test::More (if you aren't already) may also be a good idea and save you some time.

    If I find something on the web that gives a step-by-step descriptions of how to write up these kinds of algorithms, I'll add it. But hopefully, someone else knows of such links and will add them before I find one.

    Best, beth

Re: Adding Special Rules to Sort
by Herkum (Parson) on Feb 28, 2009 at 15:23 UTC

    I have written this same sort of formula before and it is actually pretty easy as long as you avoid the working with the A vs B vs C headache.

    sub _team_standings { $b->get_wins_total <=> $a->get_wins_total or $b->get_wins_division <=> $a->get_wins_division or $b->get_wins_home <=> $a->get_wins_home or $b->get_run_difference <=> $a->get_run_difference or $b->get_runs_scored <=> $a->get_runs_scored or $a->get_reg_schedule_id <=> $b->get_reg_schedule_id }

    If you notice, I am sorting my teams all based upon some team stat except the last one. There I am using a schedule id as my tie breaker instead of a head-to-head comparison.

    The problem with the head-to-head comparison is when you have 3 or more teams tied together and you are trying to determine who would be ahead, it creates a catch-22 scenario. Example:

    A beat B B beat C C beat A

    If you are trying to use head-to-head comparison, there is no way to resolve it. You are basically down to using a coin flip or some other stat to resolve the issue.

    For the sake of simplicity, I would ignore using head-to-head comparisons and look at other values instead.

Re: Adding Special Rules to Sort
by hangon (Deacon) on Feb 28, 2009 at 22:28 UTC

    I don't have any more insight about a head-to-head algorithm, but here's a better way of dealing with an arbitrary number of rule sets. Each rule is defined in a separate subroutine and an array dispatch table is used to evaluate them sequentially. If a rule's result is indeterminate, the next rule is tried. The following code is not rigorously tested, but should give you the idea.

    my @teams = qw(mets yankees sox cubs braves cards); my @rules = ( \&wins, \&losses, \&headtohead, \&games, \&tiebreaker ); # games played, wins, losses, ties, tie breaker, teams defeated my %stats = ( braves => [20, 18, 2, 0, 1, {yankees => 1, cubs => 1, etc => 1} + ], yankees => [20, 18, 2, 0, 2, {braves => 1} ], mets => [20, 18, 2, 0, 3, {cubs => 1} ], sox => [20, 8, 12, 0, 4, {yankees => 1} ], cubs => [20, 8, 12, 0, 5, {cards => 1} ], cards => [20, 15, 5, 0, 6, {sox => 1} ] ); my @standing = sort{ my $cv; for (@rules){ $cv = $_->($a, $b); last if $cv; } return $cv; }@teams; print "@teams\n"; print "@standing\n"; ## The Rules: sub wins{ my ($a, $b) = @_; return $stats{$b}[1] <=> $stats{$a}[1]; } sub losses{ my ($a, $b) = @_; return $stats{$a}[2] <=> $stats{$b}[2]; } sub headtohead{ my ($a, $b) = @_; my $win = 0; $win-- if $stats{$a}[5]{$b}; $win++ if $stats{$b}[5]{$a}; return $win; } sub games{ my ($a, $b) = @_; return $stats{$a}[0] <=> $stats{$b}[0]; } sub tiebreaker{ my ($a, $b) = @_; return $stats{$a}[4] <=> $stats{$b}[4]; }
Re: Adding Special Rules to Sort
by zentara (Cardinal) on Feb 28, 2009 at 14:35 UTC
    See Tricky sorting of HOH.. There is a recipe for stringing together sorts, connected with || and or, to custom sort any HoH. Google for "perl sort HoH". Also look at the tutorial section here, and search for "sort", there are many recipes in there.

    I'm not really a human, but I play one on earth My Petition to the Great Cosmic Conciousness
      How on Earth is this meant to be an answer to the question that is being asked?

      It is clear to me that the OP knows very well how to sort in Perl, and how to do tricky sorts with complex combinations of criteria. The problem is how to do a sort based on the head to head criteria. Your answer does nothing to help with that problem.

        I hate to disagree

        One part to his multipart criteria..... Using the built in sort allows me to do everything I want except deal with criteria 3 correctly. ......i.e. head to head wins

        Now all he has to do is setup a hash key on who won head to head......the most creative way to do this, is up to debate.

        My answer pointed the way to sorting HoH, and yes, I did not specifically give code to do this, but I feel he can figure that much out, or hire someone who can. I will do it for a small fee.n ;-) Proper setup, and reliability testing is not some easy code, as the rest of the nodes show, there are alot of variables.....head to head wins in pre-season, regular season, post season, etc....alot of complications. Not to mention his additional sorting criteria.


        I'm not really a human, but I play one on earth My Petition to the Great Cosmic Conciousness