in reply to Challenge: Letter Power Revisited
Please note that these are snippets and not full solutions.
This is actually BrowserUk's approach from the original challenge.
This is extremely fast but has the disadvantage of not considering the letter frequency in the other categories. Choose The Word From Each Category That Has The Highest Density Of The Most Frequent Letters Across All Candidates#! perl -slw use strict; use Time::HiRes qw[ time ]; my $start = time; my @lookup = map{ ($_+1) * $_ /2 } 0 .. 80; sub calcScore { my( %chars, $score ); $chars{ $_ }++ for map{ split'' } @{ $_[0] }; $score += $lookup[ $_ ] for values %chars; return $score; } my @cats = map{ [ split ' ' ] } <DATA>; close DATA; ## Sort the words in each category by their individual scores (descend +ing) my %wordScores; @{ $cats[ $_ ] } = sort{ ( $wordScores{ $b } ||= calcScore [$b] ) <=> ( $wordScores{ $a } ||= calcScore [$a] ) } @{ $cats[ $_ ] } for 0 .. $#cats; ## form our guess from the highest scoring words in each category my @guess = map{ $cats[ $_ ][ 0 ] } 0 .. $#cats; my $best = calcScore \@guess;
This was my first attempt and runs extremely fast. Unfortunately, it doesn't consider that only 1 choice may be made from each category so it is possible for 1 category to throw off the frequency distribution.
Choose The Word From Each Category That Has The Highest Density Of The Most Frequent Letters By Categorymy (@candidate, %freq); while (<DATA>) { chomp; my @list = split ' '; push @candidate, \@list; for my $word (@list) { ++$freq{$_} for split //, $word; } } my $tot = sum(values %freq); $freq{$_} = ($freq{$_} / $tot) * 100 for keys %freq; my (@solution, %let); for my $list (@candidate) { my ($best, $item) = (0, ''); for my $word (@$list) { my $score = sum(map $freq{$_}, split //, $word); ($best, $item) = ($score, $word) if $score > $best; } ++$let{$_} for split //, $item; push @solution, $item; } my $total = sum(map {my $n = $let{$_}; ($n * $n + $n) / 2} keys %let); print "$total : [ @solution ]\n";
This is actually a refinement of the previous algorithm. Originally, if the letter 'e' made up 67% of all letters it would be heavily favored. This approach considers that the majority of those may be coming from 1 category where we can only choose 1 word. Now the letter 'e' would have to score high in the majority of categories for it to be so heavily favored.
As you can tell this code is pretty messy. I was more interested in proof of concept to compare answers rather than how clean the code was.my (@candidate, %freq); while (<DATA>) { chomp; my @list = split ' '; push @candidate, \@list; for my $word (@list) { ++$freq{$#candidate}{$_} for split //, $word; } } for my $idx (keys %freq) { my $tot = sum(values %{$freq{$idx}}); for my $let (keys %{$freq{$idx}}) { $freq{$idx}{$let} = ($freq{$idx}{$let} / $tot) * 100; } } for my $let ('a' .. 'z') { my $val = 0; for my $idx (0 .. $#candidate) { $val += $freq{$idx}{$let} if $freq{$idx}{$let}; } $freq{$let} = $val; } my (@solution, %let); for my $list (@candidate) { my ($best, $item) = (0, ''); for my $word (@$list) { my $score = sum(map $freq{$_}, split //, $word); ($best, $item) = ($score, $word) if $score > $best; } ++$let{$_} for split //, $item; push @solution, $item; } my $best = sum(map {my $n = $let{$_}; ($n * $n + $n) / 2} keys %let); print "$best : [ @solution ]\n";
This too is BrowserUk's approach from the original challenge. The basic idea is to see if changing the guess improves the score iteratively through each category. If yes, start over at the beginning.
This did the job for the original puzzle presented by Tanktalus. I felt there were a couple of things that could be changed for a general purpose solution. The first is that there is no escape clause if improvements are still being found. If an extreme edge case is used as input, then it is possible that the initial guess is far from the mark and it may as well be doing an exhaustive search anyway. The second thing is that the loop starts over if an improvement is found in the 3rd category without checking the rest. It is possible that leaving category 3 alone and changing category 4 would lead to an even better score. Explore All 1-Category Change Improvements (DFS With Escape Clause)## Loop over the categories swaping the other words in that cateory ## for the guessed word. If we get a better score, start again. LOOP: for my $iCat ( 0 .. $#cats ) { my $cat = $cats[ $iCat ]; for my $iWord ( 1 .. $#$cat ) { my $test = calcScore [ @guess[ 0 .. $iCat -1 ], $cat->[ $iWord ], @guess[ $iCat+1 .. $#guess ] ]; if( $test > $best ) { $best = $test; $guess[ $iCat ] = $cat->[ $iWord ]; print "$best : [ @guess ]"; redo LOOP; } } }
I extended BrowserUk's idea. Instead of starting over when an improvement is found, the improvement is pushed on to a work queue and the next category is checked. This effectively performs a depth-first search. Since this is even more likely to turn into an exhaustive search, I implemented two different escape clauses. The first limits the total number of iterations and the second limits the total run time. I will present the latter since it won't arbitrarily help or hinder results on platforms with more or less horse power.
my %seen; eval { local $SIG{ALRM} = sub { die "Timed Out\n"; }; alarm 60; my @work = [$best, \@solution]; while (@work) { my $item = pop @work; my ($cur_best, $cur_guess) = @$item; for my $idx (0 .. $#candidate) { for my $word (@{$candidate[$idx]}) { my @new_guess = @$cur_guess; $new_guess[$idx] = $word; next if $seen{"@new_guess"}++; my $new_tot = calc_score(@new_guess); if ($new_tot > $cur_best) { push @work, [$new_tot, \@new_guess]; if ($new_tot > $best) { $best = $new_tot; print "$best : [ @new_guess ]\n"; } } } } } alarm 0; };
Well, I hope someone finds this challenge interesting and presents a solution that dazzles and amazes. If you are looking for even more inspiration, there are several nodes in the original challenge worth looking at.
Cheers - L~R
|
|---|