in reply to Re: Challenge: Letter Power
in thread Challenge: Letter Power

BrowserUk,
I really like this heuristic approach. Take a best guess and then see if it can be refined to get a better score. I am thinking about posting a "Challenge: Letter Power Revisited" thread so I have been considering how to generate puzzles at random. I was able to find an example where your solution does not provide the best solution in just 3 tries:

modems decamp saddens invokes paw odysseys fuddled absorbs delude eyes +ore cleat cloaks clincher sometime mating bushel shinnied probably fuckers lummo +xes uptown caseload hawker sleekest buzzard bartered songbird pilgrim steepen pithily bruiser shirking kin +ks hooking chalice town clutch mongrels frig pine fibre spindles doughy printing burned rendered warhorse spooked glumness solarium baseless roared steward lo +am carillon ratting inhabits examiner marshals rockiest jemmied zappy alcohols fixers refiners modeler outlook pastels spin prior paired gli +ders wags demean intend stalest streaker twinset equated negated quenched sheered accord shambles capably raffia page washer brownie hussar etches tethe +rs track lifts ricks knows giblets dozy pacify exhausts haulage squires bologna stacking __DATA__ 335 : [ eyesore sleekest bartered printing rendered marshals alcohols +streaker shambles exhausts ] 340 : [ eyesore sleekest steepen printing rendered marshals alcohols s +treaker shambles exhausts ] 374 : [ eyesore sleekest steepen spindles rendered marshals alcohols s +treaker shambles exhausts ] 402 : [ eyesore sleekest steepen spindles baseless marshals alcohols s +treaker shambles exhausts ] 424 : [ eyesore sleekest steepen spindles baseless marshals refiners s +treaker shambles exhausts ] 428 : [ eyesore sleekest steepen spindles baseless marshals pastels st +reaker shambles exhausts ] 434 : [ eyesore sleekest steepen spindles baseless marshals pastels sh +eered shambles exhausts ]
It can be show that a better solution exists (not necessarily best):
438 : [ eyesore sleekest steepen spindles baseless rockiest refiners s +heered tethers exhausts ]

My heuristic solution is very similar to yours. Rather than picking the best individual score from each group, I calculate the letter frequency across all words. Then I select the word from each group that has the highest score based on that frequency analysis. I have also modified the refinement loop to limit the number of loops to handle the edge case where picking the best individual scores is the worst thing to do.

LOOP: for (1 .. $max_loops) { # your refinement loop if ($better) { # update best next LOOP; } # end of your refinement loop last; }
This way you pick a "good enough" score in a reasonable time frame no matter what.

I am now going to try and figure out if I can do a smart exhaustive search to guarantee best solution in a reasonable runtime (less than 2 hours). If I can, I will post a new thread comparing naive brute force, optimized brute force, heuristic, optimized heuristic.

Cheers - L~R

Replies are listed 'Best First'.
Re^3: Challenge: Letter Power
by BrowserUk (Patriarch) on Apr 18, 2009 at 04:41 UTC
    It can be show that a better solution exists (not necessarily best):

    438 : [ eyesore sleekest steepen spindles baseless rockiest refiners sheered tethers exhausts ]

    Yep! That's the problem with heuristic solutions. Unless you also run a full brute force on the same data, you can never be sure that you haven't found some local maxima.

    On the dataset above, your 438 is better that my 434, but neither are as good as:

    441 : [ rendered spindles squires tethers rockiest eyesore sheered sleekest steepen refiners ]

    But even using all 4 cpus flat out, my brute force solution can just about manage 1/2 a million combinations/minute. Given there are 7.75e9 combinations to try, I'm projecting over 10 days to verify the heuristic for that one dataset. That's too much for that sake of an intellectual challenge with no useful purpose.


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
      BrowserUk,
      That's the problem with heuristic solutions.

      Heuristic solutions can be improved. For instance, the following modification to your refinement algorithm shows an even better solution:

      It finds the following in under a second and finishes off the work queue in about 17.

      446 : [ odysseys sleekest steepen spindles baseless marshals pastels s +talest shambles exhausts ]

      I am still looking for a trick given a specific dictionary (60K words) to be able to produce a guaranteed optimal solution in a reasonable run time (see How many words does it take? for an example of such a trick). In the interim I have been exploring a number of heuristic solutions and refinement algorithms.

      Your heuristic approach with my refinement is always able to tie or beat the "best" solutions of the other heuristic approaches. Unfortunately, in the cases when there is a tie, the other approaches reach the optimal solution faster. I am trying to figure out if there is a way to prune the work queue above so that it produces the same results faster.

      If you are interested in the approaches I am exploring (either for heuristic or guaranteed optimal) let me know. I am seldom interesting in activities with "useful purposes" so don't feel obliged to indulge me if you aren't interested :-)

      Cheers - L~R

        Heuristic solutions can be improved.

        You're still missing the point. You can only know when to stop looking for an improvement (either through changes to the algorithm or further iterations), if you have some target to aim for. Maybe 446 is the best for this particular dataset. Or maybe, there is another 2 point gain to be had? Or a 20 point gain? Or 200? Maybe there are hundreds of better solutions to this particular dataset. Without running an exhaustive search you have no way of judging how good any particular heuristic is for any given dataset.

        And whilst your modifications find better solutions for this dataset, maybe my original will work better for a different dataset? Or many different datasets. And without an exhaustive search of a representative sample of datasets, you've no way to compare the efficacy of different heuristics.


        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        "Science is about questioning the status quo. Questioning authority".
        In the absence of evidence, opinion is indistinguishable from prejudice.