in reply to Re: Turning A Problem Upside Down
in thread Turning A Problem Upside Down

BrowserUk,
assuming I understood the rules

I was just rudely awoken on my day off by work (brain not yet engaged) but I don't think you have. That's probably my fault an not yours. The object of the game is to in fact find subsets of the given set of letters (which is probably why I was focused there). Let me adjust the rules to see if they make more sense and I will give an example as well:

Original: You are presented with 7 randomly selected characters (dups allowed). The object is to come up with as many words comprised of 3 or more of those characters within a certain time-frame.

Updated: You are presented with 7 randomly selected characters which may contain duplicates. The object is to come up with as many words at least 3 characters long that are comprised entirely of a subset of the given characters in a certain time-frame.

Given: apetpxl # The following are all acceptable (albeit not the entire list) ape tap tape apple lap # The following are not acceptable because they contain letters not in + given apples sex

Cheers - L~R

Replies are listed 'Best First'.
Re^3: Turning A Problem Upside Down
by BrowserUk (Patriarch) on Aug 22, 2009 at 17:00 UTC

    Ah! So, a final check is required to eliminate words that use too many of any given letter. Could probably be achieved more efficiently, but since it's only run on a very short list and short circuits...

    #! perl -slw use strict; use Data::Dump qw[ pp ]; sub finalCheck{ my( $poss, $given ) = @_; $given =~ s[$_][] or return for split '', $poss; return 1; } sub uniq{ my %x; @x{@_} = (); keys %x } my @words = do{ local *ARGV = ['words.txt']; <> }; chomp @words; @words = grep length() > 2, @words; my %index; @index{ 'a' .. 'z' } = map chr(0) x int( ( @words + 8 )/8 ), 1 .. 26; for my $iWords ( 0 .. $#words ) { for my $char ( sort uniq split '', $words[ $iWords ] ) { vec( $index{ $char }, $iWords, 1 ) = 1; } } while( chomp( my $given = <STDIN> ) ) { my @given = split '', $given; my @excludes = grep{ !(1+index $given, $_ ) } 'a'..'z'; my $mask = chr(0) x int( ( @words + 8 )/8 ); $mask |= $_ for @index{ @given }; $mask &= ~ $index{ $_ } for @excludes; vec( $mask, $_, 1 ) and finalCheck( $words[ $_ ], $given ) and print $words[ $_ ] for 0 .. $#words; print "\n\n"; } __END__ fred def fed red ref apetpxl ale alp ape apex apple applet apt ate axe axle eat eta exalt lap lappet late latex lax lea leap leapt lept lepta let pal pale pap pat pate pea peal peat pelt pep pet petal plat plate plea pleat tale tap tape tax tea teal

    BTW: Your example helped, but is badly chosen. The problem was never including words that contained characters not in the supplied list (your 's'), but rather words that contained too many of one or more of the given letters. Eg. the second 'e' in 'freed' given 'fred'.


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
      BrowserUk,
      I am sorry my example still wasn't optimal. The dictionary I was using is the Scrabble Tournament Words List (TWL) as referenced in this somewhat related post. It comes in at just under 179,000 words. The Word Twister game only has two possible input lengths but I only play the 7 length version as it is the max. Of course, when I took up the quest of turning the problem on its head, I had already written a solver that could produce all valid words 4 or 5 times in the allocated period. My quest was to go beyond that and see what I could come up with if the dictionary and length of original input weren't so low. Again, this was just an academic exercise in exploration and my solution still assumes that the dictionary is relatively static even if it could be much larger.

      Thanks again for your responses. I will review them tonight when I have peace and quiet.

      Cheers - L~R

        The nearest wordlist I could find to that was this one. It takes a couple of seconds longer to index than my 90,000 word list, but the time taken to locate the words hardly changes at all as the main oparations remains ORing & AND NOTing 26 bitstrings (essentially O(1)), and then the final filtering:

        C:\test>wc -l TWL06.txt 178691 TWL06.txt C:\test>790206.pl apetpxl ale alp alt ape apex app appel apple applet apt ate axe axel axle eat eta exalt expat lap lappet lat late latex lax lea leap leapt lept lepta let lex pal pale palet palp pap pat pate pax pea peal peat pelt pep pepla pet petal plat plate plea pleat plex tae tael tale tap tape tax tea teal tel tela tepa tepal Found 64 words in 0.23

        You don't say what the "certain time frame" is, but presumably as this is intended for the human brain rather than computer, it is in the order of minutes rather than sub 1 second? Even if I feed in the entire alphabet--which means it must select all the words via the bitstrings and then filter out all those that contain duplicates. ie. worse case--it only takes just over 8 seconds:

        C:\test>790206.pl >nul abcdefghijklmnopqrstuvwxyz Found 36409 words in 8.46

        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        "Science is about questioning the status quo. Questioning authority".
        In the absence of evidence, opinion is indistinguishable from prejudice.