Just another Perl shrine PerlMonks

### Pangram challenge: greed and Scrabble

by Athanasius (Archbishop)
 on Apr 24, 2022 at 14:43 UTC Need Help??

Task 2 of the current Perl Weekly Challenge is to generate a pangram from the English dictionary provided. “A pangram or holoalphabetic sentence is a sentence using every letter of a given alphabet at least once.”1 However, for this challenge the requirement that the words form a sentence is removed: a list of unrelated words is all that is required.

The bonus challenge is to “Constrain or optimize for something interesting,” and the first suggestion given is:

Shortest possible pangram (difficult)

Ok, so just how difficult would it be to perform an exhaustive search? I don’t know the answer, but since the search space increases exponentially as words are added to the dictionary, a search for sequences (of varying lengths) drawn from a pool of 23,610 words (see below) would appear highly impractical. I haven’t attempted it.

Before we proceed, it should be noted that the dictionary can be significantly reduced in size before the search begins. Where groups of words share the same set of letters, only one word from the group (the shortest) need be kept. For example, “abdicate,” “diabetic,” and “abdicated” all have the letter set {a, b, c, d, e, i, t}, so “abdicate” is retained but “diabetic” and “abdicated” are filtered out. This reduces the dictionary size from 39,172 to 23,610 — a saving of 15,562 words!

As noted above, 23,610 is still far too many words for an exhaustive search. So we’re looking for a way to get a close-to-optimum result in a reasonable time. And as we’ve just noted, each dictionary word can be considered as a set of its component letters. So the task is to find a minimum set cover, a well-studied problem in combinatorics and computer science.2 And there is an algorithm that is known to give good results in polynomial time: the greedy algorithm:

at each stage, choose the set that contains the largest number of uncovered elements.

My initial implementation of this algorithm produced the following pangram:

oversimplification ladybug hawk jazz equinox (40 letters)

I wasn’t satisfied with this, so began looking for ways to improve the search. It occurred to me that words with rare letters should be preferred over words with common letters. This immediately suggested the Scrabble3 points system:

```const my %ALPHABET =>
(
a => 1, b =>  3, c => 3, d => 2, e => 1,
f => 4, g =>  2, h => 4, i => 1, j => 8,
k => 5, l =>  1, m => 3, n => 1, o => 1,
p => 3, q => 10, r => 1, s => 1, t => 1,
u => 1, v =>  4, w => 4, x => 8, y => 4, z => 10
);
[download]```

With this I was able to give each word a weighting based on its component letters, and prefer words with higher weightings. This produced the following pangram:

sympathized quarterbacking reflexive jawbone (41 letters)

— no improvement at all! The problem appears to be that the selected words contain too much deadweight — too many redundant letters. So would it help to adjust the weighting for each word by deducting points for redundant letters? I experimented with different values of \$DUP_WEIGHT, the amount to deduct for each redundant letter. Here are the results:

\$DUP_WEIGHT Pangram Letter Count
1 sympathized quarterbacking jinx overflow 37
2 sympathized quacking overflow jinx be 33
3 sympathized quacking fox jaw verb all 32
4 sympathized unblock xv jaw qt frog 29

Eureka! At 29 letters, the last pangram is only 3 letters above the theoretical best-possible result of 26. (For values of \$DUP_WEIGHT above 4, the result does not change.)

It should be emphasized that my improved results are not an improvement on the greedy algorithm. That algorithm applies to sets, whereas my search for pangrams is a search over words, which are essentially bags because they contain multiple instances of the same letters.

For the implementation, I found the Set::Tiny module very useful. I highly recommend it for working with sets of strings.

Wikipedia1 gives the following as a short pangram:

Waltz, bad nymph, for quick jigs vex. (28 letters)

Since all these words occur in the given dictionary, it would be possible for a search to improve on my best solution by at least one letter. Challenge: can anyone implement a search which finds this pangram (or a shorter one) within a reasonable time?

Full code listings for my solution (in Perl and Raku) will shortly be are available on the Weekly Challenge’s GitHub repository at https://github.com/manwar/perlweeklychallenge-club/tree/master/challenge-161/athanasius.

Cheers,

1“Pangram, ” Wikipedia, https://en.wikipedia.org/wiki/Pangram
2“Set cover problem,” Wikipedia, https://en.wikipedia.org/wiki/Set_cover_problem
3 “Scrabble letter distributions,” Wikipedia, https://en.wikipedia.org/wiki/Scrabble_letter_distributions#English

 Athanasius <°(((>< contra mundum Iustus alius egestas vitae, eros Piratica,

Replies are listed 'Best First'.
Re: Pangram challenge: greed and Scrabble
by tybalt89 (Prior) on Apr 25, 2022 at 15:15 UTC

In playing around with various versions of this, I found that most 'good' solutions have both 'qt' and 'xv' in them, so this version starts with that. '@words' contains just the words with no duplicated letters. This code just picks those words from @words that have no previously used letters, and keeps adding words until it can't. Lines with 26 letters are declared WINNER!. The code at the end and the __DATA__ section are just for verifying that previously discovered lines are valid.

This code is finding a 26 every 5 to 10 minutes, roughly.

```#!/usr/bin/perl

use strict; # https://perlmonks.org/?node_id=11143246
use warnings;
use List::AllUtils qw( uniq uniq_by );

my @anagrams = uniq_by {join '', sort /./g}
do{ local (@ARGV, \$/) = 'dictionary.txt'; <> =~ /^\w+\$/gm };
my @words = grep ! /(.).*\1/, @anagrams;
print @anagrams . " anagrams and " . @words . " uniletter words\n";
my @best;
my \$done;
\$SIG{INT} = sub {\$done = 1};

#for my \$w ( map { (\$_) x 1e4 } grep /^qt\$/, grep /q/, @words )
for my \$w ( ('qt xv') x 1e7 )
{
my @new = @words;
my \$line = \$w;
while( 1 )
{
my @letters = uniq \$line =~ /\w/g;
my @new = grep ! /[@letters]/, @new; # ignore if existing letters
@new or last;
\$line .= " " . \$new[rand @new];
}
my @all = uniq \$line =~ /\w/g;
@all >= 26 and print "WINNER!!\n";
@all >= 25 and print @all . " \$line\n";
@all >= \$#best and \$best[@all] .= "\$line\n";
\$done and last;
}

print "\n\nBEST \$#best\n", \$best[-1];

print "\n";
for ( <DATA> )
{
my @dups = /(\w)(?=.*\1)/g;
printf "%-40s length: %2d  dups: (%s)\n", s/\n//r, tr/a-z//, "@dups"
+ ;
}

# these are found 26 letter sentences from various versions
__DATA__
qt jokes zinc flag rpm why bud xv
qt xv dumps chalk frenzy jig bow
qt xv grandly hump wicks job fez
qt xv bled frank cum why zip jogs
qt xv zoned brag whys flick jump
qt xv cranked jugs bop whiz m fly
qt xv spindly chow bark m jug fez
qt xv junk child prows bag my fez
qt xv furls peck dazing m why job
qt xv crumbly ponds hawk fez jig
qt xv lynched frog jump bask wiz
qt xv cods blink rpm why jug a fez
qt xv gambled whiz junk cops fry
qt xv brazens flip jug why dock m
qt xv frenzy jambs howl puck dig
qt xv hunks comb ply draw jig fez
qt xv dawns chum flog jerk by zip
qt xv folks whiz pend cram by jug
qt xv pucks lard hymn bow fez jig
qt xv block daring jumps fez why
qt xv fazing clerk dumps why job
qt xv dampen blocks jug whiz fry
qt xv cashed junk flog rpm by wiz
qt xv parked wiz ms bunch jog fly
qt xv drops clef jab wiz gym hunk
qt xv fled burn jack ms zip go why
qt xv golfed cramps junk whiz by
qt xv grazed flunk cs whip my job
qt xv spanked whiz fry m jog club
qt xv punks decry job flag m whiz
qt xv blocs germ fad zip why junk
qt xv cranked ms gulf zip job why
qt xv jambs pend lock hug wiz fry
qt xv chasm glib pod wry junk fez
qt xv warns hold fez jug pick by m
qt xv flunks gym carped whiz job
qt xv brazen flick jugs pod m why
qt xv gowns humbled jack zip fry
qt xv flocked spy barn jug whiz m
qt xv sunk wild parch fez jog m by
qt xv drugs why fleck an zip m job
qt xv zincs bug lymph dwarf joke
qt xv fly bronzed whacks m jug pi
qt xv don bumps chalk wry jig fez
qt xv job whacks fen zip lug m dry
qt xv gulped fry m snack whiz job
qt xv drench saw folk jug zip by m
qt xv barking clods jump why fez
qt xv pricks glazed job fun m why
qt xv dos whiz clap beg junk m fry
qt xv wizards fleck hymn jug bop
qt xv flesh crank pudgy job m wiz
qt xv darn ply hicks womb jug fez
qt xv land chimp brow jug fez sky
qt xv wrongly hacks jump bid fez
qt xv gulf nymphs racked job wiz
qt xv sky lamp birch down jug fez
qt xv hybrid jack plows fez m gnu
qt xv jar zip fleck buns why dog m
qt xv snugly whack drip m fez job
qt xv lawn hick jogs dumb fez pry
qt xv wrongly dish jack m pub fez
qt xv wiz blonds peck ham jug fry
qt xv molds whack zip fern jug by
qt xv draws jig hock fez bun m ply
qt xv blocks jump and fez why rig
qt xv workbench jams fly zip dug
qt xv block mind jaw spry fez hug
qt xv dam fleck shy wrung job zip
qt xv dingy hulks rpm fez caw job
qt xv drags clef knob jumpy whiz
qt xv whiz neck pro dabs fly m jug
qt xv blazing how decks fry jump
qt xv frenzy jump clog hawks bid
qt xv fogs crazed blip m why junk
qt xv chipmunks drawl jog by fez
qt xv when zip log ducks jamb fry
qt xv neck drags fly up whiz m job
qt xv bucks flared nymph wiz jog
qt xv bend flash rpm yuck wiz jog
qt xv drench bulks jaw fog my zip
qt xv branch dump low jig fez sky
qt xv gumdrops why clink jab fez
qt xv gyms brunched folk zip jaw
qt xv men grasp duck job whiz fly
qt xv calf shrew junk dog zip m by
[download]```

Sample Output:

```36523 anagrams and 9836 uniletter words
25 qt xv combed pranks flu why jig
25 qt xv hawks bog jumpy rind clef
25 qt xv duck wrens flap him jog by
25 qt xv gelds work jay bunch zip m
25 qt xv curbs flanked why m go zip
25 qt xv pens balmy frock whiz jug
25 qt xv chunks job pad grim we fly
25 qt xv creaks bond m whiz ply jug
25 qt xv doze brash puck wing m fly
25 qt xv jangled spry wiz bum hock
25 qt xv jumped flock rang whiz by
25 qt xv clasped frog hunk by m wiz
25 qt xv plumbed choking fry jaws
25 qt xv jibed chaps flunky m grow
25 qt xv wry hip flag junked cs mob
25 qt xv chinks daze plumb jog wry
25 qt xv forks paw lend chum jig by
25 qt xv dolphins jaw bug fez cry m
25 qt xv cashew bird flop gym junk
25 qt xv grands jump clef wiz by ho
25 qt xv blocks find gym her jaw up
WINNER!!
26 qt xv daze nymph flow bricks jug
25 qt xv jumble frock spading why
25 qt xv franks behold gym cup wiz
25 qt xv refunds gym blow hack zip
25 qt xv cob flask dine rpm why jug
25 qt xv fazing cords kelp why bum
25 qt xv shingled fry puck jaw mob
^C

BEST 26
qt xv daze nymph flow bricks jug

qt jokes zinc flag rpm why bud xv        length: 26  dups: ()
qt xv dumps chalk frenzy jig bow         length: 26  dups: ()
qt xv grandly hump wicks job fez         length: 26  dups: ()
qt xv bled frank cum why zip jogs        length: 26  dups: ()
qt xv zoned brag whys flick jump         length: 26  dups: ()
qt xv cranked jugs bop whiz m fly        length: 26  dups: ()
qt xv spindly chow bark m jug fez        length: 26  dups: ()
qt xv junk child prows bag my fez        length: 26  dups: ()
qt xv furls peck dazing m why job        length: 26  dups: ()
qt xv crumbly ponds hawk fez jig         length: 26  dups: ()
qt xv lynched frog jump bask wiz         length: 26  dups: ()
qt xv cods blink rpm why jug a fez       length: 26  dups: ()
qt xv gambled whiz junk cops fry         length: 26  dups: ()
qt xv brazens flip jug why dock m        length: 26  dups: ()
qt xv frenzy jambs howl puck dig         length: 26  dups: ()
qt xv hunks comb ply draw jig fez        length: 26  dups: ()
qt xv dawns chum flog jerk by zip        length: 26  dups: ()
qt xv folks whiz pend cram by jug        length: 26  dups: ()
qt xv pucks lard hymn bow fez jig        length: 26  dups: ()
qt xv block daring jumps fez why         length: 26  dups: ()
qt xv fazing clerk dumps why job         length: 26  dups: ()
qt xv dampen blocks jug whiz fry         length: 26  dups: ()
qt xv cashed junk flog rpm by wiz        length: 26  dups: ()
qt xv parked wiz ms bunch jog fly        length: 26  dups: ()
qt xv drops clef jab wiz gym hunk        length: 26  dups: ()
qt xv fled burn jack ms zip go why       length: 26  dups: ()
qt xv golfed cramps junk whiz by         length: 26  dups: ()
qt xv grazed flunk cs whip my job        length: 26  dups: ()
qt xv spanked whiz fry m jog club        length: 26  dups: ()
qt xv punks decry job flag m whiz        length: 26  dups: ()
qt xv blocs germ fad zip why junk        length: 26  dups: ()
qt xv cranked ms gulf zip job why        length: 26  dups: ()
qt xv jambs pend lock hug wiz fry        length: 26  dups: ()
qt xv chasm glib pod wry junk fez        length: 26  dups: ()
qt xv warns hold fez jug pick by m       length: 26  dups: ()
qt xv flunks gym carped whiz job         length: 26  dups: ()
qt xv brazen flick jugs pod m why        length: 26  dups: ()
qt xv gowns humbled jack zip fry         length: 26  dups: ()
qt xv flocked spy barn jug whiz m        length: 26  dups: ()
qt xv sunk wild parch fez jog m by       length: 26  dups: ()
qt xv drugs why fleck an zip m job       length: 26  dups: ()
qt xv zincs bug lymph dwarf joke         length: 26  dups: ()
qt xv fly bronzed whacks m jug pi        length: 26  dups: ()
qt xv don bumps chalk wry jig fez        length: 26  dups: ()
qt xv job whacks fen zip lug m dry       length: 26  dups: ()
qt xv gulped fry m snack whiz job        length: 26  dups: ()
qt xv drench saw folk jug zip by m       length: 26  dups: ()
qt xv barking clods jump why fez         length: 26  dups: ()
qt xv pricks glazed job fun m why        length: 26  dups: ()
qt xv dos whiz clap beg junk m fry       length: 26  dups: ()
qt xv wizards fleck hymn jug bop         length: 26  dups: ()
qt xv flesh crank pudgy job m wiz        length: 26  dups: ()
qt xv darn ply hicks womb jug fez        length: 26  dups: ()
qt xv land chimp brow jug fez sky        length: 26  dups: ()
qt xv wrongly hacks jump bid fez         length: 26  dups: ()
qt xv gulf nymphs racked job wiz         length: 26  dups: ()
qt xv sky lamp birch down jug fez        length: 26  dups: ()
qt xv hybrid jack plows fez m gnu        length: 26  dups: ()
qt xv jar zip fleck buns why dog m       length: 26  dups: ()
qt xv snugly whack drip m fez job        length: 26  dups: ()
qt xv lawn hick jogs dumb fez pry        length: 26  dups: ()
qt xv wrongly dish jack m pub fez        length: 26  dups: ()
qt xv wiz blonds peck ham jug fry        length: 26  dups: ()
qt xv molds whack zip fern jug by        length: 26  dups: ()
qt xv draws jig hock fez bun m ply       length: 26  dups: ()
qt xv blocks jump and fez why rig        length: 26  dups: ()
qt xv workbench jams fly zip dug         length: 26  dups: ()
qt xv block mind jaw spry fez hug        length: 26  dups: ()
qt xv dam fleck shy wrung job zip        length: 26  dups: ()
qt xv dingy hulks rpm fez caw job        length: 26  dups: ()
qt xv drags clef knob jumpy whiz         length: 26  dups: ()
qt xv whiz neck pro dabs fly m jug       length: 26  dups: ()
qt xv blazing how decks fry jump         length: 26  dups: ()
qt xv frenzy jump clog hawks bid         length: 26  dups: ()
qt xv fogs crazed blip m why junk        length: 26  dups: ()
qt xv chipmunks drawl jog by fez         length: 26  dups: ()
qt xv when zip log ducks jamb fry        length: 26  dups: ()
qt xv neck drags fly up whiz m job       length: 26  dups: ()
qt xv bucks flared nymph wiz jog         length: 26  dups: ()
qt xv bend flash rpm yuck wiz jog        length: 26  dups: ()
qt xv drench bulks jaw fog my zip        length: 26  dups: ()
qt xv branch dump low jig fez sky        length: 26  dups: ()
qt xv gumdrops why clink jab fez         length: 26  dups: ()
qt xv gyms brunched folk zip jaw         length: 26  dups: ()
qt xv men grasp duck job whiz fly        length: 26  dups: ()
qt xv calf shrew junk dog zip m by       length: 26  dups: ()
[download]```
Re: Pangram challenge: greed and Scrabble
by Limbic~Region (Chancellor) on May 12, 2022 at 18:45 UTC

Log In?
 Username: Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlmeditation [id://11143246]
Front-paged by haukex
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others imbibing at the Monastery: (3)
As of 2022-07-02 20:24 GMT
Sections?
Information?
Find Nodes?
Leftovers?
Voting Booth?
My most frequent journeys are powered by:

Results (103 votes). Check out past polls.

Notices?