The following needs a great deal of polish. I took ikegami's basic regexp as a superset of the words you're looking for. Then I horribly misused it without all the safety that ikegami was right to include (qr// syntax, etc.).

I also ended up turning your semantics inside out: the dictionary file is now given on the command line and the starting word is hard-coded; sorry about that.

The guts of this solution involve counting letters in a regexp:

@matches = ($input =~ m/ (.{1}?) (?{ $chars{$^N}++; }) /xg);

produces a hash, %chars, that counts the frequency of letters in each word in ikegami's set.
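For anyone who hasn't met (?{ ... }) and $^N before, here is a minimal standalone sketch of that counting step. The sub name count_letters is made up for illustration, and I've used a package variable for %chars because older perls were touchy about lexicals inside regexp code blocks; treat it as a sketch, not as the script itself.

```perl
use strict;
use warnings;

# Package variable so the (?{ ... }) block does not close over a lexical.
our %chars;

# Hypothetical helper (not in the original script): count every character
# matched by "." using an embedded code block.
sub count_letters {
    my ($input) = @_;
    local %chars;                                     # start from an empty count
    () = $input =~ m/ (.) (?{ $chars{$^N}++ }) /xg;   # list context: walk all matches
    return { %chars };                                # copy before local is undone
}

my $counts = count_letters("eels\n");
print join(' ', map { "$_=$counts->{$_}" } sort keys %$counts), "\n";
```

Since "." does not match a newline, the line ending that comes in from the file never shows up as a key.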

Earlier, I build a hash, %wordCharCount, using the more compact

map {$wordCharCount{$_}++} (split(//,$word));

that does basically the same thing on the original word. This doesn't work inside the while (<>), I think because map reuses $_.
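A hedged aside on that guess: map localizes $_ for its block and restores it afterwards, so the outer $_ should survive. Another possible culprit is the trailing newline: a line read with <> still ends in "\n", and split(//) turns that into a key of its own, whereas "." in the regexp skips it. A small sketch under that assumption, with count_chars as a hypothetical helper name:

```perl
use strict;
use warnings;

# Hypothetical helper, not part of the original script.
sub count_chars {
    my ($line) = @_;
    chomp $line;                              # drop the "\n" that <> leaves on each line
    my %count;
    map { $count{$_}++ } split(//, $line);    # map localizes $_ for its block
    return \%count;
}

my @lines = ("elk\n", "eels\n");
for (@lines) {                                # $_ aliases each line, much like while (<>)
    my $count   = count_chars($_);
    my $summary = join ' ', map { "$_=$count->{$_}" } sort keys %$count;
    print "$summary <- $_";                   # the outer $_ is intact, newline and all
}
```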

Then I loop through the keys of the smaller hash (%chars) and make sure that no letter has been used more times than it appears in the original word. If the candidate is still good, I print it as a hit.
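The same check can be sketched as a subroutine that returns as soon as one letter is overused, which does away with the running boolean. The names letter_counts and fits_bank are illustrative only, and the // operator assumes perl 5.10 or later.

```perl
use strict;
use warnings;

# Hypothetical helper: letter => number of occurrences in $str.
sub letter_counts {
    my ($str) = @_;
    my %count;
    $count{$_}++ for split //, $str;
    return \%count;
}

# True if every letter of $candidate occurs no more often than in the bank.
sub fits_bank {
    my ($bank, $candidate) = @_;
    my $need = letter_counts($candidate);
    for my $letter (keys %$need) {
        return 0 if $need->{$letter} > ($bank->{$letter} // 0);
    }
    return 1;
}

my $bank = letter_counts('perlmonks');
for my $candidate (qw(lemons sloop)) {
    printf "%s: %s\n", $candidate, fits_bank($bank, $candidate) ? 'hit' : 'miss';
}
```

Here "lemons" uses each letter once and passes, while "sloop" needs two o's and the bank only has one.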

Apologies to ikegami for marring his elegant solution with this horrid hack. I'm sure there are more careful ways to handle the input file and regexp and (gah) the booleans.
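As one possible take on the "more careful" regexp handling, here is a sketch of the candidate filter built with quotemeta and qr//. It is my own variation, not ikegami's code: it uses + instead of * so that empty lines are rejected, and \z so that only chomped words match.

```perl
use strict;
use warnings;

my $word = 'perlmonks';

# quotemeta escapes anything that would be special inside [...],
# and qr// compiles the pattern once instead of on every line.
my $class     = quotemeta $word;
my $candidate = qr/^[$class]+\z/;

for my $line ("monk\n", "monkey\n", "\n") {
    chomp(my $w = $line);                 # work on a copy without the newline
    print "$w\n" if $w =~ $candidate;     # only "monk" gets through
}
```

This is only the superset filter; the per-letter frequency check still has to run on whatever passes it.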

use strict;
use warnings;
use Data::Dumper;

my $word='perlmonks';
my %wordCharCount;
my @matches;
my $input;
my %chars;
my $boolean=1;
my $k;

map {$wordCharCount{$_}++} (split(//,$word));

while (<>)
{
    $input = $_;
    if (/^[$word]*$/)
    {
        #print "candidate: $_";
        @matches = ($input =~ m/ (.{1}?) (?{ $chars{$^N}++; }) /xg);
        foreach $k (keys %chars)
        {
            $boolean = ($wordCharCount{$k} >= $chars{$k}?1:0) && $boolean;
            # print "$wordCharCount{$k} <=> $chars{$k} yields $boolean\n";
        }
        print if $boolean;
        #print "hit: $_" if $boolean;
        undef %chars;
        $boolean=1;
    }
}

I named this lbu.pl for "letter bank unique", which, I think, is the type of puzzle this is. My word list is words.knu, and here are the results:

C:\chas_sandbox>lbu.pl WORDS.KNU
e el elk elks elm elms els em en enol eon eons k kelp ken keno kern kerns l lemon lemons lens les lo lone loner loners lop lope loper lops lore lorn lose loser me melon melons men mens mer meson ml mole moles monel moner monk monks mop mope moper mopes mops more morel mores morn morsel n ne neo no noes nope nor norm norms nose o om omen omens omer on one ones ons op open opens or ore ores ors pel pels pen pens peon per perk perm person peso poem poems poke poker pokes pol pole poles pome pon pons pore pores pork porn pose poser pre pro prole prom prone pros prose r re rep reps roe roes role roles romp romps rope ropes rose s sen senor ser sermon skelp skep sloe slop slope sloper smoke smoker snore snorkel so soke sole solemn soln some son sone sop sore sperm splore spoke spoken spore


I humbly seek wisdom.

In reply to Re^3: Simple regex wordlist question by goibhniu
in thread Simple regex wordlist question by escherist
