Beefy Boxes and Bandwidth Generously Provided by pair Networks
XP is just a number
 
PerlMonks  

comment on

( [id://3333]=superdoc: print w/replies, xml ) Need Help??
This is a real challenge (at least for me).
Finding (multiple) words that are anagrams of a given phrase is much more complex than finding the simple anagrams. However, we can do something, at least to find up to three words corresponding to a given input.
pseudocode
-------------------------------------------------------- calculate inputphrase signature while (input available) read word skip if words letters not in inputphrase calculate signature and anagrams for each candidate word # first pass for each candidate word # second pass for each candidate word # third pass combine candidate signatures skip if different from inputphrase signature add to anagrams list print anagrams list --------------------------------------------------------
Example: given a list containing the following words
cere cheer come cree echo eeoc em he here herm home me mere moe re reme rhee rho
We could issue a command
perl mul_anagram.pl "come here" < words
and we would get this result:
[come] (here rhee) [echo] (mere reme) [eeoc] (herm) [home] (cere cree) [moe] (cheer) [re] (come) <he> [re] (echo) <em me> [rho] (emcee)
From which it is easy to pick "cheer Moe" or "mere echo" or "re: echo me"
Notice that the words have different parentheses: "[]" come from the first pass, "()" from the second and "<>" from the third one. It means that, to have your complete anagram, you should pick one word from each different parenthesis.
Using the same word list mentioned in the main node, I got these:
monk friar norm fakir marin fork rank of rim koran firm lithographic alight orphic goliath chirp high tropical oligarch pith pig haircloth perlmonks.net tromp kernel monk saint main knots mason knit Perl forever lover prefer reprover elf repel fervor Use Perl forever profuse reveler reefer overplus reprove refuels sleeper fervour wait for experience firepower exitance son of a gun snafu goon GNU on sofa Brother Tilly try their boll try other bill lit Tyrol herb
Enough chatting. Here's the code.
#!/usr/bin/perl -w use strict; my $phrase = shift || die "input phrase required\n"; my $outer_limit = shift || 1500; my $inner_limit = shift || 100; $phrase = lc $phrase; $phrase =~ tr/a-z//cd; # considers only alpha characters my @input_letters = split //, $phrase; my $signature = join "", sort @input_letters; my %words = (); my %compare_template; for (@input_letters) {$compare_template{$_}++}; INPUT: while (<>) { chomp; $_ = lc $_; my @letters = split //, $_; my $windex = join "", sort @letters; my %compare = %compare_template; for my $let (@letters) { next INPUT unless (exists $compare{$let}) # keeps only words made of and $compare{$let}--; # signature letters } if (exists $words{$windex}) { next if $words{$windex} =~ /\b$_\b/; $words{$windex} .= " "; } $words{$windex} .= $_; } my $items = scalar keys %words ; print STDERR "Considering $items items. "; if ($items > $outer_limit) { print "Too many candidates. It would take too long\n"; exit; } print STDERR @{[$items > $inner_limit ? "Only two" : "Three"]}, " passes\n"; my @candidates = keys %words; my @used = (); # stores the combination of words already found for my $first (0 .. $#candidates) { if ($signature eq $candidates[$first]) { print " [" . $words{$candidates[$first]} . "]\n"; push @used , [$first, -1,-1]; next } for my $second (0 .. $#candidates) { next if $second == $first; next if grep { (grep {$_ == $first} @$_) and (grep {$_ == $second} @$_)} @used; my $sign = join "", sort split //, $candidates[$first].$candidates[$second]; if ($sign eq $signature) { print " [" . $words{$candidates[$first]}. "] (" . $words{$candidates[$second]}. ") \n"; push @used, [$first, $second, -1]; next; } if ($items <= $inner_limit) { for my $third (0.. $#candidates) { next if $third == $second; next if grep { (grep {$_ == $first} @$_ ) and (grep {$_ == $second} @$_) and (grep {$_ == $third} @$_) } @used; my $sign = join "", sort split //, $candidates[$first] .$candidates[$second].$candidates[$third]; if ($sign eq $signature) { print " [" . $words{$candidates[$first]}. "] (" . $words{$candidates[$second]}. ") <" . $words{$candidates[$third]}. "> \n"; push @used, [$first, $second,$third]; next; } } } } }
Of course, this program is going to be much slower than the normal anagram script. For 2000 candidate words -- i.e. the words found to be composed of input-phrase letters only --, it could run (depending on your conputer's speed) for 10 minutes! That's why I put some safeguards. If the candidates are more than 2000 it won't start at all. If they are more than 100, it will limit to 2 passes. The number of iterations rises very fast, and therefore, unless you have a FAST computer with LOTS of memory (and time and patience on your side), don't change these defaults. Three passes a list of 100 candidate words result in 1 million iterations, 8 million for 200, and 27 million for 300
This is just a shot. I am sure that there is room for improvements. Maybe some saints in the Monastery could help ... :-)
Enjoy!
 _  _ _  _  
(_|| | |(_|><
 _|   

In reply to Multiple words anagrams (challenge) Re: Perl's pearls by gmax
in thread Perl's pearls by gmax

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post; it's "PerlMonks-approved HTML":



  • Are you posting in the right place? Check out Where do I post X? to know for sure.
  • Posts may use any of the Perl Monks Approved HTML tags. Currently these include the following:
    <code> <a> <b> <big> <blockquote> <br /> <dd> <dl> <dt> <em> <font> <h1> <h2> <h3> <h4> <h5> <h6> <hr /> <i> <li> <nbsp> <ol> <p> <small> <strike> <strong> <sub> <sup> <table> <td> <th> <tr> <tt> <u> <ul>
  • Snippets of code should be wrapped in <code> tags not <pre> tags. In fact, <pre> tags should generally be avoided. If they must be used, extreme care should be taken to ensure that their contents do not have long lines (<70 chars), in order to prevent horizontal scrolling (and possible janitor intervention).
  • Want more info? How to link or How to display code and escape characters are good places to start.
Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others examining the Monastery: (5)
As of 2024-03-29 12:15 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found