comment on

This program, inspired by OT: How to find anagrams?, computes multi-word anagrams of a word or phrase. The program works by first creating a pool of possible words that could be made from the letters of the phrase. Then it enumerates over all possible combinations of word lengths, which corresponds to all the integer partitions of the phrase length. For each partition, it then tests all possible words of those particular lengths to see if they form an anagram of the phrase. For example, the phrase 'scalene' produces 'cleanse' and 'lee scan', among others. Output is one anagram per line.

#!/usr/bin/perl

# usage: anagram  'my phrase'  dictionary_file

use warnings;
use strict;
use Algorithm::Loops qw/ NestedLoops /;

my $phrase = shift;
my $dict = shift;

# collapse all letters into a sorted sequence
(my $sorted = $phrase) =~ s/[\W]//g;
$sorted = join '', sort split //, $sorted;
my $rx = qr|^[$sorted]+$|i;
my $phrase_length = length $sorted;

open DICT, "<$dict" or die "Could not open $dict: $!\n";
my @list;
my $num_words;
while (my $word = <DICT>) {
   chomp $word;
   next if length $word > $phrase_length;
   next unless $word =~ $rx;
   next if length $word == 1 && $word ne 'a' && $word ne 'i';
   push @{ $list[length $word] }, $word;
   $num_words++;
}
close DICT;
print "Number of words in pool: $num_words\n";
my @size;
$size[$_] = @{ $list[ $_ ] } for 1..$phrase_length;

# Enumerate over all integer partitions
my @p;
part( 2*$phrase_length, $phrase_length, 0);

### Subroutines

sub part
{
   my ($n, $k, $t) = @_;
   $p[$t] = $k;
   check_part( $t) if $n == $k;
   for (my $j = $k<$n-$k ? $k : $n-$k; $j >= 1; $j--) {
      part( $n-$k, $j, $t+1);
   }
}

my %ana_seen;
sub check_part
{
   my $t = shift;

   # create iterator for all combos of words
   my @iter = map { [0..$size[$p[$_]]-1] } 1..$t;
   my $iter = NestedLoops( \@iter);
   while( my @i = $iter->() ) {
      my $formatted = join " ", sort map {$list[ $p[$_] ][ $i[$_-1] ]}
+ 1..$t;
      next if $ana_seen{$formatted};
      $ana_seen{$formatted} = 1;
      my $test = join "", split / /, $formatted;
      $test = join '', sort split //, $test;
      print "$formatted\n" if $test eq $sorted;
   }
}
[download]

Update: cleaned up check_part() - removed debugging, filtered dups.

-Mark

In reply to Generate Multi-word Anagrams by kvale

Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!

Titles consisting of a single word are discouraged, and in most cases are disallowed outright.

Read Where should I post X? if you're not absolutely sure you're posting in the right place.

Please read these before you post! —

Posts may use any of the Perl Monks Approved HTML tags:

a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr

You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)

	For:		Use:
	&		`&`
	<		`<`
	>		`>`
	[		`[`
	]		`]`

Link using PerlMonks shortcuts! What shortcuts can I use for linking?

See Writeup Formatting Tips and other pages linked from there for more info.