JungleBoy has asked for the wisdom of the Perl Monks concerning the following question:

Over the weekend I participated in a puzzle hunt, and one of the tools we kept on using was an anagram solver. Since I already had a nice large wordlist saved away, I figured I could whip up a script to do this. Unfortunately my regex skills were not quite up to the task, so I only ended up with a partial solution. I briefly asked on the CB what I could do, but was told it was impossible to do with a regex. I came so close though, that I figure there's gotta be a way to do this.

Here's what I was able to come up with. It'll correctly find anagrams, but it will also come up with a lot of false positives. It'll match any words of the right length, and with the right letters, but the number of letters won't be right. For example, it'll incorrectly match abcde to aaaaa.

open (INFILE, '<wordlist');
my $word = lc($ARGV[0]);
my $size = length($word);
while ($foo = <INFILE>) {
    if (lc($foo) =~ /^[$word]{$size}$/o) {
        print "Match: " . $foo . "\n";
    }
}
Anyone got a simple and correct solution?

Re: Anagram matching
by Fastolfe (Vicar) on Nov 06, 2001 at 03:29 UTC
    A regex implementation:
    sub is_anagram {
        my ($word, $test) = @_;
        foreach my $letter (split(//, $word)) {
            # Remove one matching letter; \Q...\E keeps metacharacters literal.
            return unless $test =~ s/\Q$letter\E//;
        }
        return if length $test;    # leftovers
        return 1;
    }

    A regex isn't the most efficient method for approaching this (and this example is probably the least efficient of all), but it's easy to understand. Many of the optimizations people made in Difference Of Two Strings can probably be applied here.
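
One non-regex alternative is to tally the letters in a hash and then use them up one at a time. This is only a sketch and is not taken from the Difference Of Two Strings thread; the name is_anagram_by_count is hypothetical, and it assumes the comparison should be case-insensitive.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): true if $test uses exactly
# the same letters as $word, the same number of times each.
sub is_anagram_by_count {
    my ($word, $test) = @_;
    return 0 unless length($word) == length($test);

    my %count;
    $count{$_}++ for split //, lc $word;    # tally the letters of $word

    for my $letter (split //, lc $test) {
        # A letter that is missing or already used up means no match.
        return 0 unless $count{$letter};
        $count{$letter}--;
    }
    return 1;    # equal lengths and nothing overdrawn, so every count is zero
}

print is_anagram_by_count('listen', 'silent') ? "yes\n" : "no\n";
print is_anagram_by_count('abcde',  'aaaaa')  ? "yes\n" : "no\n";
```

Each word is walked once, so the cost grows with the word length rather than with repeated substitutions on the test string.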

Re: Anagram matching
by Masem (Monsignor) on Nov 06, 2001 at 03:29 UTC
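    my $base = join '', sort split //, lc $word;
    while ( my $foo = <INFILE> ) {
        chomp( my $candidate = lc $foo );    # drop the newline before sorting
        # Parenthesize the join so that eq compares the sorted letters, not lc $foo.
        print $foo if join( '', sort split //, $candidate ) eq $base;
    }
    Update: OK, so this doesn't even try to use regexes, but as others pointed out, regex matching of anagrams is nigh impossible.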

    -----------------------------------------------------
    Dr. Michael K. Neylon - mneylon-pm@masemware.com || "You've left the lens cap of your mind on again, Pinky" - The Brain
    "I can see my house from here!"
    It's not what you know, but knowing how to find it if you don't know that's important
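
If the same wordlist will be searched many times, the sorted-letters comparison above can be turned into a lookup table. This is a rough sketch under some assumptions: signature and build_index are hypothetical names, and the short in-memory list stands in for the real wordlist file.

```perl
use strict;
use warnings;

# Sorted, lowercased letters of a word: anagrams share the same signature.
sub signature {
    my ($word) = @_;
    return join '', sort split //, lc $word;
}

# Build a signature => [words] index from a list of words.
sub build_index {
    my @words = @_;
    my %index;
    for my $w (@words) {
        push @{ $index{ signature($w) } }, $w;
    }
    return \%index;
}

# Stand-in word list; a real script would read these from the wordlist file.
my @words = qw(listen silent enlist tinsel banana aaaaa abcde);
my $index = build_index(@words);

# Look up every word sharing the letters of 'inlets'.
my @found = @{ $index->{ signature('inlets') } || [] };
print "Match: $_\n" for @found;
```

Building the index sorts every word once; after that each query is a single sort and a hash lookup instead of a pass over the whole list.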

Re: Anagram matching
by merlyn (Sage) on Nov 06, 2001 at 03:19 UTC
      I've seen that question, and I even posted a reply to it. I feel that my question here is a little more simplified. I'm looking for a single regular expression that can do this match without having to split and re-assemble everything. It seems like it should be possible, but am I incorrect in that belief?
        It seems like it should be possible, but am I incorrect in that belief?
        My gut feeling is that requiring a solution to be in a single regular expression will necessarily make it both more complicated and slower.

        -- Randal L. Schwartz, Perl hacker
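
For what it's worth, a single pattern does seem to be possible if it is generated from the word rather than written by hand. The sketch below is one way it might look, not something anyone in the thread posted: anagram_regex is a hypothetical name. It emits one lookahead per distinct letter requiring at least that many occurrences, then pins the total length, which together force exact counts. It assumes words contain no newlines.

```perl
use strict;
use warnings;

# Hypothetical helper: build one anchored pattern that matches exactly the
# anagrams of $word (case-insensitively), using one lookahead per letter.
sub anagram_regex {
    my ($word) = @_;
    my %count;
    $count{$_}++ for split //, lc $word;

    my $pattern = '';
    for my $letter (sort keys %count) {
        # Require at least $count{$letter} occurrences of this letter.
        $pattern .= sprintf '(?=(?:.*%s){%d})', quotemeta($letter), $count{$letter};
    }

    # With the minimum counts met, fixing the length rules out any extras.
    my $len  = length $word;
    my $full = '\A' . $pattern . ".{$len}" . '\z';
    return qr/$full/i;
}

my $re = anagram_regex('abcde');
for my $candidate (qw(edcba aaaaa abcdf abcdee)) {
    printf "%-7s %s\n", $candidate, ($candidate =~ $re ? 'match' : 'no match');
}
```

The generated pattern grows with the number of distinct letters and rescans the candidate once per lookahead, which fits the expectation that a single-regex answer ends up more complicated and slower than sorting or counting.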