pchou has asked for the wisdom of the Perl Monks concerning the following question:

I'm so sorry about the previous post. I pressed "submit" instead of "preview". Here it is again, hopefully better formatted.

I'm trying to write an automated cryptogram solver using perl. I am limiting this program to merely solving simple substitution cipers. For those of you who don't know, here is an example of a substitution ciper:

cryptogram:
DVYXUEQVKO IEAJVR: WEETOKVT UMSR XKQJ ZEL! DEOJ WKDT EBUJZ!
solution:
CRYPTOGRAM LOVERS: BOOKMARK THIS PAGE NOW! COME BACK OFTEN!

Thre is a 1:1 correspondence between the letters for the entire cryptogram.

My idea is to attack it brute force using /usr/dict/words to get potential words for each word in the cryptogram, then get a partial cipher alphabet for each word, and finally recurse through all the partial ciphers for one that matches the most (or mismatches the least with the others). If any of you think this is idea could be improved please let me know.

Anyway, the first part is for me to write a program that, when given a word, will list all possibilities.

For example, "ABCC" will yield "ball", "less", "tree", etc. I have code that will do this, but there is a bug. For example, when I give it "ABCA", it should give me the word "that", but it should not give me the word "noon", but it does. Can someone help me find the bug?

#!/usr/bin/perl -w use strict; use vars '@dict'; open(DICT, '</usr/dict/words') || die "/usr/dict/words: $!\n"; chomp(@dict = <DICT>); close DICT; sub match_crypto { my @pattern = split //, $_[0]; my (@letters,%slots,@matches); WORDS: foreach my $word (@dict) { next unless length $word == @pattern; @letters = split //, $word; %slots = (); for (my $i = 0; $i < @letters; ++$i) { next WORDS if $letters[$i] eq $pattern[$i]; if (exists $slots{$pattern[$i]}) { next WORDS unless $slots{$pattern[$i]} eq $letters[$i]; } else { $slots{$pattern[$i]} = $letters[$i]; } } push @matches, $word; } @matches; }

Replies are listed 'Best First'.
Re: Cryptogram Solver
by I0 (Priest) on Mar 23, 2001 at 06:28 UTC
    sub pattern{ my $slot = 'a'; my %slots = (); my $pattern = ""; for( split//, $_[0] ){ $pattern .= $slots{$_} ||= $slot++; } $pattern; } sub match_crypto { my $pattern = pattern($_[0]); my $len = length $pattern; my @matches = grep{length == $len && pattern($_) eq $pattern} @dict; }
Re: Cryptogram Solver
by mr.nick (Chaplain) on Mar 23, 2001 at 05:14 UTC
    Interesting :) Without looking at your code I tried writing something to solve this myself. Strangely enough, we did it almost the exact same way.

    The function takes the obvious parameters to it, a pattern and the word.

    sub match { my $pattern=shift; my $word=shift; ## have to be exact return unless length($pattern) == length($word); my %seen; my @pattern=split //,$pattern; my @word=split //,$word; for my $let (@word) { my $pat=shift @pattern; ## if this is a new pattern, make sure it doesn't ## match a previous letter if (!defined $seen{$pat}) { ## bad if this letter exists return if grep /$let/,join("",values %seen); ## it's OK, $seen{$pat}=$let; } else { ## must be the same return if $seen{$pat} ne $let; } ## tis good } 1; }
    Mine does not exhibit the same problem as yours, I think because I scan the pattern against the word, not the word against the pattern. Therefore, the UNIQUENESS of the testing is dependant on the pattern, not the word.

    any senselessness you detect is due to medication


    Oh, duh. I didn't realize what this solution could be used for :) Implementing a brute force decrypter using this would be simplistic, eh?
      First of all, I'm not a perl monk, so my coding style isn't quite as pretty as yours. I have to stare at stuff like:

      return if grep /$let/,join("",values %seen);

      for about 3 minutes before I say "ok, I know what you're doing, but I didn't know you could do that..." :)

      I did find the bug in my code before I saw your solution, but yours is obviously more elegant. I just might steal your version. :)

      As for using this as a brute force decrypter -- yes, it is easy in theory. But it takes me about 4 hours to code up the thing you just saw above. Writing the entire thing is not a trivial task for me.

      Can you give me some tips on how I should approach the rest of the problem? So far, my approach has been pretty unelegant. What's the best way to recurse through all the possible words for each codeword? I have written a program that generates a partial cipher alphabet and compares two cipher alphabets to see how many entries mismatch. I'm concerned about doing the recursive part in a dumb way and eating up a ton of memory resources unnecessarily.

      Thanks again for all your help guys.

Re: Cryptogram Solver
by buckaduck (Chaplain) on Mar 23, 2001 at 05:19 UTC
    You need an additional test:
    if (exists $slots{$pattern[$i]}) { next WORDS unless $slots{$pattern[$i]} eq $letters[$i]; } else { $slots{$pattern[$i]} = $letters[$i]; } foreach $key (keys %slots) next if ($key eq $pattern[$i]); # next WORDS if ($slots{$key} eq $letters[$i]); # <- HERE } #
    Update: fixed a code error or two...
    Update2: ...or three. But it seems to work now.

    buckaduck

Re: Cryptogram Solver
by lhoward (Vicar) on Mar 23, 2001 at 05:14 UTC
    Here's some source code (in C++) that solves simple substitution cyphers. The algorithm should be easy enough to adapt to perl. solver.C
Re: Cryptogram Solver
by 2501 (Pilgrim) on Mar 26, 2001 at 06:30 UTC
    Merlyn Has a sweet one built already. It is listed on his home node. Look for 'pat'.
    It is alot of fun.
Re: Cryptogram Solver
by Anonymous Monk on Aug 12, 2011 at 10:03 UTC
    1 2 3 4 5 6 7 8 3 9 3 7 10 5 5 4 11 4 7 7 12 2 13 , 14 3 15 4 9 8 3 14 16 7 16 9 8 13 , 14 3 15 4 12 2 17 . 3 17 8 12 7 5 16 7 10 2 18 3 18 12 14 18 12 14 10 4 ; 2 12 17 8 4 2 3 17 3 7 16 5 4 16 1 17 8 16 7 4 17 8 3 5 19 7 17 8 12 17 19 3 18 4 18 12 14 10 4 17 16 7 10 2 18 3 18 12 14 .