Find anagrams

Replies are listed 'Best First'.
RE: Find anagrams by runrig (Abbot) on Sep 21, 2000 at 02:34 UTC
Note: Updated with jcwren's suggestion. Just felt like making it more compact:

```perl
use strict;

my %words;
chomp(my @words = map(lc, <DATA>));
@words{@words} = undef;
my %anagrams;
push @{$anagrams{join '', sort(split //, $_)}}, $_ for keys %words;
@$_ > 1 && print join(",", @$_), "\n" for values %anagrams;
__DATA__
and
Stain
satin
Not
Ton
one
```
(jcwren) RE: (2) Find anagrams by jcwren (Prior) on Sep 21, 2000 at 04:52 UTC
Here's my stab at it. This fixes the CR/LF problem (no one should have to care which machine or format a file was saved in, be it Windows, *nix, or Mac) and handles multiple words per line. Oh yeah, and no local variables declared! I have a gut feeling that it can be reduced further, but I can't see how.

```perl
#!/usr/local/bin/perl -w

use strict;

push @{$ARGV[0]{join '', sort split //}}, $_ for map {lc} map {split} <DATA>;
@$_ > 1 && print join(',', @$_), $/ for values %{$ARGV[0]};
__DATA__
and dna Stain satin Not in this life Ton one file
```

--Chris
(jcwren) RE: Find anagrams by jcwren (Prior) on Sep 21, 2000 at 00:07 UTC
From dictionary.com: "A word or phrase formed by reordering the letters of another word or phrase, such as satin to stain." In case you played with this and couldn't make it work: you have to feed it the word list, complete with anagrams. I.e., you can't supply 'stain' and have it find 'satin'. Rather, both words have to be present, and each word must be on a separate line. It will also print only one of the anagrams, not all, so you have no way of referencing them. I.e., if you have 'satin', 'stain', and 'naits' (it's not a word...), only 'naits' will be printed. I'm not sure exactly how that's useful... Some sample input would have been nice. --Chris
RE (tilly) 2: Find anagrams by tilly (Archbishop) on Sep 21, 2000 at 00:38 UTC
If it only prints one anagram then that is a bug in your version of Perl. It most certainly prints correctly on both Linux and Windows NT with Perl 5.005_03. As for input, throw your dictionary at it and see what you get...
(jcwren) RE: (2): Find anagrams by jcwren (Prior) on Sep 21, 2000 at 01:00 UTC
Well, it appears there are two problems. One is that it's O/S intolerant and doesn't handle files with CR/LF line endings. The other is that if a word occurs twice (like 'for'), it prints 'for for'. The first problem was caused by using NOTEPAD.EXE to save some text I cut and pasted from a website to test it. Nonetheless, example programs should have some sample input and output. After all, if it was a "bug in your version of Perl", how would I know what to expect if it worked? --Chris
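Both problems can be handled before any anagram grouping happens. The following is only a rough sketch, not the original program: `clean_words` is a hypothetical helper name, and it assumes one word per input line. It strips LF, CRLF, or bare CR endings, lower-cases each word, and drops repeats.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): strip trailing
# CR and/or LF characters, lower-case the word, and drop repeats
# while keeping first-seen order.
sub clean_words {
    my %seen;
    my @out;
    for my $line (@_) {
        (my $word = lc $line) =~ s/[\r\n]+\z//;
        next if $word eq '' or $seen{$word}++;
        push @out, $word;
    }
    return @out;
}

# 'for' appears twice with different case and line endings.
my @words = clean_words("for\r\n", "For\n", "satin\r\n", "stain\r");
print join(',', @words), "\n";    # prints "for,satin,stain"
```

Note that a plain `chomp` with the default `$/` removes only a trailing "\n", so a file saved with CR/LF endings would leave a stray "\r" on every word.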
RE (tilly) 3: Find anagrams by tilly (Archbishop) on Sep 21, 2000 at 02:50 UTC
RE (tilly) 1: Find anagrams by tilly (Archbishop) on Sep 21, 2000 at 00:00 UTC
I didn't realize that a lot of people don't know what an anagram is. Two words are anagrams if the letters in one are the letters in the other, modulo rearrangement. For instance, "tap" and "pat" are anagrams. This program finds all of them in the input, including duplicated words, and without worrying about case.
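To make "modulo rearrangement" concrete, here is a minimal sketch (the helper name `anagram_key` is made up for illustration, and this is not necessarily how the original program does it): lower-case a word and sort its letters, and two words are anagrams exactly when the results match.

```perl
use strict;
use warnings;

# Hypothetical helper: the lower-cased letters of a word in sorted order.
# Two words are anagrams of each other when their keys are equal.
sub anagram_key {
    my ($word) = @_;
    return join '', sort(split(//, lc $word));
}

print anagram_key('tap'), "\n";    # prints "apt"
print anagram_key('Pat'), "\n";    # prints "apt"
print anagram_key('tap') eq anagram_key('Pat') ? "anagrams\n" : "not anagrams\n";
```

Using such a key as a hash index groups every word with its anagrams in a single pass over the input.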
RE: Find anagrams by acid06 (Friar) on Sep 21, 2000 at 05:40 UTC
Well, it worked perfectly here... I tested it with many wordlists, and it worked just fine on all of them. There is indeed the duplicated-word problem, but I don't think anyone would build a wordlist with duplicated words.


	PerlMonks