PerlMonks  

Find anagrams

by tilly (Archbishop)
on Sep 20, 2000 at 23:27 UTC ( [id://33383]=CUFP )

Takes input in the form of a list with one word per line, then finds and prints all of the anagrams. I originally wrote this on a challenge from a co-worker. When I gave him a solution in under 5 minutes, he gained a lot of respect for Perl.

#! /usr/bin/perl -w
use strict;

my @words = <>;
chomp(@words);
&find_anagrams(@words);

# Takes a list of words, finds all anagrams in it
sub find_anagrams {
  my %anagrams;
  foreach my $word (@_) {
    my $key = join '', sort(split(//, lc($word)));
    push @{$anagrams{$key}}, $word;
  }
  foreach my $lst (values %anagrams) {
    if (1 < @$lst) {
      print "@$lst\n";
    }
  }
}

Replies are listed 'Best First'.
RE: Find anagrams
by runrig (Abbot) on Sep 21, 2000 at 02:34 UTC
    Note: Updated with jcwren's suggestion.

    Just felt like making it more compact:
    use strict;
    my %words;
    chomp(my @words = map(lc, <DATA>));
    @words{@words}=undef;
    my %anagrams;
    push @{$anagrams{join '', sort(split //, $_)}}, $_ for keys %words;
    @$_>1 && print join(",",@$_),"\n" for values %anagrams;
    __DATA__
    and
    Stain
    satin
    Not
    Ton
    one

      Here's my stab at it. This fixes the CR/LF problem (no one should have to care what format a file was saved in, or on what machine, be it Windows, *nix, or Mac) and handles multiple words per line. Oh yeah, and no local variables declared!

      I have this gut feeling that it can be reduced further, but I can't find it.
      #!/usr/local/bin/perl -w
      use strict;

      push @{$ARGV[0] {join '', sort split //}}, $_ for map {lc} map {split} <DATA>;
      @$_>1 && print join (',', @$_), $/ for values %{$ARGV[0]};
      __DATA__
      and dna Stain satin Not in this life Ton one file
      --Chris

      e-mail jcwren
(jcwren) RE: Find anagrams
by jcwren (Prior) on Sep 21, 2000 at 00:07 UTC
    From dictionary.com: A word or phrase formed by reordering the letters of another word or phrase, such as satin to stain.

    In case you played with this, and couldn't make it work, you have to feed it the word list, complete with anagrams. I.e., you can't supply 'stain', and have it find 'satin'. Rather, both words have to be present, and each word must be on a separate line.

    It will also only print one of the anagrams, not all, so you have no way of referencing them. I.e., if you have 'satin', 'stain', and 'naits' (it's not a word...), only 'naits' will be printed. I'm not sure how exactly that's useful...

    Some sample input would have been nice.

    --Chris

    e-mail jcwren
      If it only prints one anagram then that is a bug in your version of Perl. It most certainly prints correctly on both Linux and Windows NT with Perl 5.005_03.

      As for input, throw your dictionary at it and see what you get...

        Well, it appears there are two problems. One is that it's O/S intolerant, and doesn't handle files with CR/LF. The other is that if a word occurs twice (like 'for') it prints 'for for'.

        The first problem was caused by using NOTEPAD.EXE to save some text I cut and pasted from a website to test it.

        Nonetheless, example programs *should* have some sample input and output. After all, if it was a "bug in your version of Perl", how would I know what to expect if it worked?

        --Chris

        e-mail jcwren
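
        For illustration, here is one way both problems described above might be handled, together with the kind of sample input and output that was asked for. This is only a sketch, not the posted program: the sub name group_anagrams and the sample words are made up for the example, and it assumes duplicates should be ignored regardless of case.

```perl
use strict;
use warnings;

# Hypothetical helper: group words by sorted-letter key, tolerating
# CR/LF line endings and skipping case-insensitive duplicates.
sub group_anagrams {
    my @lines = @_;
    my (%seen, %anagrams);
    foreach my $line (@lines) {
        (my $word = $line) =~ s/[\r\n]+\z//;  # strip a trailing LF, CR/LF, or CR
        next if $word eq '';                  # skip blank lines
        next if $seen{lc $word}++;            # skip repeats such as a second 'for'
        my $key = join '', sort(split(//, lc($word)));
        push @{$anagrams{$key}}, $word;
    }
    # Keep only the groups that have more than one member.
    return grep { @$_ > 1 } values %anagrams;
}

# Made-up sample input, as it might arrive from a file saved with CR/LF endings.
my @sample = ("stain\r\n", "satin\r\n", "for\r\n", "for\r\n", "one\r\n");
print "@$_\n" for group_anagrams(@sample);   # prints "stain satin"
```

        With this input the repeated 'for' is counted once, so it no longer shows up as 'for for', and the trailing carriage returns do not end up in the keys.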
RE (tilly) 1: Find anagrams
by tilly (Archbishop) on Sep 21, 2000 at 00:00 UTC
    I didn't realize that a lot of people don't know what an anagram is.

    Two words are anagrams if the letters in the one are the letters in the other modulo rearrangement. For instance "tap" and "pat" are anagrams.

    This program finds all of them in the input, including duplicated words, and not worrying about case.
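
    A minimal sketch of that definition, using the same key construction as the posted program. The helper names anagram_key and are_anagrams are hypothetical and do not appear in the original code.

```perl
use strict;
use warnings;

# Hypothetical helper: canonical key for a word, built the same way
# as $key in the posted program (lowercase, split, sort, rejoin).
sub anagram_key {
    my ($word) = @_;
    return join '', sort(split(//, lc($word)));
}

# Two words are anagrams when their keys are equal, regardless of case.
sub are_anagrams {
    my ($first, $second) = @_;
    return anagram_key($first) eq anagram_key($second);
}

my $key = anagram_key("Stain");
print "$key\n";                                   # prints "ainst"

my $verdict = are_anagrams("tap", "pat") ? "yes" : "no";
print "$verdict\n";                               # prints "yes"
```

    Because a word always has the same key as itself, a word that appears twice in the input is grouped with its own duplicate, which is why repeated words are reported too.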

RE: Find anagrams
by acid06 (Friar) on Sep 21, 2000 at 05:40 UTC
    Well, it worked perfectly here... I tested it with many wordlists and it worked just fine with all of them. There is indeed the duplicated word problem, but I don't think anyone would build a wordlist with duplicated words.
