Matching words based on letter content

knirirr has asked for the wisdom of the Perl Monks concerning the following question:

Re: Matching words based on letter content by friedo (Prior) on Jan 28, 2005 at 14:55 UTC
What I would do is take each word, split up the letters, and sort them alphabetically. Then you have a hash key which will be the same for "top", "pot", and so on. Here is an example: `use strict; use Data::Dumper; my @words = qw/opt top pot pit tip/; my %count; foreach my $w(@words) { my $key = join '', sort split '', $w; $count{$key}++; } print Dumper \%count;` Output: `$VAR1 = { 'opt' => 3, 'ipt' => 2 };` Update: Added the D::D output.
Re^2: Matching words based on letter content by knirirr (Scribe) on Jan 28, 2005 at 15:22 UTC
Quoting friedo: "What I would do is take each word, split up the letters, and sort them alphabetically. Then you have a hash key which will be the same for 'top', 'pot', and so on. Here is an example." D'oh! It is rather simple when you think of it that way - thanks.
Re: Matching words based on letter content by holli (Abbot) on Jan 28, 2005 at 14:55 UTC
`use strict; my %h; @_ = qw (opt pot top pit ipt); for ( @_ ) { $h{join "", sort split "", $_}++; } for ( keys %h ) { print "$_ counted $h{$_} times\n"; }` holli, regexed monk
Re: Matching words based on letter content by Anonymous Monk on Jan 28, 2005 at 15:15 UTC
`#!/usr/bin/perl use strict; use warnings; my ($w, @w, %w); while (<DATA>) { chomp; @w = (0) x 26; $w[$_]++ for map -0x61 + ord lc, /[a-z]/ig; push @{$w{"@w"}}, $_; } print "@$w\n" while (undef, $w) = each %w; __DATA__ opt top pot pit stoop topos pit opt top pot stoop topos`
Re^2: Matching words based on letter content by wazoox (Prior) on Jan 28, 2005 at 16:54 UTC
mmmh, won't work with Unicode ;=)
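For what it's worth, here is a minimal sketch of one way around the 26-slot array, assuming it is acceptable to count characters in a hash instead. The helper name `letter_key` and the sample words are made up for illustration, and no Unicode normalization is attempted, so a precomposed "é" and an "e" followed by a combining accent would still produce different keys.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use utf8;                                   # the source contains UTF-8 literals
binmode STDOUT, ':encoding(UTF-8)';

# Hypothetical helper: build a key from per-character counts, so the
# alphabet is not limited to the 26 letters a-z.
sub letter_key {
    my ($word) = @_;
    my %count;
    # \p{L} matches any Unicode letter; everything else is ignored.
    $count{$_}++ for grep { /\p{L}/ } split //, lc $word;
    # Letter followed by its count, letters in code point order.
    return join '', map { "$_$count{$_}" } sort keys %count;
}

# Made-up sample words; "café" and "face" differ because é is not e.
my %groups;
push @{ $groups{ letter_key($_) } }, $_ for qw(été téé café face opt top);

print "$_: @{ $groups{$_} }\n" for sort keys %groups;
```

Like the array version, this keeps repeated letters in the key, so words with the same letters in different quantities stay in separate groups.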
Re: Matching words based on letter content by dragonchild (Archbishop) on Jan 28, 2005 at 15:00 UTC
This is anagrams done sideways. Sounds like homework to me. Something to consider - are "stoop" and "stop" considered the same? If they are, then the solutions by holli and friedo won't work. Being right, does not endow the right to be rude; politeness costs nothing. Being unknowing, is not the same as being stupid. Expressing a contrary opinion, whether to the individual or the group, is more often a sign of deeper thought than of cantankerous belligerence. Do not mistake your goals as the only goals; your opinion as the only opinion; your confidence as correctness. Saying you know better is not the same as explaining you know better.
Re^2: Matching words based on letter content by holli (Abbot) on Jan 28, 2005 at 15:15 UTC
With `@_ = qw (opt pot top stoop stop pit ipt);` it will print `opst counted 1 times oopst counted 1 times opt counted 3 times ipt counted 2 times` Update: you're right, I misread that. But this does it: `use strict; my %h; @_ = qw (opt pot top stoop stop pit ipt); for ( @_ ) { my $last; $h{join "", grep { if ( $_ eq $last ) { "" } else { $last = $_; $_ } } sort split "", $_}++; } for ( keys %h ) { print "$_ counted $h{$_} times\n"; }` prints: `opst counted 2 times opt counted 3 times ipt counted 2 times` holli, regexed monk
Re^2: Matching words based on letter content by knirirr (Scribe) on Jan 28, 2005 at 15:17 UTC
It is most definitely not homework. It is in order to list total base count in a load of amino acids I've got - it's too long unless I compress them by content and forget about the order of bases within the string. 'Stoop' and 'stop' are therefore different.
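Putting the thread together for that use case, a rough sketch might look like the following. It assumes each sequence is a plain string of single-letter codes. The helper names `compress_by_content` and `total_letter_counts` and the sample data are invented for illustration and are not taken from the original poster's program.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: count how many sequences share each letter
# composition. The key is the sequence's letters in sorted order, so
# order is ignored but repeats are kept ("stoop" and "stop" differ).
sub compress_by_content {
    my @sequences = @_;
    my %seen;
    $seen{ join '', sort split //, uc $_ }++ for @sequences;
    return \%seen;
}

# Hypothetical helper: total per-letter counts, computed from the
# compressed table rather than from the original list.
sub total_letter_counts {
    my ($seen) = @_;
    my %total;
    for my $key (keys %$seen) {
        # Each letter in the key occurs once per sequence with that key.
        $total{$_} += $seen->{$key} for split //, $key;
    }
    return \%total;
}

# Made-up sample data, purely for illustration.
my $seen  = compress_by_content(qw(ACGT TGCA AACG GACA ACG));
my $total = total_letter_counts($seen);

print "$_ x $seen->{$_}\n" for sort keys %$seen;
print "$_: $total->{$_}\n" for sort keys %$total;
```

The compressed table has one entry per distinct composition, which is what keeps the listing short when many sequences differ only in order.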