in reply to word association problem

I suppose the first thing I'd try would be to see whether both of these word lists fit in memory at the same time, because that makes things really easy:
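open(LIST, "wordlist.txt") or die "wordlist.txt: $!";
open(MSTR, "master.txt")   or die "master.txt: $!";
my @wordlist = map { chomp; $_ } <LIST>;
my @master   = map { chomp; $_ } <MSTR>;

# For each master word, list every wordlist entry that contains it.
# \Q...\E keeps any regex metacharacters in $word literal.
foreach my $word ( @master ) {
    print "$word:", join(",", grep(/\Q$word\E/, @wordlist)), "\n";
}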

update: oh yeah -- gotta use "chomp", not "chop".

Re: Re: word association problem
by Chmrr (Vicar) on Aug 06, 2002 at 03:58 UTC

    Looks like this will nearly do the trick, but not quite. The main problem is that, contrary to the details above, it will include "accumbering" in the "accumb:" line. Hence, we need to start from the longest master words and remove words from the wordlist as they match. We can also get a slight speedup by using index instead of a regex.

    #!/usr/bin/perl -w
    use strict;

    open(LIST, "wordlist.txt") or die "$!";
    my @wordlist = map { chomp; $_ } <LIST>;
    close LIST;

    open(MSTR, "master.txt") or die "$!";
    # Longest master words first, so they claim their matches before shorter ones.
    my @master = sort {length $b <=> length $a} map { chomp; $_ } <MSTR>;
    close MSTR;

    foreach my $word ( @master ) {
        my @matches;
        for (@wordlist) {
            next unless defined $_ and index($_, $word) >= 0;
            push @matches, $_;
            $_ = undef;    # claimed: later (shorter) master words skip it
        }
        $word = [$word, join(", ",@matches)];    # replace entry with [word, matches]
    }

    print map {"$_->[0]: $_->[1].\n"} sort {$a->[0] cmp $b->[0]} @master;
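
    For anyone who wants to try the longest-first idea without the two files, here is a minimal sketch of the same logic on inline data. The sample words and the sub name associate are made up for illustration; they are not from the original problem. It works on a copy of the word list, so the caller's array is left alone.

    ```perl
    #!/usr/bin/perl
    use strict;
    use warnings;

    # Hypothetical sample data standing in for master.txt and wordlist.txt.
    my @master   = qw(accumb accumber);
    my @wordlist = qw(accumb accumbs accumbering accumbered);

    # Assign each word to the longest master word it contains.
    # Returns a hash ref: master word => array ref of matched words.
    sub associate {
        my ($master, $words) = @_;
        my @pool = @$words;                 # copy, so the caller's list is untouched
        my %found;
        for my $key (sort { length $b <=> length $a } @$master) {
            $found{$key} = [];
            for my $w (@pool) {             # $w aliases the element in @pool
                next unless defined $w and index($w, $key) >= 0;
                push @{ $found{$key} }, $w;
                $w = undef;                 # claimed: shorter keys will skip it
            }
        }
        return \%found;
    }

    my $found = associate(\@master, \@wordlist);
    print "$_: ", join(", ", @{ $found->{$_} }), "\n" for sort keys %$found;
    ```

    With this sample data, "accumbering" and "accumbered" end up under "accumber:" only, and "accumb:" keeps just "accumb" and "accumbs".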

    perl -pe '"I lo*`+$^X$\"$]!$/"=~m%(.*)%s;$_=$1;y^`+*^e v^#$&V"+@( NO CARRIER'