Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:
I am trying to extract the words common to two files, which are to be specified on the command line. The output should include the list of common words and a count of how many were found. Non-alphanumeric characters must be removed. Here is what I have so far, which is not working.
    my $f1 = shift;
    my $f2 = shift;

    if (! defined($f1) or ! defined($f2)) {
        die "Need two text file names as arguments. \n";
    }

    my %results;
    open my $file1, '<', $f1;
    while (my $line = <$file1>) {
        $line =~ s/[[:punct:]]//g;
        for my $word (split(/\s+/,$line)) {
            $word =~ s/[^A-Za-z0-9]//g;
            $results{lc $word} = 1;
        }
    }

    my @words2;
    my @storage;
    open my $file2, '<', $f2;
    while (my $line = <$file2>) {
        $line =~ s/[[:punct:]]/ /g;
        @words2 = grep { /\S/ } split(/ /,$line);
        for (my $i=0; $i<scalar @words2; $i++){
            $words2[$i] = lc($words2[$i]);
            $words2[$i] =~ s/[^A-Za-z0-9]//g;
            push(@storage, $words2[$i]);
            if (grep {$_ eq $words2[$i]} @storage[0..$#storage-1]){
                $results{$words2[$i]} = 1;
            }else{
                $results{$words2[$i]}++;
            }
        }
    }

    my $counter = 0;
    foreach my $words (sort { $results{$b} <=> $results{$a} } keys %results) {
        if ($results{$words} > 1){
            $counter = $counter+1;
            print $words, "\n\n" ;
        }
    }
    printf "Found %1.0f words in common\n", $counter;

Any help would be appreciated. Thank you.
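For comparison, here is a minimal sketch of one way the stated goal could be approached; it is not taken from any of the replies. As far as I can tell, the posted code resets a word's count to 1 whenever that word occurs more than once in the second file, so such a word is never reported even when the first file contains it. The sketch avoids counting altogether by building one set of distinct words per file and intersecting them. The sub names `words_in` and `common_words` are made up for illustration, and treating every character outside `A-Za-z0-9` as a word separator is an assumption about what "removing non-characters" should mean (so "don't" yields "don" and "t").

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return a hash ref whose keys are the distinct, lowercased words in a file.
# ASSUMPTION: any run of characters other than ASCII letters and digits
# separates words.
sub words_in {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open '$path': $!\n";
    my %seen;
    while (my $line = <$fh>) {
        # grep { length } drops the empty leading field that split
        # produces when a line starts with a separator.
        $seen{lc $_} = 1 for grep { length } split /[^A-Za-z0-9]+/, $line;
    }
    close $fh;
    return \%seen;
}

# Return the words present in both sets, sorted alphabetically.
sub common_words {
    my ($first, $second) = @_;
    my @common = grep { exists $second->{$_} } keys %$first;
    return sort @common;
}

if (@ARGV == 2) {
    my @common = common_words(words_in($ARGV[0]), words_in($ARGV[1]));
    print "$_\n" for @common;
    printf "Found %d words in common\n", scalar @common;
}
else {
    warn "Usage: $0 FILE1 FILE2\n";
}
```

Because each file is reduced to a hash of distinct words, repeated words cannot disturb the result, and checking the `open` calls reports a missing file instead of silently reading nothing.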
Replies are listed 'Best First'.
Re: Extracting common words
by Athanasius (Archbishop) on Oct 22, 2015 at 03:41 UTC
by AppleFritter (Vicar) on Oct 22, 2015 at 09:34 UTC
Re: Extracting common words
by GrandFather (Saint) on Oct 22, 2015 at 04:03 UTC
Re: Extracting common words
by Anonymous Monk on Oct 22, 2015 at 05:15 UTC
by Anonymous Monk on Oct 22, 2015 at 16:22 UTC
Re: Extracting common words
by Anonymous Monk on Oct 22, 2015 at 02:05 UTC
Re: Extracting common words
by Old_Gray_Bear (Bishop) on Oct 22, 2015 at 22:41 UTC
by BrowserUk (Patriarch) on Oct 22, 2015 at 23:08 UTC
by Anonymous Monk on Oct 22, 2015 at 23:39 UTC