In this example, I have "slurped" all the words and all the sentences into memory. This follows from two requirements: the matching sentences should be shown for each word, and each sentence could belong to more than one word.

    #!/usr/bin/perl -w
    use strict;

    open WORDS, "< words" or die "can't open words: $!";
    my (@words, @sentences);
    while (<WORDS>) {
        chomp;
        push @words, $_;
    }
    close WORDS;

    open SENTENCES, "< sentences" or die "can't open sentences: $!";
    push @sentences, $_ while <SENTENCES>;
    close SENTENCES;

    for my $word (@words) {
        # sentences keep their trailing newline, so no "\n" is added here
        my @found = grep /\b$word\b/, @sentences;
        if (@found) {
            print $word, ": \n\t", join "\t", @found;
        }
    }
    __END__
    contents of file "words"
    -------------------------------
    first
    second
    third
    fourth
    -------------------------------
    contents of file "sentences"
    -------------------------------
    I am the first
    I always wanted to be the first
    I never liked to be second
    I second your request
    I will never appear in the output
    Better second than third
    -------------------------------
    program's output
    -------------------------------
    first: 
            I am the first
            I always wanted to be the first
    second: 
            I never liked to be second
            I second your request
            Better second than third
    third: 
            Better second than third
In this second script, as an additional measure, I stored each word together with a regular expression precompiled by the qr operator. The inner loop tries every word against every sentence, so without qr the pattern would be recompiled each time the interpolated word changes. With qr, the regex for each word is compiled only once, which should make the program noticeably faster on large inputs.

    #!/usr/bin/perl -w
    use strict;

    open WORDS, "< words" or die "can't open words: $!";
    my (@words, @sentences);
    while (<WORDS>) {
        chomp;
        # keep the plain word and its precompiled regex side by side
        push @words, [$_, qr/\b$_\b/];
    }
    close WORDS;

    open SENTENCES, "< sentences" or die "can't open sentences: $!";
    while (<SENTENCES>) {
        my $printed = 0;
        for my $word (@words) {
            if (/$word->[1]/) {
                print $_ unless $printed++;   # print the sentence only once
                print "\t", $word->[0];
            }
        }
        print "\n" if $printed;
    }
    close SENTENCES;
    __END__
    program's output:
    -------------------------------
    I am the first
            first
    I always wanted to be the first
            first
    I never liked to be second
            second
    I second your request
            second
    Better second than third
            second  third
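One caveat with both scripts: a word containing regex metacharacters (a dot, a plus sign, and so on) would be interpreted as a pattern. Below is a minimal sketch of the same qr idea with the word quoted through \Q...\E. The helper name find_sentences and the in-memory sample data are my own assumptions for illustration, not part of the scripts above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns a hash ref mapping each word to an
# array ref of the sentences that contain it as a whole word.
sub find_sentences {
    my ($words, $sentences) = @_;
    my %found;
    for my $word (@$words) {
        # \Q...\E quotes metacharacters; qr// compiles the regex once per word
        my $re   = qr/\b\Q$word\E\b/;
        my @hits = grep { $_ =~ $re } @$sentences;
        $found{$word} = \@hits if @hits;
    }
    return \%found;
}

# Sample data taken from the "words" and "sentences" files shown earlier
my @words     = qw(first second third fourth);
my @sentences = (
    "I am the first",
    "I always wanted to be the first",
    "I never liked to be second",
    "I second your request",
    "I will never appear in the output",
    "Better second than third",
);

my $found = find_sentences(\@words, \@sentences);
for my $word (@words) {
    next unless $found->{$word};          # skip words with no match
    print "$word:\n";
    print "\t$_\n" for @{ $found->{$word} };
}
```

Returning a hash instead of printing directly keeps the matching separate from the output, so the same data can be listed by word, as here, or regrouped by sentence.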
In reply to Re: GREP/Regex - Locating Words in Sentences
by gmax
in thread GREP/Regex - Locating Words in Sentences
by snowy