Although the previous answers have explained the technique, I think that there are some pitfalls that a full example will show better.

The example below works fine, provided that there are not too many sentences (i.e., that you have enough memory to hold them all at once).

#!/usr/bin/perl -w
use strict;

open WORDS, "< words" or die "can't open words: $!";
my (@words, @sentences);
while (<WORDS>) {
    chomp;
    push @words, $_;
}
close WORDS;
open SENTENCES, "< sentences" or die "can't open sentences: $!";
push @sentences, $_ while <SENTENCES>;
close SENTENCES;

for my $word (@words) {
    my @found = grep /\b$word\b/, @sentences;
    if (@found) {
        print $word, ": \n\t", join "\t", @found;
    }
}
__END__
contents of file "words"
-------------------------------
first
second
third
fourth
-------------------------------

contents of file "sentences"
-------------------------------
I am the first
I always wanted to be the first
I never liked to be second
I second your request
I will never appear in the output
Better second than third
-------------------------------

program's output
-------------------------------
first:
        I am the first
        I always wanted to be the first
second:
        I never liked to be second
        I second your request
        Better second than third
third:
        Better second than third

In this example, I have "slurped" into memory all the words and all the sentences. This is due to the requirements that the matching sentences should be shown for each word, and that each sentence could belong to more than one word.
I have the feeling that in a real-life situation you could not afford the luxury of slurping. If that is the case, then you need either a database engine or an algorithm that reads the words first, then stores the file addresses (byte offsets) of the matching lines in a hash, and finally, for each word, retrieves the matching lines using the stored addresses.
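
A minimal sketch of that file-address approach, assuming the same "words" and "sentences" files as above. The helper names index_sentences and fetch_lines are mine, made up for illustration, and the word list is still assumed to fit in memory; only the sentences stay on disk.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: scan $file once and return a hash reference that maps
# each word to the byte offsets (from tell) of the lines containing it.
sub index_sentences {
    my ($file, @words) = @_;
    my @patterns = map { [$_, qr/\b$_\b/] } @words;
    my %offsets;
    open my $fh, '<', $file or die "can't open $file: $!";
    my $pos = tell $fh;                 # offset of the line about to be read
    while (my $line = <$fh>) {
        for my $p (@patterns) {
            push @{ $offsets{ $p->[0] } }, $pos if $line =~ $p->[1];
        }
        $pos = tell $fh;                # offset of the next line
    }
    close $fh;
    return \%offsets;
}

# Hypothetical helper: re-read only the lines stored at the given offsets.
sub fetch_lines {
    my ($file, @positions) = @_;
    open my $fh, '<', $file or die "can't open $file: $!";
    my @lines;
    for my $pos (@positions) {
        seek $fh, $pos, 0 or die "can't seek in $file: $!";
        push @lines, scalar <$fh>;      # one line starting at that offset
    }
    close $fh;
    return @lines;
}

# Driver: runs only when both file names are given on the command line.
sub main {
    my ($words_file, $sentences_file) = @_;
    open my $wfh, '<', $words_file or die "can't open $words_file: $!";
    chomp(my @words = <$wfh>);
    close $wfh;
    my $index = index_sentences($sentences_file, @words);
    for my $word (@words) {
        next unless $index->{$word};
        print "$word: \n\t",
            join "\t", fetch_lines($sentences_file, @{ $index->{$word} });
    }
}

main(@ARGV) if @ARGV == 2;
```

Called with the two file names as arguments, this should print the same grouping as the first script, while keeping only the offsets in memory rather than the sentences themselves. The price is a second read of every matching line.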

Notice that if you want to show the results the other way around (for each sentence, which words it matches), then you can read all the words (which presumably fit in memory), do the matching for each sentence as you read it, and print the results immediately.

#!/usr/bin/perl -w
use strict;

open WORDS, "< words" or die "can't open words: $!";
my @words;
while (<WORDS>) {
    chomp;
    push @words, [$_, qr/\b$_\b/];
}
close WORDS;
open SENTENCES, "< sentences" or die "can't open sentences: $!";
while (<SENTENCES>) {
    my $printed = 0;
    for my $word (@words) {
        if (/$word->[1]/) {
            print $_ unless $printed++;
            print "\t", $word->[0];
        }
    }
    print "\n" if $printed;
}
close SENTENCES;
__END__

program's output:
-------------------------------
I am the first
        first
I always wanted to be the first
        first
I never liked to be second
        second
I second your request
        second
Better second than third
        second  third

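In this second script, as an additional measure, I built the patterns with the qr operator, which compiles each word into a regular expression once, at the time the word is read. Without qr, the pattern in the inner loop would be recompiled for every word of every sentence, because Perl recompiles an interpolated pattern whenever the interpolated string changes. The precompiled version should therefore run noticeably faster on large inputs.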
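
If you want to measure the effect of qr on your own data rather than take my word for it, something like the following sketch may help. It uses the core Benchmark module; the sub names and the tiny sample arrays are only placeholders of mine, and the timings will depend on your Perl version and input size.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Sample data taken from the files above; replace with your own.
my @words     = qw(first second third fourth);
my @sentences = (
    "I am the first\n",
    "I second your request\n",
    "Better second than third\n",
);

# The pattern is built from a string whose value changes on every
# iteration of the inner loop.
sub count_interpolated {
    my $hits = 0;
    for my $sentence (@sentences) {
        for my $word (@words) {
            $hits++ if $sentence =~ /\b$word\b/;
        }
    }
    return $hits;
}

# Each pattern is compiled once with qr and then reused.
my @compiled = map { qr/\b$_\b/ } @words;

sub count_precompiled {
    my $hits = 0;
    for my $sentence (@sentences) {
        for my $re (@compiled) {
            $hits++ if $sentence =~ $re;
        }
    }
    return $hits;
}

# Pass any command-line argument to run the comparison
# (each sub is timed for at least one CPU second).
cmpthese(-1, {
    interpolated => \&count_interpolated,
    precompiled  => \&count_precompiled,
}) if @ARGV;
```

Both subs count the same matches, so any difference in the reported rates comes from how the patterns are built.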

Hope these examples give you the elements to solve your problem.

 _  _ _  _  
(_|| | |(_|><
 _|

In reply to Re: GREP/Regex - Locating Words in Sentences by gmax
in thread GREP/Regex - Locating Words in Sentences by snowy
