Although the previous answers have explained the technique, I think a full example will show some of the pitfalls better.

The example below works fine as long as there are not too many sentences, i.e., as long as they all fit in memory.

    #!/usr/bin/perl -w
    use strict;

    # Read every word into memory, one per line.
    open my $words_fh, '<', 'words' or die "can't open words: $!";
    my (@words, @sentences);
    while (<$words_fh>) {
        chomp;
        push @words, $_;
    }
    close $words_fh;

    # Read every sentence into memory, keeping the trailing newlines.
    open my $sentences_fh, '<', 'sentences' or die "can't open sentences: $!";
    push @sentences, $_ while <$sentences_fh>;
    close $sentences_fh;

    # For each word, list the sentences that contain it as a whole word.
    for my $word (@words) {
        my @found = grep /\b$word\b/, @sentences;
        if (@found) {
            print $word, ": \n\t", join "\t", @found;
        }
    }
    __END__

    contents of file "words"
    -------------------------------
    first
    second
    third
    fourth
    -------------------------------

    contents of file "sentences"
    -------------------------------
    I am the first
    I always wanted to be the first
    I never liked to be second
    I second your request
    I will never appear in the output
    Better second than third
    -------------------------------

    program's output
    -------------------------------
    first: 
            I am the first
            I always wanted to be the first
    second: 
            I never liked to be second
            I second your request
            Better second than third
    third: 
            Better second than third

In this example, I have "slurped" into memory all the words and all the sentences. This is due to the requirements that the matching sentences should be shown for each word, and that each sentence could belong to more than one word.
I have the feeling that in a real life situation you could not afford the "slurp" luxury. If this is the case, then you need either a database engine or an algorithm that will read the words first, then store the matching lines as file addresses into a hash, and finally for each word retrieve the matching lines using the stored addresses.
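
The second alternative (storing file addresses in a hash) could be sketched roughly as follows. This is only a minimal sketch, not a tested production solution: the sub names index_sentences and matches_for are hypothetical, the file names are taken from the command line, and the matching uses the same \b$word\b pattern as the script above. It assumes that the list of offsets fits in memory even when the sentences themselves do not.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: scan the sentences once and remember, for each word,
# the byte offsets of the lines that contain it.
sub index_sentences {
    my ($fh, @words) = @_;
    my @patterns = map { [ $_, qr/\b$_\b/ ] } @words;   # compile each word once
    my %offsets;                                        # word => [ offsets ]
    my $pos = tell $fh;                  # offset of the line about to be read
    while (my $line = <$fh>) {
        for my $p (@patterns) {
            push @{ $offsets{ $p->[0] } }, $pos if $line =~ $p->[1];
        }
        $pos = tell $fh;
    }
    return \%offsets;
}

# Hypothetical helper: re-read the matching lines for one word by seeking
# to each stored offset.
sub matches_for {
    my ($fh, $offsets, $word) = @_;
    my @lines;
    for my $pos (@{ $offsets->{$word} || [] }) {
        seek($fh, $pos, 0) or die "can't seek: $!";
        push @lines, scalar <$fh>;
    }
    return @lines;
}

# Assumed usage: perl script.pl words sentences
if (@ARGV == 2) {
    my ($words_file, $sentences_file) = @ARGV;

    open my $wfh, '<', $words_file or die "can't open $words_file: $!";
    chomp(my @words = <$wfh>);
    close $wfh;

    open my $sfh, '<', $sentences_file or die "can't open $sentences_file: $!";
    my $offsets = index_sentences($sfh, @words);
    for my $word (@words) {
        my @found = matches_for($sfh, $offsets, $word);
        next unless @found;
        print $word, ": \n\t", join "\t", @found;
    }
    close $sfh;
}
```

Only the words and the numeric offsets are kept in memory; the text of each sentence is read again from disk when it is printed. If a very common word matches most of the file, the offset lists can still grow large, and at that point a database is probably the better choice.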

Notice that if you want to show the results the other way round (for each sentence, the words it matches), you can read all the words first (they should presumably fit in memory), then match each sentence as you read it and print the results immediately.

    #!/usr/bin/perl -w
    use strict;

    # Store each word together with its precompiled regex.
    open my $words_fh, '<', 'words' or die "can't open words: $!";
    my @words;
    while (<$words_fh>) {
        chomp;
        push @words, [$_, qr/\b$_\b/];
    }
    close $words_fh;

    # Process the sentences one at a time; nothing is kept in memory.
    open my $sentences_fh, '<', 'sentences' or die "can't open sentences: $!";
    while (<$sentences_fh>) {
        my $printed = 0;
        for my $word (@words) {
            if (/$word->[1]/) {
                print $_ unless $printed++;   # print the sentence only once
                print "\t", $word->[0];
            }
        }
        print "\n" if $printed;
    }
    close $sentences_fh;
    __END__

    program's output:
    -------------------------------
    I am the first
            first
    I always wanted to be the first
            first
    I never liked to be second
            second
    I second your request
            second
    Better second than third
            second  third

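In this second script, as an additional measure, I stored each word together with a regular expression precompiled by the qr operator. The inner loop switches to a different word on every iteration, so a plain interpolated /\b$word\b/ would have to be recompiled each time. With qr, the regex for each word is compiled only once, which should make the program noticeably faster.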

Hope these examples give you the elements to solve your problem.
 _  _ _  _  
(_|| | |(_|><
 _|   

In reply to Re: GREP/Regex - Locating Words in Sentences by gmax
in thread GREP/Regex - Locating Words in Sentences by snowy
