in reply to searching for keywords

The syntax error is because you don't have a block for your 'if' statement, only for the 'while'. You could get by without the block if you turned it into a modifier (see perlsyn), as follows:

print $_ if( $_ =~ m/$keyword/ );

There's more to it than that, though: you're using the assignment operator ('=', see perlop) where a pattern match needs the binding operator ('=~'), hence the modification I made to your code in the above example. If @keywords is large, use Super Search to find examples of using an array in a pattern match. If it is reasonably small, you can use alternation (perlre):

my $pattern = join( '|', @keywords );
if( $_ =~ m/$pattern/ ) {
    print $_;
}

Oh, and in addition to using strict and warnings, you may also find diagnostics helpful.

HTH

ikegami is absolutely right about escaping special characters, of course (see quotemeta). Thanks and ++ for pointing that out!
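
To see why the escaping matters, here is a small sketch. The keywords 'a.b' and 'x|y' and the helper name build_pattern are made up for illustration (they are not from the original question); the point is only that unescaped metacharacters change what the pattern matches.

```perl
use strict;
use warnings;

# Hypothetical keywords, chosen because they contain regex metacharacters.
my @keywords = ( 'a.b', 'x|y' );

# Join the words into an alternation, optionally escaping each one first.
sub build_pattern {
    my ( $escape, @words ) = @_;
    @words = map { quotemeta($_) } @words if $escape;
    return join '|', @words;
}

my $raw     = build_pattern( 0, @keywords );    # a.b|x|y
my $escaped = build_pattern( 1, @keywords );    # a\.b|x\|y

for my $line ( 'version a.b here', 'version aXb here', 'just y' ) {
    printf "%-18s raw:%-5s escaped:%s\n", $line,
        ( $line =~ /$raw/     ? 'match' : 'no' ),
        ( $line =~ /$escaped/ ? 'match' : 'no' );
}
```

Without quotemeta, the '.' matches any character and the '|' splits 'x|y' into two alternatives, so the second and third lines match by accident; with it, only the literal keywords match.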

Update: full code example, with escaped characters

use strict;
use warnings;

my @keywords = qw( keyword1 keyword2 keyword3 );
my $pattern  = join( '|', map quotemeta, @keywords );
print "pattern = [$pattern]\n";

# The two loops below are alternatives: run one or the other,
# because the first loop reads DATA to the end.

# using 'if' as a modifier
while( my $line = <DATA> ) {
    print $line if ( $line =~ m/$pattern/ );
}

# using an 'if' block
while( my $line = <DATA> ) {
    if( $line =~ m/$pattern/ ) {
        print $line;
    }
}

__DATA__
somelines...
somelines...
somewords...keyword1..somewords
somelines...
somewords...
keyword2...somewords...

Both examples print:

pattern = [keyword1|keyword2|keyword3]
somewords...keyword1..somewords
keyword2...somewords...

Re^2: searching for keywords
by ikegami (Patriarch) on Jan 18, 2006 at 03:09 UTC

    m/$keyword/
    is wrong. You need to escape special characters. A simple way of doing this is
    m/\Q$keyword/

    my $pattern = join('|', @keywords );
    has the same problem. Use
    my $pattern = join('|', map quotemeta, @keywords);
    instead.

    If the list of words is long, you can speed things up a lot by using Regexp::List:
    my $pattern = Regexp::List->new->list2re(@keywords);

    All together, we get:

    use Regexp::List ();

    my @keywords = ("keyword1", "keyword2", "keyword3");

    my $pattern = Regexp::List->new->list2re(@keywords);
    #my $pattern = join('|', map quotemeta, @keywords);  # Alternative

    while (<SEARCHFILE>) {
        if ($_ =~ $pattern) {   # or just: if (/$pattern/) {
            print $_;           # or just: print;
        }
    }

      Great suggestion on the Regexp::List module. I hadn't investigated it before. I'm impressed with how it optimizes the list to minimize costly alternation. Efficiency seems to have been one of the primary design philosophies.

      Does anyone know if there is a PPM3 build of it anywhere? I didn't find it on the ActiveState repositories. I would love to play with it.

      I toyed with another solution that turns the problem upside down: put the keywords in a hash, pull individual words out of the file one by one, and check for the existence of each word in the keyword hash. For large keyword lists it could prove more efficient than simple alternation, since hash lookups take O(1) time on average:

      use strict;
      use warnings;

      my %keywords;
      @keywords{ 'keyword1', 'keyword2', 'keyword3' } = ();

      while( <DATA> ){
          chomp;
          while( m/\b([\w'-]+)\b/g ) {
              print "'$_' contains keyword: $1\n"
                  if exists $keywords{ $1 };
          }
      }

      __DATA__
      a line with keyword2 in it
      a line with keyword1 and keyword3.
      a line with no keywords.
      keyword1 can start a line too.
      and a line can end in keyword2

      Enjoy.


      Dave

        It's Pure Perl. Just unzip its lib/ into your site/lib/
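        Nice code. I just want to add that if
        while( m/\b([\w'-]+)\b/g ) {
        is replaced by
        while( m/\b([\w'-]+)\b/gi ) {
        the intent is to make your program case-insensitive. Note, though, that \w already matches both upper- and lowercase letters, so /i alone changes nothing here; the hash lookup also has to fold case, e.g. exists $keywords{ lc $1 } with lowercase keys.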
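
On the case-insensitivity point, here is a minimal sketch of one way to get there with the hash approach. It assumes the keywords are stored lowercased and each extracted word is folded with lc before the lookup; the helper name keywords_in and the sample lines are made up for illustration.

```perl
use strict;
use warnings;

# Store the keywords lowercased so lookups can be folded the same way.
my %keywords = map { ( lc($_) => 1 ) } qw( keyword1 keyword2 keyword3 );

# Return the keywords found in a line, ignoring case (hypothetical helper).
sub keywords_in {
    my ($line) = @_;
    my @found;
    while ( $line =~ m/\b([\w'-]+)\b/g ) {
        push @found, $1 if exists $keywords{ lc $1 };
    }
    return @found;
}

for my $line ( 'a line with KEYWORD2 in it', 'Keyword1 and keyword3.', 'no keywords here' ) {
    my @found = keywords_in($line);
    print "'$line' contains: @found\n" if @found;
}
```

The regex itself needs no /i flag, because the character class already accepts letters of either case; the case folding happens entirely in the hash key.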