gnu@perl has asked for the wisdom of the Perl Monks concerning the following question:

I'm trying to match words from a list of alternatives. The problem is that I cannot seem to stop the match at the root word.

ex:

$words = "(directory|file|age|action)";
while (<>) {
    if ( $_ !~ /$words/ ) {
        print "Could not find $_ in \$words\n";
    }
}

This works for the most part. If I enter a word that is not in the $words variable, the block executes. If I enter 'directorys' the block executes, but if I enter 'directoryy' the block does not execute.

I think that this is due to the fact that perl is greedy by default on matches. To fix this I tried changing the match to:

if ( $_ !~ /$words?/ ){

This did not work; as a matter of fact, it made it so that the code block never executed. Any help is appreciated.

TIA, Chad.

Replies are listed 'Best First'.
Re: regex question
by Aristotle (Chancellor) on Sep 24, 2002 at 17:22 UTC
    Are you sure you're testing correctly?
    $ perl -le'$_="directoryy"; print if /(directory|file|age|action)/'
    directoryy
    $ perl -le'$_="directorys"; print if /(directory|file|age|action)/'
    directorys
    Apparently the problem is somewhere else..

    Makeshifts last the longest.

      Thanks for pointing it out. I did mis-state my problem. I only want to match if $_ is an exact match for the alternate. I only want to match on 'directory', not 'directorys' or 'directoryy' or 'cheeze whiz'.
        Use \b to assert a word boundary then.
            my $word = qr/\b(directory|file|age|action)\b/;
        Or ^ and $ to assert string start/end (respectively).
            my $word = qr/^(directory|file|age|action)$/;
        Or any combination of the two that you like. See perldoc perlre.
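To make the difference between the two suggestions concrete, here is a minimal sketch. The test inputs and the variable names $bounded and $anchored are made up for illustration and are not from the original posts.

```perl
use strict;
use warnings;

# Hypothetical sample inputs, chosen only to show where the two patterns differ.
my @inputs = ('directory', 'directorys', 'directoryy', 'my directory here');

my $bounded  = qr/\b(directory|file|age|action)\b/;  # a whole word, anywhere in the string
my $anchored = qr/^(directory|file|age|action)$/;    # the string must be exactly one of the words

for my $input (@inputs) {
    printf "%-20s boundary=%-3s anchored=%s\n",
        "'$input'",
        ($input =~ $bounded  ? 'yes' : 'no'),
        ($input =~ $anchored ? 'yes' : 'no');
}
```

Both patterns reject 'directorys' and 'directoryy'. Only the \b form accepts 'my directory here', because it looks for the word anywhere in the line. Since $ also matches just before a trailing newline, the anchored form should work on lines read from <> without a chomp.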


Re: regex question
by chromatic (Archbishop) on Sep 24, 2002 at 18:56 UTC

    Based on the specification I gather, I wouldn't use a regex at all:

    my %words = map { $_ => 1 } qw( directory file age action );
    while (<>) {
        chomp;    # strip the trailing newline, otherwise the hash lookup never succeeds
        print "Could not find '$_' in list\n" unless exists $words{ $_ };
    }

Re: regex question
by VSarkiss (Monsignor) on Sep 24, 2002 at 17:48 UTC

    Based on the correction you've described above, what you're really looking for is a way to say "match the whole word". You can do that by specifying boundaries (note, this is untested):

    $words = qr/\b(directory|file|age|action)\b/;    # qr//, because "\b" in a double-quoted string is a backspace
    # A little more Perl-ish:
    while (<>) {
        print "Couldn't find $_\n" unless /$words/;
    }
    You can read more about \b in perlre.
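One thing to watch when a \b pattern is kept in a variable is how it is quoted. The sketch below compares three ways of storing the same pattern. The variable names are invented for the comparison and are not from the thread.

```perl
use strict;
use warnings;

my $as_double = "\b(directory|file|age|action)\b";    # "\b" in double quotes is a backspace (0x08)
my $as_single = '\b(directory|file|age|action)\b';    # single quotes keep the backslash
my $as_qr     = qr/\b(directory|file|age|action)\b/;  # qr// compiles \b as a word boundary

printf "first character of the double-quoted string: 0x%02x\n", ord $as_double;

for my $input ('directory', 'directorys') {
    printf "%-11s double=%-3s single=%-3s qr=%s\n",
        $input,
        ($input =~ /$as_double/ ? 'yes' : 'no'),
        ($input =~ /$as_single/ ? 'yes' : 'no'),
        ($input =~ $as_qr       ? 'yes' : 'no');
}
```

The double-quoted version only matches text that contains literal backspace characters, so it rejects even 'directory'. The single-quoted and qr// versions behave as word-boundary patterns.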

      Hey, I like that. Using /^$words$/ works, but this is more precise. I'll try it out. Hang on......................yup, it works.

      Thanks again to all.

Re: regex question
by Jenda (Abbot) on Sep 24, 2002 at 17:28 UTC
    You must have mistyped something, try it again please!

    If you want to match only the exact words, not longer strings containing them, use

    $_ =~ /^$words$/

    The ^ states that the word must be found at the very beginning of the line, $ means the very end.

    Jenda

      though don't forget that
      /^$words$/
      expands to
      /^word1|word2|word3$/
      What you probably want to do is
      /^(?:$words)$/
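A small sketch of the precedence issue, assuming for illustration that $words holds the bare alternatives with no parentheses of its own. The names $loose and $tight are made up here.

```perl
use strict;
use warnings;

# Assumption for this sketch: the list has no surrounding parentheses.
my $words = 'directory|file|age|action';

my $loose = qr/^$words$/;       # parsed as: ^directory  OR  file  OR  age  OR  action$
my $tight = qr/^(?:$words)$/;   # both anchors apply to every alternative

for my $input ('file', 'profile', 'directoryy') {
    printf "%-11s loose=%-3s tight=%s\n",
        $input,
        ($input =~ $loose ? 'yes' : 'no'),
        ($input =~ $tight ? 'yes' : 'no');
}
```

Without the group, 'profile' matches through the unanchored 'file' alternative and 'directoryy' matches through '^directory'. In the original post $words already carries its own parentheses, so /^$words$/ is grouped correctly there.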

      -- Dan

        This will shoot in a little different direction, but I was wondering about the '?:'. Doesn't that just keep the match from being placed into $_?
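For what it's worth, a quick sketch of what (?:...) does: it groups without capturing, so the matched text is not stored in $1. It has no effect on $_ either way. The sample value in $line is made up for the demonstration.

```perl
use strict;
use warnings;

my $line = 'file';   # hypothetical input

# Plain parentheses group AND capture: the matched word lands in $1.
if ($line =~ /^(directory|file|age|action)$/) {
    print "capturing group:     \$1 is '$1'\n";
}

# (?:...) only groups: after this successful match there is no $1.
if ($line =~ /^(?:directory|file|age|action)$/) {
    print "non-capturing group: \$1 is ", (defined $1 ? "'$1'" : 'undef'), "\n";
}

print "the input is untouched: '$line'\n";
```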
      DOH! Thanks. I tried that before and it didn't work, but I just realized what I had wrong. In my 'playing around' with things I had changed /$words/ to /^$words/ and then to /^[$words]$/, not thinking that when the $words alternate list is placed inside [] the '|' is taken literally and not as a separator for the alternates.

      I just put it as /^$words$/ and things seem to be working fine.

      Thanks for your help and pointing out my 'lil mistake.
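The point about [] can be sketched as follows, assuming the alternate list was interpolated into a character class. The name $class is made up for the example.

```perl
use strict;
use warnings;

my $words = '(directory|file|age|action)';

# Inside [] every character is just a member of the set, including '(', ')' and '|'.
my $class = qr/^[$words]$/;   # matches exactly ONE character from that set

for my $input ('|', 'd', 'directory') {
    printf "%-11s %s\n", "'$input'", ($input =~ $class ? 'match' : 'no match');
}
```

A lone '|' or 'd' matches because each is a single character from the set. 'directory' does not match, since the class only ever consumes one character.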

Re: regex question
by BrowserUk (Patriarch) on Sep 24, 2002 at 17:40 UTC

    I think you need if ( !/$words/ ) {...} rather than if ( $_ !~ /$words/ ) {...}

    Update: Yep! I completely misread that one :(


    Cor! Like yer ring! ... HALO dammit! ... 'Ave it yer way! Hal-lo, Mister la-de-da. ... Like yer ring!