chinamox has asked for the wisdom of the Perl Monks concerning the following question:

Greetings Brothers (and Sisters),

I am trying to set up a little application to search a file and return only the words that have the letters "a b o r t" in alphabetical order. I think I have managed to use .*? to get rid of any characters between these letters but now I am returning words that don't even have all of the letters in them. I know that the fix is likely very short but this is only my second week working in Perl, and I haven't been able to suss it out so far.

#!/usr/local/bin/perl -w
use strict;

# seed files for <> operator
@ARGV = qw( /usr/dict/words );

while (<>) {
    # look for b e n(regardless of intervening characters)
    print if /a.*?b.*?o.*?r.*?t/;
};

Thank you for any help,

Mox

Replies are listed 'Best First'.
Re: Regular expression questions
by imp (Priest) on Oct 07, 2006 at 14:33 UTC
    Your regex looks reasonable to me, so long as all the words in the dictionary are lowercase. Running it on my local dictionary file gives appropriate results. Could you provide an example of something that was matched and shouldn't have been?

    By the way, I find it easier to debug regular expressions when using the 'x' modifier, as follows:

    print if m{ a.*? b.*? o.*? r.*? t }x;
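A minimal sketch of the lowercase caveat, assuming the word list may contain capitalized entries: adding the `i` modifier next to `x` makes the match case-insensitive. The sub name `has_abort_in_order` and the sample words are hypothetical, chosen only for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: true if a, b, o, r, t occur in that order,
# ignoring case and allowing any characters in between.
sub has_abort_in_order {
    my ($line) = @_;
    return $line =~ m{ a.*? b.*? o.*? r.*? t }xi ? 1 : 0;
}

# Illustrative sample words, not taken from a real dictionary file.
for my $word (qw( abort Abort ABORTION boart )) {
    print "$word\n" if has_abort_in_order($word);
}
```

Without the `i` modifier, capitalized entries such as "Abort" would be skipped, because the pattern only lists lowercase letters.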
      Thanks for the formatting advice. I used it and found my problem (I forgot a single "." in my original!) *head desk*
Re: Regular expression questions
by prasadbabu (Prior) on Oct 07, 2006 at 14:30 UTC

    If I understood your question correctly, this will do the job. I don't think it is an efficient way, though; it can still be done with simpler logic.

    use strict;
    use warnings;

    while (<DATA>) {
        chomp;
        my $dic = $_;
        my (@arr) = $dic =~ /([abort])/g;                # get the matching letters
        my %unique;
        my $str = join "", grep (!$unique{$_}++, @arr);  # make the grepped letters unique
        print $dic, "\n" if ($str eq 'abort');           # print if the order is a, b, o, r, t
    };

    __DATA__
    abroad
    abort
    abortion
    boat
    boaring
    boart

    prints:
    -------
    abort
    abortion

    As imp said, when I tested your code it worked fine for me as well.

    Prasad
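One possible reading of "simpler logic" is a plain left-to-right scan with `index`, sketched below. The sub name `contains_in_order` is hypothetical, and this is only an illustration of the idea, not the approach the poster necessarily had in mind. The word list is the same sample data used above.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the characters of $letters appear in $word
# in the same order, with anything allowed in between.
sub contains_in_order {
    my ($word, $letters) = @_;
    my $pos = 0;
    for my $ch (split //, $letters) {
        my $found = index($word, $ch, $pos);   # search from the current position
        return 0 if $found < 0;                # letter missing after $pos
        $pos = $found + 1;                     # continue after the letter just found
    }
    return 1;
}

for my $word (qw( abroad abort abortion boat boaring boart )) {
    print "$word\n" if contains_in_order($word, 'abort');
}
```

Note that this scan behaves like the original regex rather than like the grep-based version: a string such as "boabort" passes here, while the grep-based version rejects it because the first occurrences of its letters come out as "boart".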

      Thanks, this makes great sense. I will try it ASAP.
Re: Regular expression questions
by jkeenan1 (Deacon) on Oct 07, 2006 at 14:43 UTC
    1. What does this comment mean?

    # look for b e n(regardless of intervening characters)

    2. Keep in mind that what you're saying is: Process each file found in @ARGV. For each such file, analyze each line in turn and print out the entire line if the line as a whole matches the pattern.

    Example: Create a file called abort and add it to @ARGV. Type one line of text into that file:

    zzzzzaxbxoxrxt caxton
    Re-run the script and ask yourself if that's what you were looking for. HTH!
    Jim Keenan
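The point about whole lines can be sketched as follows, assuming a line may hold more than one whitespace-separated word. The sub name `matching_words` is hypothetical; it applies the original pattern to each word separately and returns only the words that match, instead of printing the entire line.

```perl
use strict;
use warnings;

# Hypothetical helper: return only the whitespace-separated words of a line
# that contain a, b, o, r, t in that order.
sub matching_words {
    my ($line) = @_;
    return grep { /a.*?b.*?o.*?r.*?t/ } split ' ', $line;
}

# The sample line from the post above.
my @hits = matching_words("zzzzzaxbxoxrxt caxton");
print "$_\n" for @hits;
```

Testing word by word also avoids a match that spans two words: a line such as "cab sort" matches the pattern as a whole, yet neither word contains all five letters.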