Saved has asked for the wisdom of the Perl Monks concerning the following question:

We have an application called 'Tripwire' that monitors security compliance and change. It was set up by another department. It is based on RegExes, and my job is to ensure that the RegExes in it are working correctly. I've made progress, but have run into an issue I could use help with.

I have a script that is fed a RegEx from the command line, and it works.

My test passwd file (to test that only root has UID=0)

$ cat passwd
root:x:0:0:ifeuu1 root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin
noncom:x:0:0:cause RegEx Fail:/noncom:/bin/bash
daemon:x:2:2:daemon:/sbin:/sbin/nologin
adm:x:3:4:adm:/var/adm:/sbin/nologin
lp:x:4:7:lp:/var/spool/lpd:/sbin/nologin
sync:x:5:0:sync:/sbin:/bin/sync
shutdown:x:6:0:shutdown:/sbin:/sbin/shutdown

The Script

#!/usr/bin/perl
# perl-grep4.pl

my $pattern = shift @ARGV;
my $regex = eval { qr/$pattern/ };
die "Check your pattern! $@" if $@;

while( <> ) {
    if( m/$regex/ ) {
        print "$_";
        print "\t\t\$&: ",
            substr( $_, $-[0], $+[0] - $-[0] ),
            "\n";
        foreach my $i ( 1 .. $#- ) {
            print "\t\t\$$i: ",
                substr( $_, $-[$i], $+[$i] - $-[$i] ),
                "\n";
        }
    }
}

Usage & Output

$ Test5.pl '(?<!\broot:)x:0:' passwd
noncom:x:0:0:cause RegEx Fail:/noncom:/bin/bash
		$&: x:0:

However, my attempt at a script that is fed the RegEx from a file is failing.

My RegEx-File

PW:root:passwd:(?<!\broot:)x:0:

The script

#!/usr/bin/perl
#use strict;
#use warnings;
use Text::ParseWords;

my ($RECORD, @FIELDS, $RE);
my $DFile="/home/infosec/data/RegEx";

open (DATA, "<$DFile") || die ("Cannot open DATA file \n");
my @LINES=<DATA>;
my $tLINES=@LINES;

foreach $RECORD (@LINES[0..$tLINES-1]) {
    @FIELDS=split(/:/, "$RECORD");
    my $TNAME=$FIELDS[0];
    my $TVALU=$FIELDS[1];
    my $TFILE=$FIELDS[2];
    my $REGEX=$FIELDS[3];
    chomp($REGEX);
    my $regex = eval { qr/$REGEX/ };
    print "$regex \n";

    open (TFILE, "<$TFILE") || die ("Cannot open Test file \n");
    while (<TFILE>) {
        if( m/$regex/ ) {
            print "$_";
            print "\t\t\$&: ",
                substr( $_, $-[0], $+[0] - $-[0] ),
                "\n";
            foreach my $i ( 1 .. $#- ) {
                print "\t\t\$$i: ",
                    substr( $_, $-[$i], $+[$i] - $-[$i] ),
                    "\n";
            }
        }
    }
    print "\n";
}
print "\n";

The output

$ Test6.pl
root:x:0:0:ifeuu1 root:/root:/bin/bash
		$&:
bin:x:1:1:bin:/bin:/sbin/nologin
		$&:
noncom:x:0:0:cause RegEx Fail:/noncom:/bin/bash
		$&:
daemon:x:2:2:daemon:/sbin:/sbin/nologin
		$&:
adm:x:3:4:adm:/var/adm:/sbin/nologin
		$&:
lp:x:4:7:lp:/var/spool/lpd:/sbin/nologin
		$&:
sync:x:5:0:sync:/sbin:/bin/sync
		$&:
shutdown:x:6:0:shutdown:/sbin:/sbin/shutdown
		$&:

This matches all lines instead of only the third, and I had trouble getting strict and warnings to run clean. Could anyone help?

Replies are listed 'Best First'.
Re: Test RegEx
by JavaFan (Canon) on Apr 22, 2010 at 14:47 UTC
    You are splitting PW:root:passwd:(?<!\broot:)x:0: on a colon, which means your regexp becomes (?<!\broot. That is probably not what you want. In fact, it doesn't even compile, as is evident from the fact that you print your compiled regex but it doesn't show up in the output.

    You might want to give your split a third argument.
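
The suggestion above can be sketched as follows. This is only an illustration, assuming the record layout from the question (three plain fields followed by the regex); the helper names parse_record and compile_pattern are made up for the example and are not from the original script. It also checks $@ after the eval, so a pattern that fails to compile is reported instead of being silently ignored.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Split a config record into at most four fields, so any ':' inside
# the regex (the last field) is left alone.
# (parse_record is a hypothetical helper name.)
sub parse_record {
    my ($record) = @_;
    chomp $record;    # the regex is now the last field, so drop the newline
    my ($tname, $tvalu, $tfile, $pattern) = split /:/, $record, 4;
    return ($tname, $tvalu, $tfile, $pattern);
}

# Compile a pattern; warn and return undef if it is invalid.
# (compile_pattern is a hypothetical helper name.)
sub compile_pattern {
    my ($pattern) = @_;
    my $regex = eval { qr/$pattern/ };
    warn "Bad pattern '$pattern': $@" if $@;
    return $regex;
}

my $record = "PW:root:passwd:(?<!\\broot:)x:0:\n";

# Without a limit, field 3 is only the first piece of the regex.
my @naive = split /:/, $record;
print "naive:   $naive[3]\n";

# With a limit of 4, the whole regex survives.
my ($tname, $tvalu, $tfile, $pattern) = parse_record($record);
print "limited: $pattern\n";

my $regex = compile_pattern($pattern)
    or die "Cannot continue without a valid regex\n";

for my $line ("root:x:0:0:root:/root:/bin/bash\n",
              "noncom:x:0:0:cause RegEx Fail:/noncom:/bin/bash\n") {
    print "match:   $line" if $line =~ $regex;
}
```

Note that once the regex is the last field, it carries the line's trailing newline, so the chomp is needed before compiling it.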

Re: Test RegEx
by MidLifeXis (Monsignor) on Apr 22, 2010 at 15:01 UTC

    Update: Duh, the ':0:' is part of the regexp. Just add the third argument to split, as JavaFan stated above.

    @FIELDS=split(/:/, "$RECORD");

    You are splitting the text version of your regular expression on the ':' character, and since the regexp has that character in it, the split does the wrong thing.

    Perhaps something along the lines of:

    my ($TNAME, $TVALU, $TFILE, $REGEX, $REST);
    if ($RECORD =~ m/^([^:]+):([^:]+):([^:]+):(.+):([^:]+):$/) {
        $TNAME = $1;
        $TVALU = $2;
        $TFILE = $3;
        $REGEX = $4;
        $REST  = $5;
    }

    Basically, you want to peel off the fields that are not the regexp (none of which contain ':'), and whatever is left, captured by (.+), is the regular expression.

    Either that, or figure out how to escape the ':' character in the text file. If there are a variable number of fields in the configuration file, then escaping the character might be a better solution.

    Changing the number of fields in the configuration file will require alteration of the parsing code. Escaping the ':' character could provide a more resilient solution against added fields.
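
A minimal sketch of the escaping idea, assuming a convention (not used in the OP's file) where every ':' inside the regex is written as '\:'. The helper name split_escaped and the sample record, including its trailing "extra" field, are hypothetical.

```perl
use strict;
use warnings;

# Split a record on every ':' that is not preceded by a backslash,
# then turn each '\:' back into a plain ':'.
# (split_escaped is a hypothetical helper name.)
sub split_escaped {
    my ($record) = @_;
    chomp $record;
    my @fields = split /(?<!\\):/, $record;
    s/\\:/:/g for @fields;
    return @fields;
}

# Hypothetical record: colons inside the regex are escaped, and an
# extra field follows the regex.
my $record = 'PW:root:passwd:(?<!\broot\:)x\:0\::extra' . "\n";
my ($tname, $tvalu, $tfile, $pattern, @rest) = split_escaped($record);
print "pattern: $pattern\n";
print "rest:    @rest\n";
```

This simple lookbehind does not handle a field that ends in a literal backslash, or a regex containing an escaped backslash directly before a ':'; a stricter parser would be needed for those cases.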

    On a side note, the chomp($REGEX) is useless, since it comes in from the middle of the line.

    On another side note, quoting the $RECORD variable in the split statement is a useless operation.

    My wife is walking for a cure for MS. Please consider supporting her.

Re: Test RegEx
by Marshall (Canon) on Apr 22, 2010 at 22:55 UTC
    I was trying to understand your code and I just got lost!
    Can you provide just 2 input files and desired output?

    Here is where I got to before being completely lost.

    #!/usr/bin/perl
    use strict;    # turned on strict and warnings.
    use warnings;  # a "#" in front makes them comments

    my $DFile="/home/infosec/data/RegEx";
    open (DATAF, "<$DFile") || die ("Cannot open DATAF file \n");
    # don't use DATA as a file handle, that is a reserved word for Perl.
    # no need to "eval" $regex

    foreach my $line (<DATAF>)
    {
        chomp $line;
        # limit of 4 keeps any ':' inside the regex (see JavaFan's reply)
        my ($tname, $tvalu, $tfile, $regex) = split(/:/, $line, 4);
        open (TFILE, "<$tfile") || die ("Cannot open Test file $tfile\n");
        while (<TFILE>)
        {
            next unless /$regex/;
            print $_;
            # this is so obtuse that my brain hurts!
            # These super tricky Perl Special Variables like
            # $& $- $+ are EXTREMELY seldom needed. Get rid of them.
            # please explain what this is supposed to do....
            # $- is a scalar, not an array, what do you intend by $-[0]?
            #
            # print "\t\t\$&: ",
            #     substr( $_, $-[0], $+[0] - $-[0] ),
            #     "\n";
            # foreach my $i ( 1 .. $#- )
            # {
            #     print "\t\t\$$i: ",
            #         substr( $_, $-[$i], $+[$i] - $-[$i] ),
            #         "\n";
            # }
        }
        print "\n";
    }
    print "\n";

      I, too, got lost and gave up pretty quickly (although I think JavaFan has set the OP on the right path), but @- and @+ are regex-related special variables, so things like $-[0] and $+[0] are valid.

      Another oddity of the OP's original code is
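
To illustrate what those arrays hold: after a successful match, $-[0] and $+[0] are the start and end offsets of the whole match, and $-[N] and $+[N] are the offsets of capture group N. The sketch below rebuilds the match text from those offsets alone. The helper name match_pieces and the sample pattern are made up for the example, and it assumes every capture group takes part in the match.

```perl
use strict;
use warnings;

# Return the text of the whole match followed by each capture group,
# using only the offset arrays @- and @+.
# (match_pieces is a hypothetical helper name.)
sub match_pieces {
    my ($string, $regex) = @_;
    return unless $string =~ $regex;
    # Index 0 is the whole match; indexes 1 .. $#- are the groups.
    return map { substr( $string, $-[$_], $+[$_] - $-[$_] ) } 0 .. $#-;
}

my @pieces = match_pieces( "noncom:x:0:0:cause", qr/(\w+):x:(\d+):/ );
print "\$&: $pieces[0]\n";
print "\$$_: $pieces[$_]\n" for 1 .. $#pieces;
```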

      my @LINES=<DATA>;
      my $tLINES=@LINES;
      foreach $RECORD (@LINES[0..$tLINES-1]) {
          ...
      }

      This seems to boil down to (assuming you don't just use a while loop)

      my @LINES = <PROPERFILEHANDLE>;
      foreach $RECORD (@LINES) {
          ...
      }
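
For comparison, the while-loop variant might look like the sketch below, which reads one record at a time through a lexical filehandle instead of slurping the file into an array. The helper name each_record is hypothetical, and the demo writes a throwaway temporary file so the example runs on its own; in the OP's script the path would be the data file.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Call $callback once for every non-empty line of the file at $path.
# (each_record is a hypothetical helper name.)
sub each_record {
    my ($path, $callback) = @_;
    open( my $fh, '<', $path ) or die "Cannot open $path: $!\n";
    while ( my $record = <$fh> ) {
        chomp $record;
        next unless length $record;    # skip blank lines
        $callback->($record);
    }
    close $fh;
    return;
}

# Demo with a temporary file standing in for the real data file.
my ($tmp_fh, $tmp_name) = tempfile();
print {$tmp_fh} "PW:root:passwd:(?<!\\broot:)x:0:\n";
close $tmp_fh;

each_record( $tmp_name, sub { print "record: $_[0]\n" } );
unlink $tmp_name;
```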