blackbeard has asked for the wisdom of the Perl Monks concerning the following question:

I know that in s/// or m/// the /x will ignore white space

But here I am dealing with FILE1, a text file containing a cfg pushed to a network element, and FILE2, a text file copy of the cfg on the same device, and it is causing me difficulty. Basically, check that all lines in FILE1 are resident in FILE2.

The white space that the router throws in causes the code to report that a line isn't resident when it actually is, just with an extra space somewhere in the line.

So the question is, how do I turn off white space checking?

I tried:

foreach $item (@xray) {
    $item =~ s/^ *//g;
    $item =~ s/ *$//g;
    $item =~ s/\s+/ /g;
    push(@clean, $item);
}
but that didn't work because the lines being compared were still different ... humm just had a thought ... how about running through the array and just removing all white space and dump into @clean, then do the same to $line just prior to the grep ... then just a single string of characters ... right ?

open (FILE1, "$file1_val") or die;
open (FILE2, "$file2_val") or die;
@xray = <FILE2>;
for $line (<FILE1>) {
    if (!(grep $line eq $_, @xray)) {
        print "$line\n";
    }
}
close FILE1;
close FILE2;

Re: Match but ignore white space
by pc88mxer (Vicar) on May 09, 2008 at 15:08 UTC
    I know that in s/// or m/// the /x will ignore white space ...
    The /x modifier will allow you to add white space in your regular expression to make it more readable - it doesn't ignore white space in the text that you are trying to match. Example:
    my $x = "foobar";
    print "matches 1\n" if ($x =~ m/foo bar/x);  # succeeds
    print "matches 2\n" if ($x =~ m/foo bar/);   # fails
    humm just had a thought ... how about running through the array and just removing all white space and dump into @clean, then do the same to $line just prior to the grep
    This is a good idea. You are canonicalizing the input, mapping each line to a single representative so that you can just use string equality to test if the lines are "the same."
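
A minimal sketch of that idea, assuming "remove all white space" is an acceptable canonical form. The helper names `squash` and `missing_lines` and the sample config lines are made up for illustration; they are not from the original post.

```perl
use strict;
use warnings;

# Hypothetical helper: strip every whitespace character so that only
# the non-blank characters take part in the comparison.
sub squash {
    my ($line) = @_;
    (my $key = $line) =~ s/\s+//g;
    return $key;
}

# Return the lines of @$want whose squashed form does not occur in @$have.
sub missing_lines {
    my ($want, $have) = @_;
    my @clean = map { squash($_) } @$have;
    my @missing;
    for my $line (@$want) {
        my $key = squash($line);
        push @missing, $line unless grep { $_ eq $key } @clean;
    }
    return @missing;
}

# Invented sample data standing in for FILE1 (pushed) and FILE2 (running).
my @pushed  = ("interface  Gi0/1\n", " no shutdown\n", "ip route 10.0.0.0 255.0.0.0 Null0\n");
my @running = ("interface Gi0/1 \n", "no  shutdown\n");

# Only the "ip route" line is reported as missing.
print missing_lines(\@pushed, \@running);
```

One caveat: with all white space removed, "no shut down" and "noshutdown" compare equal. If that is too loose, collapsing each run of white space to a single space is the stricter alternative.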

    Another common perl-ish approach is to use a hash. This will avoid iterating through the @clean array for each line in the second file:

    my %hash;
    while (<file1>) {
        my $clean = ... convert $_ into its canonical form ...
        $hash{$clean} = 1;
    }
    while (<file2>) {
        my $clean = ... convert $_ into its canonical form ...
        if ($hash{$clean}) {
            # line from file2 exists in file1
        } else {
            # line from file2 is not in file1
        }
    }
    The run time of this approach is essentially linear in the total number of lines in the two files.
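
A runnable version of the hash approach might look like the sketch below. It is written in the direction the original question asks for (report FILE1 lines that are absent from FILE2), so FILE2 is the file loaded into the hash. The `canonical` and `lines_not_in` names, the choice of canonical form, and the sample data are assumptions made for illustration.

```perl
use strict;
use warnings;

# Assumed canonical form: collapse runs of whitespace, then trim both ends.
sub canonical {
    my ($line) = @_;
    $line =~ s/\s+/ /g;
    $line =~ s/^ //;
    $line =~ s/ $//;
    return $line;
}

# Return the canonical form of every line from $fh1 that has no
# equivalent line in $fh2.
sub lines_not_in {
    my ($fh1, $fh2) = @_;
    my %seen;
    while (my $line = <$fh2>) {
        $seen{ canonical($line) } = 1;
    }
    my @missing;
    while (my $line = <$fh1>) {
        my $key = canonical($line);
        push @missing, $key unless $seen{$key};
    }
    return @missing;
}

# In-memory filehandles stand in for FILE1 and FILE2 (invented data).
my $pushed  = "hostname  r1\nline vty 0 4\n";
my $running = " hostname r1\n";
open my $fh1, '<', \$pushed  or die "Cannot open in-memory file: $!";
open my $fh2, '<', \$running or die "Cannot open in-memory file: $!";

# Prints "line vty 0 4", the only line missing from the second file.
print "$_\n" for lines_not_in($fh1, $fh2);
```

Each file is read once and every lookup is a single hash access, which is where the linear run time comes from.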
Re: Match but ignore white space
by grizzley (Chaplain) on May 09, 2008 at 15:56 UTC

    Maybe the lines are different? :)

    When removing whitespace in the for loop, try changing the order of the commands to the following:

    foreach $item (@xray) {
        $item =~ s/\s+/ /g;
        $item =~ s/^ *//g;
        $item =~ s/ *$//g;
    }
    PS. This would be much easier to read if you used <code> tags around Perl code (Writeup Formatting Tips).
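
The difference the ordering makes can be seen on a line that starts with a tab and still carries its newline. The sketch below wraps the two orderings in subs whose names are made up for illustration; the sample line is invented.

```perl
use strict;
use warnings;

# Order used in the question: trim spaces first, then collapse whitespace.
sub trim_then_collapse {
    my ($s) = @_;
    $s =~ s/^ *//g;
    $s =~ s/ *$//g;
    $s =~ s/\s+/ /g;
    return $s;
}

# Suggested order: collapse whitespace first, then trim spaces.
sub collapse_then_trim {
    my ($s) = @_;
    $s =~ s/\s+/ /g;
    $s =~ s/^ *//g;
    $s =~ s/ *$//g;
    return $s;
}

my $raw = "\tip  address 10.1.1.1 \n";
printf "[%s]\n", trim_then_collapse($raw);   # [ ip address 10.1.1.1 ]
printf "[%s]\n", collapse_then_trim($raw);   # [ip address 10.1.1.1]
```

In the first ordering the leading tab and the trailing newline are not touched by the two trimming steps, and the final `s/\s+/ /g` then turns each of them into a space that is never removed. Collapsing first converts them to plain spaces before the trimming runs.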
Re: Match but ignore white space
by jhourcle (Prior) on May 09, 2008 at 16:31 UTC
    Basically, check that all lines in FILE1 are resident in FILE2

    I don't know if I'd even use perl for the bulk of this -- `diff -b` will ignore changes in the amount of whitespace. If the lines might appear in a different order in the two files, sort them first.

    You can then use perl to process the output, so you can accept the case of lines in FILE2 that aren't in FILE1.

      Unfortunately, you'd first have to sort the files while ignoring white space in exactly the way you want it ignored, and standard Unix sort only has an option to ignore leading blanks.

      Another Unix utility suitable for this type of task is comm, although it also requires input that already has been sorted.
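
One way around the sorting limitation is to canonicalize the lines in Perl before handing them to the Unix tools. The sketch below assumes a collapse-and-trim canonical form; the `canonical` and `canonical_sorted` names and the sample lines are made up for illustration.

```perl
use strict;
use warnings;

# Assumed canonical form: collapse runs of whitespace, then trim both ends.
sub canonical {
    my ($line) = @_;
    $line =~ s/\s+/ /g;
    $line =~ s/^ | $//g;
    return $line;
}

# Canonicalize the given lines, drop duplicates, and sort them in
# plain string order.
sub canonical_sorted {
    my %seen;
    my @unique = grep { !$seen{$_}++ } map { canonical($_) } @_;
    return sort @unique;
}

# Invented sample lines; the two "hostname" variants collapse into one.
my @lines = ("  router ospf 1\n", "hostname\tr1\n", "hostname r1 \n");
print "$_\n" for canonical_sorted(@lines);
```

If the output for each file is written to its own file, `comm -23` on the two results lists the lines found only in the first. Perl's default `sort` compares strings bytewise, so running comm with `LC_ALL=C` is probably needed for it to agree about the ordering.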