Match but ignore white space

blackbeard has asked for the wisdom of the Perl Monks concerning the following question:

I know that in s/// or m/// the /x will ignore white space

But here I am dealing with FILE1, a text file containing a cfg pushed to a network element, and FILE2, a text file copy of the cfg on the same device, and it is causing me difficulty. Basically, check that all lines in FILE1 are resident in FILE2.

The white space that the router throws in causes the code to report that a line isn't resident when it actually is, just with an extra space somewhere in the line.

So the question is, how do I turn off white space checking?

I tried this:

foreach $item (@xray){ 
$item =~ s/^ *//g;
$item =~ s/ *$//g;
$item =~ s/\s+/ /g;
push(@clean, $item);
}

but that didn't work because the lines being compared were still different ... humm just had a thought ... how about running through the array and just removing all white space and dump into @clean, then do the same to $line just prior to the grep ... then each side is just a single string of characters ... right?

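open (FILE1, "$file1_val") or die "Cannot open $file1_val: $!";
open (FILE2, "$file2_val") or die "Cannot open $file2_val: $!";
@xray=<FILE2>;
# report every line of FILE1 that has no exact match in FILE2
for $line (<FILE1>) {
  if (!(grep $line eq $_, @xray)) {
    print "$line\n"
  }
}
close FILE1;
close FILE2;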


Replies are listed 'Best First'.
Re: Match but ignore white space by pc88mxer (Vicar) on May 09, 2008 at 15:08 UTC
"I know that in s/// or m/// the /x will ignore white space ..."

The `/x` modifier lets you add white space to your regular expression to make it more readable. It doesn't ignore white space in the text that you are trying to match. Example:

    my $x = "foobar";
    print "matches 1\n" if ($x =~ m/foo bar/x); # succeeds
    print "matches 2\n" if ($x =~ m/foo bar/);  # fails

"humm just had a thought ... how about running through the array and just removing all white space and dump into @clean, then do the same to $line just prior to the grep"

This is a good idea. You are canonicalizing the input, mapping each line to a single representative so that you can just use string equality to test if the lines are "the same."

Another common perl-ish approach is to use a hash. This avoids iterating through the `@clean` array for each line in the second file:

    my %hash;
    while (<file1>) {
      my $clean = ... convert $_ into its canonical form ...
      $hash{$clean} = 1;
    }
    while (<file2>) {
      my $clean = ... convert $_ into its canonical form ...
      if ($hash{$clean}) {
        # line from file2 exists in file1
      } else {
        # line from file2 is not in file1
      }
    }

The run time is essentially linear in the number of lines of each file.
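A minimal runnable sketch of the hash idea above, under a few assumptions: the helper names `canonical` and `missing_lines` are hypothetical, "canonical form" is taken to mean "collapse every run of white space to one space and trim both ends", and small inline arrays stand in for the two cfg files.

```perl
use strict;
use warnings;

# Hypothetical helper: collapse runs of white space to one space, trim both ends.
sub canonical {
    my ($line) = @_;
    $line =~ s/\s+/ /g;    # the trailing newline becomes part of a single space
    $line =~ s/^ //;
    $line =~ s/ $//;
    return $line;
}

# Return the lines of @$wanted whose canonical form does not occur in @$have.
sub missing_lines {
    my ($wanted, $have) = @_;
    my %seen;
    $seen{ canonical($_) } = 1 for @$have;
    return grep { !$seen{ canonical($_) } } @$wanted;
}

# Sample data standing in for FILE1 (pushed cfg) and FILE2 (cfg on the device).
my @pushed = ("interface  Gi0/1\n", " no shutdown\n", "ip route 10.0.0.0\n");
my @device = ("interface Gi0/1 \n", "no   shutdown\n");

print "missing: $_" for missing_lines(\@pushed, \@device);
```

With real files, the two arrays would be filled from the file handles instead. If the router can also insert or drop white space inside a token, stripping all white space (`s/\s+//g`) may be the better canonical form, at the cost of treating "ab c" and "a bc" as equal.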
Re: Match but ignore white space by grizzley (Chaplain) on May 09, 2008 at 15:56 UTC
Maybe the lines are different? :) When removing whitespace in the for loop, try changing the order of the commands to the following:

    foreach $item (@xray) {
      $item =~ s/\s+/ /g;
      $item =~ s/^ //g;
      $item =~ s/ $//g;
    }

In the original order, `s/\s+/ /g` runs last and turns the trailing newline into a space, so every cleaned line still ends in a blank.

PS. This would be much easier to read if you used <code> tags around Perl code (see Writeup Formatting Tips).
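To see why the order matters, here is a small sketch that applies both orderings to the same sample line. The function names and the sample line are made up for illustration; the substitutions themselves are the ones from the question and from this reply.

```perl
use strict;
use warnings;

# Order used in the question: trim first, collapse last.
sub trim_then_collapse {
    my ($s) = @_;
    $s =~ s/^ *//g;
    $s =~ s/ *$//g;
    $s =~ s/\s+/ /g;    # the newline is white space too, so it becomes a space
    return $s;
}

# Order suggested in this reply: collapse first, then trim.
sub collapse_then_trim {
    my ($s) = @_;
    $s =~ s/\s+/ /g;
    $s =~ s/^ //g;
    $s =~ s/ $//g;
    return $s;
}

my $line = "  ip  route 10.0.0.0 \n";
printf "[%s]\n", trim_then_collapse($line);    # prints "[ip route 10.0.0.0 ]"
printf "[%s]\n", collapse_then_trim($line);    # prints "[ip route 10.0.0.0]"
```

Either way, the line read from FILE1 needs the same treatment before the comparison, otherwise a cleaned line is being compared against a raw one.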
Re: Match but ignore white space by jhourcle (Prior) on May 09, 2008 at 16:31 UTC
"Basically, check that all lines in FILE1 are resident in FILE2"

I don't know if I'd even use perl for the bulk of this -- `diff -b` will ignore changes in the amount of whitespace. If the lines might appear in a different order in the two files, sort them first. You can then use perl to process the output, so you can accept the case of lines in FILE2 that aren't in FILE1.
Re^2: Match but ignore white space by pc88mxer (Vicar) on May 09, 2008 at 17:10 UTC
Unfortunately, you'd first have to sort the files while ignoring white space in the same special way you want the comparison to ignore it, and standard Unix `sort` only has the option to ignore leading blanks. Another Unix utility suitable for this type of task is `comm`, although it also requires input that has already been sorted.
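One way around the sorting problem is to let perl canonicalize and sort the lines before handing them to `comm` or `diff`. The sketch below is only an illustration: `canonical_sorted` is a hypothetical helper, the sample lines are made up, and dropping blank lines and duplicates is an assumption about what is wanted.

```perl
use strict;
use warnings;

# Hypothetical helper: canonicalize each line, then sort and drop duplicates.
sub canonical_sorted {
    my @lines = @_;          # work on copies, leave the caller's data alone
    my %uniq;
    for my $line (@lines) {
        $line =~ s/\s+/ /g;  # collapse white space, including the newline
        $line =~ s/^ | $//g; # trim one leading and one trailing space
        $uniq{$line} = 1 if length $line;
    }
    return sort keys %uniq;
}

# Sample data standing in for one cfg file.
my @cfg = ("no   shutdown\n", "\n", "interface  Gi0/1 \n", "no shutdown\n");
print "$_\n" for canonical_sorted(@cfg);
```

Perl's default `sort` compares strings by code point, so the result is probably best fed to `comm` with `LC_ALL=C` set, so that both tools agree on the ordering.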