lanier has asked for the wisdom of the Perl Monks concerning the following question:

I have 2 files that I am trying to compare. One is sorted; the other is not (and includes some info that I really don't need), i.e.:

//snippet of sorted for File1

acp.lanier.com = (DESCRIPTION = (ADDRESS_LIST = (ADDRESS = (COMMUNITY = tcp.lanier.com) (PROTOCOL = TCP) (HOST = st.lanier.com) (PORT = 1533) ) ) (CONNECT_DATA = (SID = utl) ) )

//File 2 unsorted

/want to ignore this part

############## # Filename......: # Name..........: # Date..........: ################ /

sdcht.lanier.com = (DESCRIPTION = (ADDRESS_LIST = (ADDRESS = (COMMUNITY = tcp.lanier.com) (PROTOCOL = TCP) (Host = sb.lanier.com) (Port = 1521) ) ) (CONNECT_DATA = (SID = sdcht) /want to ignore this (GLOBAL_NAME = sdcht.lanier.com)/ ) )

I collect the entries that match a general pattern into an array, sort the unsorted array, then compare the files and print what is different. Can anyone tell me where I am going wrong in my logic or coding?
    #! /u/ss/bin/perl -w
    $pattern = qr/
        ([^=]*=\s*\(DESCRIPTION\s*=\s*
        \(ADDRESS_LISTS\s*=\s*\(ADDRESS\s*=\s*
        \(COMMUNITY\s*=\s*[^)]*\)\s*\(PROTOCOL\s*=\s*TCP\)\s*
        \(HOSTS\s*=\s*[^)]*\)\s*\(PORTS\s*=\s*\d+\)\s*\)\s*\)\s*
        \(CONNECT_DATA\s*=\s*\(SID\s*=\s*[^)]*\)\s*
        [(GLOBAL_NAME\s*=\s*[^)]*)]?\s*
        \)\s*\)/x;
    open (FILE, "tnsnames.ora") or die $!;
    while (<>) {
        if(/$pattern/){
            push @unsorted, $pattern;
        }
    }
    close FILE;
    @unsorted = sort @unsorted;
    open (FILE, "tns.log") or die $!;
    while (<>) {
        if(/$pattern/) {
            push @sorted, $pattern;
        }
    }
    close FILE;
    $x = pop @unsorted ||'' ;
    $y = pop @sorted ||'' ;
    while ($x || $y) {
        if($x gt $y) {
            print "missing from file1: $x \n";
            $x = pop @sorted ||'' ;
        } elsif ($y gt $x) {
            print "missing from file2: $y \n";
            $y = pop @sections ||'' ;
        } else{
            $x = pop @sorted ||'' ;
            $y = pop @sections ||'' ;
        }

Replies are listed 'Best First'.
Re: Sort and Compare Files
by tilly (Archbishop) on May 14, 2003 at 17:48 UTC
    Why are you opening FILE and then reading from either STDIN or the files in @ARGV using the semi-magical <> construct?

    Reading from <FILE> would probably work better...

    (I didn't scan it for less glaring mistakes, though I would suggest that you do as perlstyle suggests and include the file name and operation in your error messages along with $!.)
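    A minimal sketch of that advice, not code from the thread: it reads from an explicit filehandle rather than <>, and puts the operation and file name next to $! in each error message. The helper name read_lines and the temporary demo file are assumptions made only so the example runs on its own.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper (not from the thread): return every line of $path,
# reading from an explicit lexical filehandle instead of the magic <>.
sub read_lines {
    my ($path) = @_;
    open(my $fh, '<', $path)
        or die "Cannot open '$path' for reading: $!";
    my @lines = <$fh>;
    close($fh)
        or die "Cannot close '$path' after reading: $!";
    return @lines;
}

# Demo data: a throwaway file stands in for tnsnames.ora or tns.log.
my ($tmp_fh, $tmp_path) = tempfile(UNLINK => 1);
print {$tmp_fh} "first line\nsecond line\n";
close($tmp_fh)
    or die "Cannot close '$tmp_path' after writing: $!";

my @lines = read_lines($tmp_path);
print scalar(@lines), " line(s) read from $tmp_path\n";
```

    With the original script, the same change would mean looping over <FILE> inside each while, so that each loop reads the file that was just opened.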

Re: Sort and Compare Files
by Thelonius (Priest) on May 14, 2003 at 19:08 UTC
    If you "use strict", you will find that you have $y = pop @sections||''; where sections is never defined.

    Also, before the comparison loop you have $x = pop @unsorted, but inside the loop you have $x = pop @sorted.

    Also, in the comparison loop you are replacing the larger value. When you do cosequential processing, you must replace the smaller value. Unless, of course, your input is sorted in descending order.

    When you replace the smaller value you are going to have problems with your use of '' to indicate end of list. Instead you may want to use something like "\xff" x 20.

    Updated: Oh, you are using pop instead of shift, so your lists are effectively sorted in descending order.
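    The cosequential comparison described above can be sketched as follows. This is an illustration under assumptions, not the poster's code: both lists are sorted ascending, the helper name diff_sorted and the sample entry names are made up for the demo, and array indices replace both the '' end marker and the "\xff" x 20 sentinel, so an empty or false entry cannot end the loop early.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): walk two ascending-sorted
# lists in step and collect the entries found in only one of them.
sub diff_sorted {
    my ($first, $second) = @_;          # array refs, both sorted ascending
    my (@only_first, @only_second);
    my ($i, $j) = (0, 0);

    while ($i < @$first && $j < @$second) {
        my $cmp = $first->[$i] cmp $second->[$j];
        if ($cmp < 0) {                 # smaller value exists only in the first list
            push @only_first, $first->[$i++];
        }
        elsif ($cmp > 0) {              # smaller value exists only in the second list
            push @only_second, $second->[$j++];
        }
        else {                          # same entry in both lists: advance both
            $i++;
            $j++;
        }
    }

    # Whatever is left in either list has no counterpart in the other.
    push @only_first,  @{$first}[$i .. $#$first];
    push @only_second, @{$second}[$j .. $#$second];

    return (\@only_first, \@only_second);
}

my @file1 = sort qw(acp.lanier.com acpd.lanier.com sdbet.lanier.com);
my @file2 = sort qw(acpd.lanier.com sdcht.lanier.com);

my ($only1, $only2) = diff_sorted(\@file1, \@file2);
print "only in first list: $_\n"  for @$only1;
print "only in second list: $_\n" for @$only2;
```

    In each step it is the smaller of the two current values that gets replaced, which is the rule for ascending input; popping from the end, as the original script does, is the mirror image of the same walk.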

Re: Sort and Compare Files
by lanier (Initiate) on May 19, 2003 at 15:42 UTC
    Here is my updated code.
        #! /u/ss/bin/perl -w
        $pattern=qr/
            (
            ^[^=|#]*=\s*\(DESCRIPTION\s*=\s*
            \(ADDRESS_LIST\s*=\s*\(ADDRESS\s*=\s*
            \(COMMUNITY\s*=\s*[^)]*\)\s*\(PROTOCOL\s*=\s*TCP\)\s*
            \(HOST\s*=\s*[^)]*\)\s*\(PORT\s*=\s*\d+\)\s*\)\s*\)\s*
            \(CONNECT_DATA\s*=\s*\(SID\s*=\s*[^)]*\)\s*\)\s*
            \)\s*
            )
        /x;
        open FILE,"tnsnames.ora" or die $!;
        {local $/=undef; $_ = <FILE>; }
        close FILE;
        print "This is tnsnames.ora\n";
        @sections=/$pattern/g;
        foreach $el (@sections) { print $el; }
        @sorted=sort @sections;
        open FILE,"tns.log" or die $!;
        {local $/=undef; $_ = <FILE>; }
        close FILE;
        @sections=/$pattern/g;
        print "This is tns.log\n";
        foreach $el (@sections) { print $el; }
        $x=pop @sorted || '';
        print "This is $x\n";
        $y=pop @sections || '';
        print "This is y: $y\n";
        while( $x || $y ){
            if( $x gt $y ){
                #print "missing from file1: $x\n";
                $x = pop @sorted || '';
            }elsif( $y gt $x ){
                #print "missing from file2: $y\n";
                $y = pop @sections || '';
            }else{
                $x = pop @sorted || '';
                $y = pop @sections || '';
            }
        }
    When I first try to print the @sections array for the first file, there is nothing. When I print it for the 2nd file, I get output. I am wondering if my pattern is not matching the 1st file but is somehow matching the 2nd file.

    example for first file:
        ###############
        # Filename......: tnsnames.ora
        # Name..........: LOCAL_REGION.lanier.com
        # Date..........: 24-OCT-95 14:30:07
        ################
        ##### TF #####Oracle Application File Server
        FNDFS_tf.lanier.com = (DESCRIPTION = (ADDRESS = (COMMUNITY = tcp.lanier.com) (PROTOCOL = tcp ) (HOST = tf.lanier.com) (PORT = 1521) ) (CONNECT_DATA = ( SID = FNDFS ) ) )
        sdbet.lanier.com = (DESCRIPTION = (ADDRESS_LIST = (ADDRESS = (COMMUNITY = tcp.lanier.com) (PROTOCOL = TCP) (Host = sb.lanier.com) (Port = 1521) ) ) (CONNECT_DATA = (SID = sdbet) (GLOBAL_NAME = sdbet.lanier.com) ) )

    I am trying to ignore the lines that start with # in my pattern

    example of 2nd file:
        acp.lanier.com = (DESCRIPTION = (ADDRESS_LIST = (ADDRESS = (COMMUNITY = tcp.lanier.com) (PROTOCOL = TCP) (HOST = st.lanier.com) (PORT = 1533) ) ) (CONNECT_DATA = (SID = utl) ) )
        acpd.lanier.com = (DESCRIPTION = (ADDRESS_LIST = (ADDRESS = (COMMUNITY = tcp.lanier.com) (PROTOCOL = TCP) (HOST = sp.lanier.com) (PORT = 1531) ) ) (CONNECT_DATA = (SID = utld) ) )
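    On ignoring the lines that start with #: one possible approach, sketched here under assumptions and not taken from the thread, is to delete the comment lines from the slurped text before applying any entry pattern, instead of excluding # inside the pattern itself. The sketch assumes each entry begins at the start of a line, and it uses a deliberately loose, hypothetical pattern that captures only the entry names, not the full pattern from the script.

```perl
use strict;
use warnings;

# Sample text taken from the first-file example; the line layout is assumed.
my $text = <<'END';
###############
# Filename......: tnsnames.ora
# Name..........: LOCAL_REGION.lanier.com
# Date..........: 24-OCT-95 14:30:07
################
FNDFS_tf.lanier.com = (DESCRIPTION = (ADDRESS = (COMMUNITY = tcp.lanier.com) (PROTOCOL = tcp ) (HOST = tf.lanier.com) (PORT = 1521) ) (CONNECT_DATA = ( SID = FNDFS ) ) )
sdbet.lanier.com = (DESCRIPTION = (ADDRESS_LIST = (ADDRESS = (COMMUNITY = tcp.lanier.com) (PROTOCOL = TCP) (Host = sb.lanier.com) (Port = 1521) ) ) (CONNECT_DATA = (SID = sdbet) (GLOBAL_NAME = sdbet.lanier.com) ) )
END

# Remove every line whose first non-blank character is '#'.
# The /m flag lets ^ match at the start of each line, not only of the string.
(my $no_comments = $text) =~ s/^\s*#.*\n?//mg;

# Loose illustrative pattern: a name at the start of a line followed by '='.
my @names = $no_comments =~ /^([\w.]+)\s*=/mg;

print "entry: $_\n" for @names;
```

    Note that this sample also contains "tcp", "Host", and "Port" in mixed or lower case, an entry without ADDRESS_LIST, and a GLOBAL_NAME clause, none of which the case-sensitive pattern in the script allows for.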