documents9900 has asked for the wisdom of the Perl Monks concerning the following question:
I have to create a script for file comparison. The comparison has to be done row by row and then column by column. The script will run for 350 sets of files, i.e. a total of 700 files. Around 50 of the files contain more than 1 million records. The key in every file is its first field.
I started with the following approach: read the first file and search for each key in the second file. Once the key is found, do a column-by-column comparison and, if the values are not the same, log the output to a new file, which can later be used to populate a CSV file. The code I started with is as follows.
Now this isn't working: I am not able to read the array values (@filtered), which contain the row from the second file found by searching for the row id. This part, (@filtered[$count] != @elements[$count]), is not working. Is this OK? I also tried using -ne. The data contained in the files can be strings, numbers or dates, but I am assuming that since it is in a text file, my script can treat it all as text for comparison. Can you please help me identify the issue?

    open(INFILE1,"File1.txt");
    my @file1=<INFILE1>;
    close INFILE1;
    open(INFILE1,"File2.txt");
    my @file2=<INFILE1>;
    while (<INFILE1>) {
        my @elements = split /\t/, $_;
        my $rowid = @elements[0];
        my @filtered = grep /$rowid/, @file1;
        if ($#filtered ==0) {
            # I will write this in new file..this one's easy
        }
        else {
            my $numelements=@elements;
            my $count=1;
            while ($count <= $numelements) {
                if (@filtered[$count] != @elements[$count]) {
                    my $str="File 1 Value-".@filtered[$count]." File2 Value-".@elements[$count];
                    print $str;
                }
                $count=$count+1;
            }
        }
    }
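A few things in the posted code look likely to cause trouble. `my @file2=<INFILE1>;` reads the handle to end of file, so the following `while (<INFILE1>)` has nothing left to read. `@filtered` holds whole matching lines, not fields, so `$filtered[$count]` is another line rather than a column of the matched row. And `!=` compares numerically, while `ne` compares as strings. Below is a minimal sketch of the column comparison alone, assuming tab-separated rows with the key in the first field; `compare_rows` is a hypothetical helper name, not something from the post.

```perl
use strict;
use warnings;

# Hypothetical helper: compare two tab-separated lines field by field, as strings.
# Returns a list of [column_index, value1, value2], one entry per mismatch.
sub compare_rows {
    my ($line1, $line2) = @_;
    chomp($line1, $line2);

    # A limit of -1 keeps trailing empty fields instead of dropping them.
    my @f1 = split /\t/, $line1, -1;
    my @f2 = split /\t/, $line2, -1;

    # Walk up to the longer of the two rows; index 0 is the key, so start at 1.
    my $last = $#f1 > $#f2 ? $#f1 : $#f2;
    my @diffs;
    for my $i (1 .. $last) {
        my $v1 = defined $f1[$i] ? $f1[$i] : '';   # a missing field counts as empty
        my $v2 = defined $f2[$i] ? $f2[$i] : '';
        push @diffs, [$i, $v1, $v2] if $v1 ne $v2; # 'ne' is the string comparison
    }
    return @diffs;
}

# Example rows (made-up data for illustration only).
my @diffs = compare_rows("42\tapple\t2013-03-31\n", "42\tapple\t2013-04-01\n");
for my $d (@diffs) {
    print "Column $d->[0]: File 1 Value-$d->[1] File 2 Value-$d->[2]\n";
}
```

Note that a plain string comparison treats values such as `1.0` and `1` as different. If numbers or dates can be formatted differently in the two files, they would need normalising before the comparison.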
Also, I read that a hash will be faster for comparing this huge set of data. Can someone help with a hash approach? Can this be run in a loop over the 700-odd files using a hash or array?
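One possible shape for the hash approach is sketched here. It makes several assumptions: the files are tab-separated, keys are unique within each file, and one file of a pair fits in memory. The first file is indexed by key in a hash, the second is streamed line by line, and each key lookup is a single hash access instead of a `grep` over the whole array. The names `load_by_key` and `compare_files` and the tab-separated report format are all invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: index the lines of a tab-separated file by first field.
# Storing the raw line (not the split fields) keeps memory use lower.
sub load_by_key {
    my ($fh) = @_;
    my %rows;
    while (my $line = <$fh>) {
        chomp $line;
        next unless length $line;              # skip blank lines
        my ($key) = split /\t/, $line, 2;
        $rows{$key} = $line;
    }
    return \%rows;
}

# Hypothetical helper: compare one pair of files and write a report to $out.
# Returns the number of differences found.
sub compare_files {
    my ($fh1, $fh2, $out) = @_;
    my $rows1 = load_by_key($fh1);
    my $diffs = 0;

    while (my $line2 = <$fh2>) {
        chomp $line2;
        next unless length $line2;
        my @f2  = split /\t/, $line2, -1;
        my $key = $f2[0];

        # delete returns the stored line and marks the key as handled.
        my $line1 = delete $rows1->{$key};
        if (!defined $line1) {
            print {$out} join("\t", $key, 'only_in_file2'), "\n";
            $diffs++;
            next;
        }

        my @f1   = split /\t/, $line1, -1;
        my $last = $#f1 > $#f2 ? $#f1 : $#f2;
        for my $i (1 .. $last) {
            my $v1 = defined $f1[$i] ? $f1[$i] : '';
            my $v2 = defined $f2[$i] ? $f2[$i] : '';
            next if $v1 eq $v2;                # string comparison
            print {$out} join("\t", $key, $i, $v1, $v2), "\n";
            $diffs++;
        }
    }

    # Whatever is left in the hash never appeared in the second file.
    for my $key (sort keys %$rows1) {
        print {$out} join("\t", $key, 'only_in_file1'), "\n";
        $diffs++;
    }
    return $diffs;
}

# Demo with in-memory filehandles and made-up data, so it runs without files.
my $data1 = "1\ta\tb\n2\tc\td\n3\te\tf\n";
my $data2 = "1\ta\tb\n2\tc\tX\n4\tg\th\n";
open my $fh1, '<', \$data1 or die "Cannot open in-memory file 1: $!";
open my $fh2, '<', \$data2 or die "Cannot open in-memory file 2: $!";
my $report = '';
open my $out, '>', \$report or die "Cannot open in-memory report: $!";
my $count = compare_files($fh1, $fh2, $out);
close $out;
print "$count difference(s)\n$report";
```

For the real job, the same sub could be called once per pair inside a loop over a list of file-name pairs, opening each file with the three-argument `open` and checking its return value. Since the hash goes out of scope after each call, memory is released between pairs. If the values themselves can contain commas or quotes, writing the final CSV would need proper quoting rather than a plain `join`.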
Replies are listed 'Best First'.

Re: Column by Column Comparison
by moritz (Cardinal) on Mar 31, 2013 at 13:53 UTC

Re: Column by Column Comparison
by poj (Abbot) on Mar 31, 2013 at 13:40 UTC

Re: Column by Column Comparison
by Anonymous Monk on Mar 31, 2013 at 22:46 UTC

Re: Column by Column Comparison
by hdb (Monsignor) on Apr 01, 2013 at 08:43 UTC