I have to create a script for file comparison. The comparison has to be done row by row and then column by column. The script will run on 350 sets of files, i.e. 700 files in total. Around 50 of the files contain more than 1 million records. The key in every file is the first field.

I started with this approach: read the first file and search for each key in the second file. Once the key is found, do a column-by-column comparison, and if the values differ, log the output to a new file, which can later be used to populate a CSV file. The code I started with is as follows:

open(INFILE1,"File1.txt");
my @file1=<INFILE1>;
close INFILE1;
open(INFILE1,"File2.txt");
my @file2=<INFILE1>;
while (<INFILE1>) {
    my @elements = split /\t/, $_;
    my $rowid = @elements[0];
    my @filtered = grep /$rowid/, @file1;
    if ($#filtered == 0) {
        # I will write this in new file..this one's easy
    }
    else {
        my $numelements=@elements;
        my $count=1;
        while ($count <= $numelements) {
            if (@filtered[$count] != @elements[$count]) {
                my $str="File 1 Value-".@filtered[$count]." File2 Value-".@elements[$count];
                print $str;
            }
            $count=$count+1;
        }
    }
}

This isn't working: I am not able to read the values in the array (@filtered) that holds the matching row found by searching for the row id. The comparison (@filtered[$count] != @elements[$count]) is not working. Is it OK? I also tried using -ne. The data in the files can be strings, numbers or dates, but I am assuming that since it is all in text files, my script can treat it as text for comparison. Can you please help me identify the issue?
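
A minimal sketch of the column comparison on its own, assuming tab-separated rows and treating every value as text. It illustrates three points that commonly bite in code like the above: grep returns whole lines, so the matched line has to be split into fields before it can be indexed by column; a single element is written $array[$i]; and text is compared with ne rather than !=, which compares numerically. The name compare_rows and the message format are hypothetical, chosen only for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: compare two tab-separated rows field by field as text.
# Returns one message per differing column (column 0 is the key and is skipped).
sub compare_rows {
    my ($row1, $row2) = @_;
    chomp($row1, $row2);
    my @f1 = split /\t/, $row1, -1;   # -1 keeps trailing empty fields
    my @f2 = split /\t/, $row2, -1;
    my $last = $#f1 > $#f2 ? $#f1 : $#f2;
    my @diffs;
    for my $col (1 .. $last) {
        # A missing field is treated as an empty string (an assumption).
        my $v1 = defined $f1[$col] ? $f1[$col] : '';
        my $v2 = defined $f2[$col] ? $f2[$col] : '';
        push @diffs, "Column $col: File1 Value-$v1 File2 Value-$v2"
            if $v1 ne $v2;            # string comparison, not numeric !=
    }
    return @diffs;
}

print "$_\n" for compare_rows("k1\ta\t10\n", "k1\tb\t10\n");
```

Because the comparison is textual, values such as 1.0 and 1 or two different date formats would be reported as different; whether that is wanted depends on the data.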

Also, I read that a hash will be faster for comparing this huge set of data. Can someone help with a hash approach? Can it be run in a loop over the 700-odd files using a hash/array?
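
One way the hash approach could look, as a sketch under stated assumptions: the files are tab-separated, keys are unique within a file, and one file of each pair fits in memory. The first file is indexed by key in a hash, the second is streamed line by line, and each key is looked up directly instead of being searched for with grep. The names compare_files and @pairs, and the report wording, are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: report differences between two tab-separated files
# whose first field is the key. Returns a reference to a list of messages.
sub compare_files {
    my ($path1, $path2) = @_;
    my (%rows, @report);

    # Index the first file by key so each lookup is a single hash access.
    open my $in1, '<', $path1 or die "Cannot open $path1: $!";
    while (my $line = <$in1>) {
        chomp $line;
        next unless length $line;          # skip blank lines
        my ($key) = split /\t/, $line, 2;
        $rows{$key} = $line;
    }
    close $in1;

    # Stream the second file; it is never held in memory as a whole.
    open my $in2, '<', $path2 or die "Cannot open $path2: $!";
    while (my $line = <$in2>) {
        chomp $line;
        next unless length $line;
        my @f2    = split /\t/, $line, -1; # -1 keeps trailing empty fields
        my $key   = $f2[0];
        my $other = delete $rows{$key};    # remove keys as they are matched
        if (!defined $other) {
            push @report, "$key: only in $path2";
            next;
        }
        my @f1   = split /\t/, $other, -1;
        my $last = $#f1 > $#f2 ? $#f1 : $#f2;
        for my $col (1 .. $last) {
            my $v1 = defined $f1[$col] ? $f1[$col] : '';
            my $v2 = defined $f2[$col] ? $f2[$col] : '';
            push @report, "$key column $col: File1 Value-$v1 File2 Value-$v2"
                if $v1 ne $v2;
        }
    }
    close $in2;

    # Keys still in the hash never appeared in the second file.
    push @report, "$_: only in $path1" for sort keys %rows;
    return \@report;
}

# Hypothetical list of file pairs; in practice it could be built with glob.
my @pairs = (['File1.txt', 'File2.txt']);
for my $pair (@pairs) {
    my ($path1, $path2) = @$pair;
    next unless -e $path1 && -e $path2;    # skip pairs that are missing
    print "$_\n" for @{ compare_files($path1, $path2) };
}
```

The hash is rebuilt for every pair, so memory use is bounded by the largest single file rather than by all 700 files. For the files with more than 1 million records this may still be substantial; if it proves too much, sorting both files by key and reading them in step is a possible alternative.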


In reply to Column by Column Comparison by documents9900
