Two files comparison

ganilmohan has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
Re: Two files comparasion by bart (Canon) on Aug 01, 2008 at 09:32 UTC
To answer your immediate question, instead of trying to guess what you really want to do: no, it's not quite correct. When you use `==` on arrays, you are comparing them in scalar context, and thus you're numerically comparing the number of entries in each array. So yes, it'll return true if the files are the same, but not only then! All that is needed is that they contain the same number of lines.

Since Perl 5.10, Perl has a new operator, the smart match operator, `~~` (see this tutorial), which might actually work in your case and compare line by line. I don't know; I've not used it much yet. Looking at that tutorial, it looks like it ought to.

There must be other solutions, in case you need it to work on an older perl. For example, take a look at how some Test modules do it, such as Test::Deep, where you can simply compare arrays for equality with the function `cmp_deeply`.

A really simple solution is to load the whole files into two scalars, instead of into arrays, and compare them as strings. Just set `$/` to undef and you read the whole file as one line.

```perl
local $/;   # sets $/ to undef for the current scope, so <FH> reads the whole file

open(DAT_Source, $Source_data_file) || die("Could not open file!");
$raw_data_source = <DAT_Source>;

open(DAT_Expected, $Expected_data_file) || die("Could not open file!");
$raw_data_Expected = <DAT_Expected>;

if ($raw_data_source eq $raw_data_Expected) {
    print "DATA MATCHED!";
} else {
    print "DATA NOT MATCHED!";
}
```

That really requires very little change in your code. That's one of the things I really love about Perl: you can often completely change how a piece of code works by just changing a few thingies here and there.
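To make the scalar-context point concrete, here is a minimal sketch. The arrays and the helper `lines_equal` are made up for illustration and are not from the original question; they only show that `==` sees the line counts, while an element-wise `ne` loop sees the content.

```perl
use strict;
use warnings;

# Two hypothetical "files", already read into arrays of lines:
# same number of lines, different content.
my @source   = ("alpha\n", "beta\n");
my @expected = ("alpha\n", "gamma\n");

# == puts both arrays in scalar context, so this compares 2 == 2.
my $same_count = (@source == @expected) ? 1 : 0;

# Element-wise check: same number of lines, and every line equal as a string.
sub lines_equal {
    my ($x, $y) = @_;
    return 0 unless @$x == @$y;
    for my $i (0 .. $#$x) {
        return 0 if $x->[$i] ne $y->[$i];
    }
    return 1;
}

print "count comparison says: ", ($same_count ? "equal" : "different"), "\n";
print "line comparison says:  ",
    (lines_equal(\@source, \@expected) ? "equal" : "different"), "\n";
```

The first line reports "equal" even though the second lines differ, which is exactly the false positive described above.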
Re^2: Two files comparasion by massa (Hermit) on Aug 01, 2008 at 10:03 UTC
Changing even less, he could have replaced

`if( @raw_data_source == @raw_data_Expected )`

with

`if( "@raw_data_source" eq "@raw_data_Expected" )`

and gotten the (presumably) desired comparison :-)

Please, `use strict; use warnings;` !!! :-)

[]s, HTH, Massa (κς,πμ,πλ)
Re^3: Two files comparasion by moritz (Cardinal) on Aug 01, 2008 at 10:11 UTC
`if( "@raw_data_source" eq "@raw_data_Expected" )`

That does work, but if `$" eq $/` there could be cases where you get false positives. Luckily that's not the default (`$"` is a single space and `$/` is a newline), but I feel it's worth mentioning nonetheless.

Update: Uhm, not entirely sure. But there's no need to split the lines into arrays unless you process the data line by line.
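Whether the interpolation trick can misfire depends on what the array elements may contain. The sketch below uses made-up, already chomped data (an assumption; lines read straight from a file keep their trailing newline) to show one collision, and one hedged way around it: compare the element counts as well, and join on a separator assumed never to occur in the data.

```perl
use strict;
use warnings;

# Hypothetical chomped data: two different arrays that produce the
# same text once their elements are joined with the default $" (a space).
my @one = ("a b", "c");
my @two = ("a", "b c");

# Both sides interpolate to "a b c", so this is a false positive.
my $interpolated_equal = ("@one" eq "@two") ? 1 : 0;

# Safer sketch: same element count, then join on "\0".
# Assumption: the data is plain text and never contains a NUL byte.
my $joined_equal =
    (@one == @two && join("\0", @one) eq join("\0", @two)) ? 1 : 0;

print "interpolated: ", ($interpolated_equal ? "equal" : "different"), "\n";
print "joined on NUL: ", ($joined_equal ? "equal" : "different"), "\n";
```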
Re: Two files comparasion by Corion (Patriarch) on Aug 01, 2008 at 08:24 UTC
What did you try? What problems did you encounter? Does Perl return any errors? Maybe you want to use Algorithm::Diff?
Re: Two files comparasion by dHarry (Abbot) on Aug 01, 2008 at 08:35 UTC
As usual, it depends on what *exactly* you want to do. For example, what does "equal" mean, i.e. how do you compare? Do you need an exact match, or can you ignore whitespace, etc.? On Unix/Linux you could use the diff command :-) But based on your file names I guess you run Windows? You might want to give Comparing arrays with text contents a look too. Hope this helps.
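If "equal" is allowed to ignore whitespace differences, one possible reading can be sketched as below. The function `normalize` and the rules it applies (unify line endings, collapse runs of spaces and tabs, trim each line) are assumptions chosen for illustration; the original question does not say which differences should be ignored.

```perl
use strict;
use warnings;

# One possible definition of "equal, ignoring whitespace" (an assumption).
sub normalize {
    my ($text) = @_;
    $text =~ s/\r\n?/\n/g;    # treat Windows and old Mac line endings as "\n"
    $text =~ s/[ \t]+/ /g;    # collapse runs of spaces and tabs to one space
    $text =~ s/^ | $//mg;     # drop the leading/trailing space on each line
    return $text;
}

# Hypothetical file contents, already slurped into scalars.
my $source   = "a  b \r\n\tc\n";
my $expected = "a b\nc\n";

print "exact:      ", ($source eq $expected ? "MATCHED" : "NOT MATCHED"), "\n";
print "normalized: ",
    (normalize($source) eq normalize($expected) ? "MATCHED" : "NOT MATCHED"), "\n";
```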
Re: Two files comparasion by Anonymous Monk on Aug 01, 2008 at 08:34 UTC
perldoc File::Compare
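For reference, a minimal sketch of what that module offers: `compare` from File::Compare returns 0 if the files are equal, 1 if they differ, and -1 on error. The temporary files and the helper `write_temp` are only there to make the example self-contained; they are not part of the original question.

```perl
use strict;
use warnings;
use File::Compare qw(compare);
use File::Temp qw(tempfile);

# Hypothetical helper: write a throwaway file and return its path.
sub write_temp {
    my ($content) = @_;
    my ($fh, $path) = tempfile(UNLINK => 1);
    print {$fh} $content;
    close $fh or die "close failed: $!";
    return $path;
}

my $source  = write_temp("line 1\nline 2\n");
my $same    = write_temp("line 1\nline 2\n");
my $changed = write_temp("line 1\nline 3\n");

# 0 means equal, 1 means different, -1 means an error occurred.
my $r_same = compare($source, $same);
my $r_diff = compare($source, $changed);

print "same:    ", ($r_same == 0 ? "DATA MATCHED!" : "DATA NOT MATCHED!"), "\n";
print "changed: ", ($r_diff == 0 ? "DATA MATCHED!" : "DATA NOT MATCHED!"), "\n";
```

Note that the return value is "false when equal", so the result should be tested against 0 explicitly rather than used directly as a boolean.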
Re: Two files comparasion by moritz (Cardinal) on Aug 01, 2008 at 08:42 UTC
See Is it correct?.
Re: Two files comparison by swampyankee (Parson) on Aug 01, 2008 at 17:13 UTC
If you're checking to see if two files match exactly, there's no need to read either into memory; you can use something like Digest::file, and compare the checksums for equality¹. In the (rather likely) event that you need to determine the actual differences between the files (as in diff or even Windows' fc), your first route should be to use a module like File::Compare.

¹ Some of the checksum algorithms will give false positives, in that identical checksums may be produced by different files. I believe "Algorithm 1" used by sum may be especially prone to this.

emc
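The checksum approach could look roughly like this. It is a sketch, not a recommendation of a particular algorithm: `digest_file_hex` comes from Digest::file, while the choice of "SHA-256", the helper names `write_temp` and `same_digest`, and the temporary files are assumptions made for the example.

```perl
use strict;
use warnings;
use Digest::file qw(digest_file_hex);
use File::Temp qw(tempfile);

# Hypothetical helper: write a throwaway file and return its path.
sub write_temp {
    my ($content) = @_;
    my ($fh, $path) = tempfile(UNLINK => 1);
    print {$fh} $content;
    close $fh or die "close failed: $!";
    return $path;
}

# True if both files have the same digest. The files are streamed
# through the digest, so neither is held in memory as a whole.
sub same_digest {
    my ($file_a, $file_b) = @_;
    return digest_file_hex($file_a, "SHA-256")
        eq digest_file_hex($file_b, "SHA-256");
}

my $source  = write_temp("line 1\nline 2\n");
my $same    = write_temp("line 1\nline 2\n");
my $changed = write_temp("line 1\nline 3\n");

print "same:    ", (same_digest($source, $same)    ? "DATA MATCHED!" : "DATA NOT MATCHED!"), "\n";
print "changed: ", (same_digest($source, $changed) ? "DATA MATCHED!" : "DATA NOT MATCHED!"), "\n";
```

A digest only answers "same or not"; it says nothing about where the files differ.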