Re^3: comparing any two text files and writing the difference to a third file

Ah, yes, reinventing the wheel is a classic learning exercise. Unfortunately, the diff algorithm isn't exactly a simple algorithm to re-invent, especially if you want to handle multi-line changes and the like. When learning a new language, I like to pick an algorithm that I know how to implement in some other language, and then re-implement it in the new language. For your first few attempts, it might not turn out very "perlish", but it gets you learning.
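To give a feel for why a real diff is more work than a line-by-line comparison, here is a minimal sketch of one textbook approach: build a longest-common-subsequence (LCS) table over the two lists of lines, then walk it to emit common, removed, and added lines. The name lcs_diff and the ' ', '-', '+' markers are my own choices for illustration; this is not claimed to be what any particular diff tool does, and it uses memory proportional to the product of the two line counts.

```perl
use strict;
use warnings;

# Hypothetical helper (illustration only): returns a list of [marker, line]
# pairs, where marker is ' ' (in both), '-' (only in old), '+' (only in new).
sub lcs_diff {
    my ($old, $new) = @_;
    my ($n, $m) = (scalar @$old, scalar @$new);

    # $len[$i][$j] = length of the LCS of @$old[$i..] and @$new[$j..]
    my @len = map { [ (0) x ($m + 1) ] } 0 .. $n;
    for my $i (reverse 0 .. $n - 1) {
        for my $j (reverse 0 .. $m - 1) {
            if ($old->[$i] eq $new->[$j]) {
                $len[$i][$j] = $len[$i + 1][$j + 1] + 1;
            } else {
                my ($down, $right) = ($len[$i + 1][$j], $len[$i][$j + 1]);
                $len[$i][$j] = $down >= $right ? $down : $right;
            }
        }
    }

    # Walk the table from the start, emitting one entry per step.
    my @out;
    my ($i, $j) = (0, 0);
    while ($i < $n && $j < $m) {
        if ($old->[$i] eq $new->[$j]) {
            push @out, [' ', $old->[$i]];
            $i++; $j++;
        } elsif ($len[$i + 1][$j] >= $len[$i][$j + 1]) {
            push @out, ['-', $old->[$i++]];
        } else {
            push @out, ['+', $new->[$j++]];
        }
    }
    # Whatever is left over exists in only one of the two inputs.
    push @out, ['-', $old->[$i++]] while $i < $n;
    push @out, ['+', $new->[$j++]] while $j < $m;
    return @out;
}

# Usage: an inserted line is reported as one addition, not as a cascade
# of mismatches the way a pure line-by-line comparison would report it.
print "$_->[0] $_->[1]\n" for lcs_diff([qw(a b c)], [qw(a x b c)]);
```

The point of the sketch is the last comment: once a line is inserted or deleted, everything after it shifts, and only something LCS-like can re-synchronise the two files.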

Once you have something implemented, you could make a post here like "while learning perl, I am trying to convert this algorithm I implemented in somenonperllanguage: <code> .... </code>, and I've successfully re-implemented it in perl here: <code>....</code>. Do you have any suggestions for how to make it more perlish?". Or, if you had problems getting it to work, show us the code you tried, and the expected output vs the output you got. (See also How to ask better questions using Test::More and sample data). Unfortunately, what you provided us was "here's the file-reading code I was able to figure out; now write my diff-algorithm for me", which is less likely to garner detailed answers; even saying "here is the algorithm I'd like to do (....), but I don't know how to implement it in perl" would have likely gotten more help.
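As a rough illustration of the "expected output vs the output you got" style of question, a post can carry its sample data and expectations as a small Test::More script. The function lines_differ below is a hypothetical stand-in for whatever code you are asking about; only the shape of the test matters.

```perl
use strict;
use warnings;
use Test::More tests => 2;

# Hypothetical function under test: returns 0 if two lines are identical,
# 1 otherwise. Replace it with the code you actually want help with.
sub lines_differ {
    my ($left, $right) = @_;
    return $left eq $right ? 0 : 1;
}

# Each test states the input, the expected result, and a description,
# so readers can run the script and see exactly which case fails.
is( lines_differ('This is third line',  'This is third line'),  0, 'identical lines match' );
is( lines_differ('This is second line', 'This is secont line'), 1, 'typo is reported as a difference' );
```

Anyone answering can then paste the script, run it, and see the failing case without having to guess at your input files.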

Going back to your original post, here are some comments on the file-access code you've written. First, add use warnings; use strict; at the top: they catch common mistakes and enforce habits that will make your code better in the long run. Next, there are four things I would comment on in the line open (FH1,$F1)||die "cannot open $F1.\n";

1) Modern Perl generally uses or die "..." rather than || die "..." because of precedence: || binds more tightly than the comma, so without the parentheses it would apply to $F1 rather than to the result of open (this will come up momentarily).
2) It's usually best to use the 3-argument form of open, which is open my $fh1, '<', $F1. (You don't need the parentheses if you use the or form: open my $fh1, '<', $F1 or die "...".)
3) You may have noticed I used my $fh1 instead of FH1: this gives the filehandle lexical scope (so it doesn't clutter the global namespace with FH1-style filehandles), with the added benefit that the file is closed automatically when $fh1 goes out of scope.
4) If you add use autodie; alongside your other use lines at the top, you don't need the || die / or die construct at all.
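Put together, the four points above might look something like this. It is only a sketch: read_lines is a hypothetical helper name, not something from your code, and the commented-out line shows the explicit form you would use without autodie (including $!, which holds the operating system's reason for the failure).

```perl
use strict;
use warnings;
use autodie;    # a failed open or close now dies with a descriptive message

# Hypothetical helper: return all lines of a file, with newlines removed.
sub read_lines {
    my ($path) = @_;
    open my $fh, '<', $path;      # 3-argument open, lexical filehandle
    chomp(my @lines = <$fh>);     # list context: read every remaining line
    return @lines;                # $fh goes out of scope here, closing the file
}

# Without autodie, the explicit check would be written as:
#   open my $fh, '<', $path or die "cannot open $path: $!\n";
```

Because the filehandle lives only inside the sub, there is no global FH1 to collide with another part of the program and no close call to forget.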


Replies are listed 'Best First'.

Re^4: comparing any two text files and writing the difference to a third file
by pryrt (Abbot) on Jan 23, 2019 at 15:35 UTC

Here is a simplistic algorithm that will just compare each line, one at a time, and show whether the individual lines match or not.

use warnings;
use strict;
use autodie;
use File::Compare;
my $F1="version1.txt";
my $F2="count.txt";
my $F3="differ.txt";

#############################
#### adding these sections to create the files for me
{
    open my $fh, '>', $F1;
    print {$fh} <<EOT;
This is line one
This is second line
This is third line
EOT
} # automatically closes file when leaving scope

{
    open my $fh, '>', $F2;
    print {$fh} <<EOT;
This is line 1
This is secont line
This is third line
EOT
} # automatically closes file when leaving scope
#############################

my $cmp = compare($F1,$F2); # [pryrt]: if they're big, it's better to compare once, rather than three times
    #the addition of the USE line is important for this function to work.
if($cmp==0)
    {print"they are the same\n";}
elsif($cmp==1)
    {print"they are different\n";
    #opening/creating all three
     open my $fh1, '<', $F1;
     open my $fh2, '<', $F2;
     open my $fh3, '>', $F3;
     while (<$fh1>)
     {
         last if eof($fh2);
         my $content2=<$fh2>; # reads the next line of $F2 into content2 (scalar context reads one line, not the whole file)  # pryrt: add 'my' for lexical scope
         chomp($_, $content2);
         #haven't yet figured out how to pull out the difference data.
         #### [pryrt]: here's a simple comparison, just highlighting which lines are different
         if($_ eq $content2) {
            # this line is the same
            printf $fh3 "%-20s= %s\n", 'MATCH', $_;
         } else {
            # this line is different
            printf $fh3 "%-20s> %s\n", 'DIFFERENT', '';
            printf $fh3 "%20s< %s\n", $F1, $_;
            printf $fh3 "%20s> %s\n", $F2, $content2;
         }
         #### [/pryrt]
     }
}
elsif($cmp==-1)
    {print"error\n";}
else    {print"something wrong\n";}

print"Please check the file $F3\n";        # pryrt: don't repeat yourself: use the filename variable

### pryrt: I want to display F3 without external help
{
    print "\n===== $F3 =====\n";
    open my $fh, '<', $F3;
    print while(<$fh>); # uses perlish postfix, and perlish default of print using $_
    print "\n===============\n";
}

with output:

they are different
Please check the file differ.txt

===== differ.txt =====
DIFFERENT           > 
        version1.txt< This is line one
           count.txt> This is line 1
DIFFERENT           > 
        version1.txt< This is second line
           count.txt> This is secont line
MATCH               = This is third line

===============

Oh, right, aside from the comments I made, I also added newlines to your print statements.
