nofernandes has asked for the wisdom of the Perl Monks concerning the following question:

Hello everyone!!

I want to compare two files in order to extract the line numbers of the lines where the two files match!!

But the problem is that for example:

File1.txt

Hello how are you!? "Fine"
Are you sure that you are really fine!? "Yes"
What car do you like? "Mustang"
Thank you..
I like Football


File2.txt

"Fine"
"Mustang"
I like Football

Result.txt


Line 1 : "Fine"
Line 3 : "Mustang"
Line 5 : I like Football

With this code:
open (F1,"f1.txt");
open (F2,"f2.txt");
while (<F1>){
    $HASH{$_}++;
}
close(F1);
while (<F2>){
    $HASH{$_}++;
}
foreach $line (sort keys %HASH){
    print "$line\n" if ($HASH{$line}>1);
}

it doesn't give me, for example, "Mustang" or "Fine", because these words or phrases don't start at the beginning of a line!!

How can I solve this??

Thank you all!!

Replies are listed 'Best First'.
Re: Comparing 2 files!!
by Thelonius (Priest) on Jul 14, 2003 at 15:54 UTC
    It's not (just) because the words or phrases don't start at the beginning of a line. If you use hashes like that, the two lines would have to match EXACTLY, byte for byte, without the slightest difference.

    Also, your current program will give you a false positive if a line is duplicated in either of the two files.

    You need to specify more precisely just what you mean by "match." Do you mean that a line of one file is a substring of a line in the other file? If f1.txt has a line with just the four letters "bell", should that match a line in f2.txt that has the phrase "the antebellum south"?

    Once you specify what you want, we can be more help.

      The main purpose is to compare two files and match the contents of one file in order to get the line number of that match!

      One of the files has code comments! And the other has the source code from which the comments were extracted!

      Example:
      File1.txt
      public class Finger { //Commment 1
      public static void main(String[] arguments) {//com 2
      String user;
      String host;
      //comm3

      File2.txt
      //Comment 1
      //com 2
      //comm3
      Result
      Line 1: //Comment 1
      Line 2: //Com 2
      Line 5: //comm3

      Thank you all for your help!!
        If you have grep on your system, this would work:
        grep -F -n -f File2.txt File1.txt
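
If you'd rather stay in Perl, here is a rough sketch of the same idea: treat every line of one file as a literal substring and report the line numbers of the other file that contain it. The helper name find_matches is made up for illustration, and the sample data is copied from the original question and assumed to be chomp()ed already. Untested against real source files, so take it as a starting point only.

```perl
use strict;
use warnings;

# Hypothetical helper: return [line_number, pattern] pairs for every
# line of @$lines that contains one of @$patterns as a literal substring.
sub find_matches {
    my ($patterns, $lines) = @_;
    my @hits;
    my $n = 0;
    for my $line (@$lines) {
        $n++;
        for my $pat (@$patterns) {
            next if $pat eq '';              # an empty pattern would match every line
            if (index($line, $pat) >= 0) {   # literal substring test, no regex involved
                push @hits, [$n, $pat];
                last;                        # report each line only once
            }
        }
    }
    return @hits;
}

# Sample data from the question (assumed already chomp()ed).
my @file1 = (
    'Hello how are you!? "Fine"',
    'Are you sure that you are really fine!? "Yes"',
    'What car do you like? "Mustang"',
    'Thank you..',
    'I like Football',
);
my @file2 = ('"Fine"', '"Mustang"', 'I like Football');

printf "Line %d : %s\n", @$_ for find_matches(\@file2, \@file1);
```

Using index() instead of a regex means characters such as ( or * in a comment are matched literally, which is what grep -F does too. The match is case sensitive, so "fine" and "Fine" are different strings.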
Text::Diff is your friend
by barbie (Deacon) on Jul 14, 2003 at 17:04 UTC
    I think Text::Diff is the kind of thing you are after. I use it to create module patches to send to CPAN authors.

    --
    Barbie | Birmingham Perl Mongers | http://birmingham.pm.org/

Re: Comparing 2 files!!
by roju (Friar) on Jul 14, 2003 at 16:47 UTC

    I hate to point someone away from perl, but this is pretty much the problem domain that fgrep was designed for. Try man fgrep at the command line.

    Basically you pass fgrep a file containing text to match (ie, the comments) and then have it search the other file(s) (ie, the code).

    It looks like you'll want to do something like
    fgrep -nf comments.txt sourcecode.c

Re: Comparing 2 files!!
by hardburn (Abbot) on Jul 14, 2003 at 16:01 UTC

    Assuming your first file is in a consistent format, you can trim off everything not in between quotes before saving to the hash. You could do your first while loop like this:

    while (<F1>){
        s/\A
            [^"]*    # Read until we get to a quote char
            "        # The actual quote char
            ([^"]*)  # Read (and save) until we hit another quote char
        /$1/x;
        $HASH{$_}++;
    }

    This will fail for embedded double-quotes, but most programs fail when that happens, anyway, so I wouldn't worry about it.
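
To also get the line numbers the original question asks for, one possible variation on the trimming idea is sketched below. It reduces each line of both files to the same key (the first double-quoted chunk, quotes included, or the whole line when nothing is quoted) and remembers where each key first appeared. The name match_key and the fall-back to the whole line are assumptions made for this sketch, not something from the posts above, and the sample data is taken from the question.

```perl
use strict;
use warnings;

# Hypothetical helper: return the first double-quoted chunk of a line
# (quotes included), or the whole line if it has no quoted part.
sub match_key {
    my ($line) = @_;
    return $line =~ /("[^"]*")/ ? $1 : $line;
}

# Sample data from the question (assumed already chomp()ed).
my @file1 = (
    'Hello how are you!? "Fine"',
    'Are you sure that you are really fine!? "Yes"',
    'What car do you like? "Mustang"',
    'Thank you..',
    'I like Football',
);
my @file2 = ('"Fine"', '"Mustang"', 'I like Football');

# Map each key from the first file to the line number where it first appears.
my %line_of;
my $n = 0;
for my $line (@file1) {
    $n++;
    my $key = match_key($line);
    $line_of{$key} = $n unless exists $line_of{$key};
}

# Look up each line of the second file under the same key.
for my $want (@file2) {
    my $key = match_key($want);
    print "Line $line_of{$key} : $want\n" if exists $line_of{$key};
}
```

Because only the first appearance of a key is recorded, a line repeated within one file cannot produce a false positive here. The embedded double-quote caveat still applies.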

    ----
    I wanted to explore how Perl's closures can be manipulated, and ended up creating an object system by accident.
    -- Schemer

    Note: All code is untested, unless otherwise stated

Re: Comparing 2 files!!
by talexb (Chancellor) on Jul 14, 2003 at 15:50 UTC

    I! suggest! you! use! the! diff! function! instead! of! reinventing! the! wheel!

    --t. alex
    Life is short: get busy!
      Or diff even ;)

      MJD says "you can't just make shit up and expect the computer to know what you mean, retardo!"
      I run a Win32 PPM repository for perl 5.6.x and 5.8.x -- I take requests (README).
      ** The third rule of perl club is a statement of fact: pod is sexy.

Re: Comparing 2 files!!
by mr_mischief (Monsignor) on Jul 14, 2003 at 18:13 UTC
    open (F1,"f1.txt");
    open (F2,"f2.txt");
    while (<F1>){
        $HASH_a{$_}++; ### two hashes
    }
    close(F1);
    while (<F2>){
        $HASH_b{$_}++; ### two hashes used to compare unique sets
    }
    foreach $line (sort keys %HASH_a){
        print "$line" if ( exists $HASH_b{$line} );
        ### switched this to check for existence of a key
        ### in one hash which we know exists in the other.
        ### also removed the \n in the print, because we
        ### never chomp()ed them off the originals.
    }
    Update: Got rid of a couple of useless lines which tye pointed out to me in the CB.

    Christopher E. Stith