in reply to Comparing files

hi,
I have a question about how to compare 2 different C source files (e.g. Test1.c and stub.c), which contain names of function calls.

My objective is to compare these 2 source files and remove from stub.c (the working file) every function name that also occurs in Test1.c (the reference file).

The problem I have is that my code doesn't seem to be able to do that. The script seems to just copy all the function calls that I have in stub.c, without removing any occurrence of the function calls that are also found in Test1.c.

#!/user/bin/perl -w

#note: %hash1 contains modified functions,
#which has been extracted from a C source file(Test1.c).
#this hash is used for referencing purposes
@array = %hash1;

#initiate a loop counter
$loop = 0;

#using for loop to get rid of new line in the array
for (@array){
    $array[$loop] =~ s/\n//;
    $loop++
}

#opening file for writing
open (Done, ">sim.c") or die "Can't open sim.c :$!\n";

#initiating a new hash and a string for later use
%hash2;
$done;

#a for loop to help popping all the elements in array.
for (0..6){
    $fish = pop @array;
    $fish = pop @array;

    #opening of working file, which is to be compared with
    #the reference array(%hash1)
    open (Local, "stub.c") or die "Can't open stub.c :$!\n";
    for $local(<Local>){
        $local =~ s/\n//;

        #this is used to compare the 2 variable
        #if there is no match, assign it as a key to a hash
        unless ($fish eq $local){
            $hash2{$local}++;
        }
        close Local;
    }
}

#print each key of the hash to verify result
foreach $done ( keys %hash2){
    print "$done\n";
}

Note to monk pg and monk etcshadow: I am the idiot who posted the previous thread under Anonymous Monk. A thousand apologies to monk pg and monk etcshadow. I should have posted the full code

Re: Comparing 2 C files
by jonadab (Parson) on Oct 16, 2003 at 04:24 UTC

    The code posted by etcshadow in the other thread seems to me like it ought to do what you want, or, at least, it ought to do what you say you want, namely, compare two files and get rid of lines in the one that occur in the other.

    Read on for an analysis of your code...


    $;=sub{$/};@;=map{my($a,$b)=($_,$;);$;=sub{$a.$b->()}} split//,".rekcah lreP rehtona tsuJ";$\=$ ;->();print$/
      hi,
      I really deserve to be shot and hanged for not indenting my code!! A million apologies!

      Thanks for the analysis of my code. I have learned quite a bit from it (notably the use of chomp to get rid of the newline and the assigning of key values into the array; I didn't think of that before!).

      However, for the last part, I wish to clarify some things with you. What I really want is for the functions that don't match to be printed out, not the count of the functions that don't match.

      I did try what you suggested, and I traced the problem to the following code:

      unless ($fish eq $local) { $hash2{$local}++; }

      Because a hash keeps only unique keys for whatever is assigned to it, every time I tried to change it, this would happen:

      Test1.c
      function1
      function2

      stub.c
      function1
      stubfunction1
      function2
      stubfunction2

      Now we have 2 files to work with, and the actual output is:

      function1
      stubfunction1
      function2
      stubfunction2

      when actually what I wanted is:

      stubfunction1
      stubfunction2

      So is there anything else that I can try to get the output that I want?
        hi, I really deserve to be shot and hanged for not indenting my code!! A million apologies!

        That was other people complaining about that. For me, it doesn't matter, partly because I spend entirely too much time looking at stuff like this, and partly because I'm an auditory thinker (so visual layout has less impact for me than average) and partly because I use Emacs, so if I wanted your code indented a couple of keystrokes would automatically indent it for me. However, you might find that indenting would make it easier for you to keep track of what's going on, especially if you're a visual thinker.

        About chomp: I'm not sure if I was clear. It only removes newlines from the _ends_ of strings. I guessed that in the case of this code that's where they are, because each string is a line that you read from a file. Those are the cases where you usually use chomp. However, if you ever needed to remove newlines from the middle or beginning of a string, you'd want to use s/\n//g instead.
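        To make the difference concrete, here is a minimal sketch. The variable names and sample strings are made up for illustration and are not taken from your script:

        ```perl
        use strict;
        use warnings;

        # chomp removes only a trailing newline (strictly, a trailing $/).
        my $line = "function1\n";
        chomp $line;
        print "[$line]\n";    # prints [function1]

        # A newline in the middle of the string is left alone by chomp,
        # so after this $text is "first\nsecond".
        my $text = "first\nsecond\n";
        chomp $text;

        # s/\n//g removes every newline, wherever it appears.
        (my $flat = "first\nsecond\n") =~ s/\n//g;
        print "[$flat]\n";    # prints [firstsecond]
        ```

        For lines read from a file, chomp is usually the better fit, since it says exactly what you mean.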

        However, for the last part, I wish to clarify some things with you. What I really want is for the functions that don't match to be printed out, not the count of the functions that don't match.

        Yes, I was guessing that the code didn't do exactly what you really wanted. (That's why you posted here, after all, isn't it?)

        I did try what you suggested, and I traced the problem to the following code:
        unless ($fish eq $local) { $hash2{$local}++; }

        Right. This is the code that adds one to the count each time the line doesn't match. However, I don't think this is your entire problem...

        because a hash would keep unique cases of whatever that is assigned to it,

        Well, the keys are unique, but you're adding one to the value possibly multiple times.
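        A small sketch of that behaviour, using a made-up %seen hash and sample names rather than your actual data:

        ```perl
        use strict;
        use warnings;

        my %seen;

        # The same name arriving twice does not create a second key;
        # it only bumps the value stored under the existing key.
        $seen{$_}++ for qw(function1 stubfunction1 function1);

        for my $name (sort keys %seen) {
            print "$name => $seen{$name}\n";
        }
        # prints:
        # function1 => 2
        # stubfunction1 => 1
        ```

        So printing only the keys hides how many times each one was incremented.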

        Test1.c
        function1
        function2

        stub.c
        function1
        stubfunction1
        function2
        stubfunction2

        Now we have 2 files to work with, and the actual output is:

        function1
        stubfunction1
        function2
        stubfunction2

        when actually what I wanted is:

        stubfunction1
        stubfunction2

        So is there anything else that I can try to get the output that I want?

        Yes, but you'll need to restructure your approach a little. I believe your problem is your approach to the loop. Here is what you currently have:

        for (0..6) {
            $fish = pop @array;
            $fish = pop @array;

            #opening of working file, which is to be compared with
            #the reference array(%hash1)
            open (Local, "stub.c") or die "Can't open stub.c :$!\n";
            for $local(<Local>) {
                $local =~ s/\n//;

                #this is used to compare the 2 variable
                #if there is no match, assign it as a key to a hash
                unless ($fish eq $local){
                    $hash2{$local}++;
                }
                close Local;
            }
        }

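        This loop reads stub.c seven times (0..6 is seven iterations; incidentally, why that many?), each time taking a different string from @array and adding one to the count for every line in stub.c that doesn't match it. As long as there is more than one string to compare against, every line differs from at least one of them, so every line ends up as a key in %hash2. This is not what you want. What you actually want to do is read stub.c only once, checking each line to see whether it matches any of your strings, and print it if it doesn't: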

        open STUB, "stub.c" or die "Cannot open stub.c : $!\n";
        while (<STUB>) {
            chomp;
            # $_ is now a line from stub.c,
            # and we have to decide whether to print it or not.
            # If it's a key in %hash1 we don't want to print it;
            # otherwise, we do:
            print "$_\n" if not exists $hash1{$_};
        }
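        The loop above assumes %hash1 already has the reference names as its keys. For a version you can run without the files, here is a self-contained sketch of the same idea using the sample data from your example. The helper lines_not_in and the arrays @test1 and @stub are hypothetical names introduced only for this illustration; in the real script the lines would come from Test1.c and stub.c.

        ```perl
        use strict;
        use warnings;

        # Hypothetical helper (not from the original script): return the
        # lines that are NOT keys of the reference hash, in original order.
        sub lines_not_in {
            my ($reference, @lines) = @_;
            return grep { !exists $reference->{$_} } @lines;
        }

        # Stand-ins for the contents of Test1.c and stub.c.
        my @test1 = ("function1\n", "function2\n");
        my @stub  = ("function1\n", "stubfunction1\n",
                     "function2\n", "stubfunction2\n");

        # Strip the trailing newline from every element.
        chomp @test1;
        chomp @stub;

        # Build the lookup hash: one key per reference name.
        my %hash1 = map { $_ => 1 } @test1;

        # One pass over the stub lines, one hash lookup per line.
        my @kept = lines_not_in(\%hash1, @stub);
        print "$_\n" for @kept;
        # prints:
        # stubfunction1
        # stubfunction2
        ```

        The point of the design is that the reference names are looked up by key, so stub.c only has to be read once no matter how many names Test1.c contains.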

        $;=sub{$/};@;=map{my($a,$b)=($_,$;);$;=sub{$a.$b->()}} split//,".rekcah lreP rehtona tsuJ";$\=$ ;->();print$/
Re: Comparing 2 C files
by BUU (Prior) on Oct 16, 2003 at 04:00 UTC
    For the love of god, indent your code!
    if( 1 ) {
        #INDENTED STUFF
        if( 0 ) {
            #MORE INDENTED STUFF
        }
    }