in reply to Re^3: comparing contents of two arrays and output differences
in thread comparing contents of two arrays and output differences

Dear RichardK

Thanks a mil for pointing that out. I guess I was confused when tackling the problem. That explains why I could not figure it out.

I will check out that module (I might have already done so, but I'll give it a second go). Just out of curiosity: would the approach then be to open each file in the first array, open the corresponding backup file in the second array, and compare the contents, then close both files and do the same with the next file until all files have been compared?

I am indeed using the shell, but I guess future users will not be familiar with it; that's why I was looking for a more or less "built-in" solution.

I'll post my results once I have had a chance to get back to the code.

Thanks again for providing a new perspective.

Kind regards

C.

Re^5: comparing contents of two arrays and output differences
by roboticus (Chancellor) on Jan 02, 2015 at 15:55 UTC

    PitifulProgrammer:

    When Richard_K mentioned "shell out to `diff`" he didn't mean for the user to use diff manually, but for your program to do the work of creating the command line and running it for the user to get the desired results. Consider this:

    open my $FH, '>', "file_difference_report" or die $!;
    my @base_file_names = ( 'file1', 'file2', 'file3', 'file4' );
    for my $file_name (@base_file_names) {
        if (! -e "$file_name.xml") {
            print "$file_name.xml: Not present ... not interesting file?\n";
            next;
        }
        if (! -e "$file_name.bak") {
            print "$file_name: no backup, so probably not changed\n";
            next;
        }
        # If we get here, we have a .bak and a .xml file, so make another program
        # compare them for us:
        my $output = `diff $file_name.xml $file_name.bak`;
        print $FH "\n\n===== $file_name changes =====\n";
        print $FH $output;
        print $FH "\n\n";
    }

    In the line starting "my $output", we shelled out to use the diff command to compare the files and store the result in $output. From there you can do what you want with the results, such as concatenate it to the end of a report, as done here.
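    If you also want to know whether diff itself succeeded, its exit status can be checked after the call. The following is only a sketch under some assumptions: a Unix-like system with diff on the PATH, and a helper name (diff_files) and temporary demo files that are made up for illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper (not from the thread): run diff on two files and
# return its output plus a flag saying whether the files differ.
sub diff_files {
    my ($old, $new) = @_;

    # List-form pipe open bypasses the shell, so odd file names are safe.
    open my $pipe, '-|', 'diff', $old, $new
        or die "Cannot run diff: $!";
    my $output = do { local $/; <$pipe> };
    $output = '' unless defined $output;
    close $pipe;    # sets $? to diff's wait status

    # diff exits with 0 (identical), 1 (different) or 2 (trouble).
    my $exit = $? >> 8;
    die "diff failed for $old and $new\n" if $exit > 1;

    return ( $output, $exit == 1 );
}

# Demo on two throwaway files that differ in their second line.
my ( $fh1, $old ) = tempfile( UNLINK => 1 );
print $fh1 "a\nb\n";
close $fh1;
my ( $fh2, $new ) = tempfile( UNLINK => 1 );
print $fh2 "a\nc\n";
close $fh2;

my ( $report, $changed ) = diff_files( $old, $new );
print $changed ? $report : "no changes\n";
```

    The list form of open avoids quoting problems with file names that contain spaces, which plain backticks would pass to the shell unprotected.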

    ...roboticus

    When your only tool is a hammer, all problems look like your thumb.

      Dear roboticus

      Thanks a mil for clarifying RichardK's example and for providing a code sample. I like the approach and I think this might be the way to go ( might look nicer to the user, although I personally prefer tables ).

      Be that as it may, one question cropped up while I was trying the code. In your example, the files for the array are hardcoded. Since the number of files will vary in the application scenario(s), I need to read the files into an array.

      When using my previous approach with the glob function, the file names do not match, i.e. the script checks for

      file_02_0.xml.xml:
      file_03_0.xml.xml:
      file_04_0.xml.xml:
      file_05_0.xml.xml:

      and with the .bak files, the script checks for:

      file_02_0.xml.bak.xml:
      file_03_0.xml.bak.xml:
      file_04_0.xml.bak.xml:
      file_05_0.xml.bak.xml:

      I would like to turn this piece of code into a subroutine that will be built into another script, so I guess I can neither hardcode the file names nor pass them via the command line. Secondly, the XML files might be used for further processing, so I would like to keep them separate.

      I have been wrecking my head over how to get around the issue, but nothing I tried has worked. Moreover, I think I cannot change the file tests for .bak and .xml, since what else would there be to check, right? Is there any way I could keep the file tests and use glob and/or File::Find::Rule to keep both file types separate while still doing the comparison as shown here?

      I know that I am missing something quite elemental, but I could not figure it out, please excuse my stupidity.

      Thanks a mil for your help. I am really learning a lot more this way than by just going through one book after the other.

      Kind regards

      C.

      #Separating xml and backup files
      my @xml_files = glob( '*xml' );
      #say for @xml_files;
      my @bak_files = glob( '*bak' );
      #say for @bak_files;

      #Show differences between file_01.xml and file_01.xml.bak, etc...
      open my $FH, '>', "file_difference_report" or die $!;
      my @base_file_names = ( @xml_files, @bak_files );
      print Dumper \@base_file_names;
      print "\n\n\n";
      for my $file_name ( @base_file_names ) {
          if ( ! -e "$file_name.xml" ){
              print "$file_name.xml: Not present ... not interesting file?\n";
              next;
          }
          if ( ! -e "$file_name.bak" ){
              print "$file_name: no backup, so probably not changed\n";
              next;
          }
          # If we get here, we have a .bak and a .xml file, so make another
          # program to compare them for us:
          my $output = 'diff $file_name.xml $file_name.bak';
          print $FH "\n\n===== $file_name changes =====\n";
          print $FH $output;
          print $FH "\n\n";
      }

        PitifulProgrammer:

        Yeah, I hardcoded the filenames to simplify things. For your case, I'd probably load up the array with something like:

        my @files = map { s/\.xml$//; $_ } glob('*.xml');

        The map statement simply trims the ".xml" off the end of the list of XML files. Then when checking for the XML and/or BAK files, we glue 'em on as needed.
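        Wrapped in a subroutine, the same idea might look like the sketch below. It is only an illustration: the helper name (changed_candidates) is made up, it assumes backups are named "<base>.bak" as in the earlier example, it needs Perl 5.14 or later for s///r, and it assumes the directory name contains no spaces, since glob splits its pattern on whitespace.

```perl
use strict;
use warnings;

# Hypothetical helper: return the base names in $dir that have both a
# .xml file and a .bak file. Assumes backups are named "<base>.bak";
# if yours are "<base>.xml.bak", test for "$_.xml.bak" instead.
sub changed_candidates {
    my ($dir) = @_;

    # Strip the extension once, so it can be glued back on as needed.
    my @bases = map { s/\.xml$//r } glob("$dir/*.xml");

    # Keep only the base names that also have a backup file.
    return grep { -e "$_.bak" } @bases;
}

for my $base ( changed_candidates('.') ) {
    print "would compare $base.xml with $base.bak\n";
}
```

        Inside the loop, the shell-out then needs backticks (or qx//) around the diff command; single quotes, as in the code in the question, only produce the literal string and never run diff.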

        ...roboticus

        When your only tool is a hammer, all problems look like your thumb.

Re^5: comparing contents of two arrays and output differences
by 2teez (Vicar) on Jan 02, 2015 at 15:58 UTC

    Or you could use Text::Diff::Table, which is part of Text::Diff, so that the differences are printed as a table, much like running diff -y text1 text2 on a *nix OS, just as RichardK mentioned previously.
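    A minimal sketch of that, assuming Text::Diff is installed from CPAN (the sample records are made up): diff takes two inputs and then an options hash reference, all separated by commas, and returns the report as a string.

```perl
use strict;
use warnings;
use Text::Diff;    # exports diff(); the "Table" style is selected via STYLE

# Two small record lists standing in for the contents of two files.
my @old = ( "alpha\n", "beta\n" );
my @new = ( "alpha\n", "gamma\n" );

# Array references are compared record by record.
my $table = diff( \@old, \@new, { STYLE => 'Table' } );
print $table;

# File names work as inputs too (hypothetical names):
# print diff( 'file1.bak', 'file1.xml', { STYLE => 'Table' } );
```

    As far as I can tell, loading Text::Diff itself, rather than Text::Diff::Table directly, is what makes both diff() and the Text::Diff::Base class available.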

    If you tell me, I'll forget.
    If you show me, I'll remember.
    If you involve me, I'll understand.
    --- Author unknown to me

      Dear 2Teez,

      I've been trying to solve my little programming problem using the module, however, due to my lack of experience, I somehow have not been able to run the comparison properly

      What I did not understand in the CPAN description was the fact that only one array is used for the comparison, e.g.

      diff \@a, $b { STYLE => "Table" };

      I would have expected to have two @arrays for comparison, e.g.

      my @xml_files = glob( '*xml' );
      #say for @xml_files;
      my @bak_files = glob( '*bak' );
      #say for @bak_files;

      #using the Text::Diff::Table for comparison
      my $format = "";
      my @joint_files = @xml_files, @bak_files;
      my @results = diff \@joint_files, $format { STYLE => "Table" };
      say for @results;

      However, I am getting the following error message, which suggests that the module is not installed properly:

      C:\dev>perl comparing_files_3_using_text_diff.pl ./file_compare_on_lists
      Can't locate package Text::Diff::Base for @Text::Diff::Table::ISA at comparing_files_3_using_text_diff.pl line 4.
      Backslash found where operator expected at comparing_files_3_using_text_diff.pl line 20, near "diff \"
              (Do you need to predeclare diff?)
      syntax error at comparing_files_3_using_text_diff.pl line 20, near "diff \"
      Global symbol "%format" requires explicit package name at comparing_files_3_using_text_diff.pl line 20.
      Global symbol "@results" requires explicit package name at comparing_files_3_using_text_diff.pl line 22.

      I suppose I have to use the module as is, but what does $b represent and why can't it be changed to $format?

      Could somebody give me a hint why I am getting the error message despite the module being installed?

      I am confused about these package name declarations; I thought I did everything by the book.

      Thanks a mil in advance for your support

      Kind regards

      C.