edimusrex has asked for the wisdom of the Perl Monks concerning the following question:

This module is driving me crazy. I have two arrays: one acts as a state file, and the other is collected from a URL. I need to compare these two arrays constantly, because the state file is rewritten with the newly populated array once processing is complete. Here is what I am looking for:

1. If the downloaded list has a change, I need to report on that.
2. If an item is removed from the downloaded array, I need to report on that too.

So far the Array::Diff module seems to have what I am looking for, but I can only get the count to print. Printing anything from the added or deleted methods just prints an array reference, not the data it contains. Here is what I have as a test before I implement it in my main program:
    #!/usr/bin/perl
    use warnings;
    use strict;
    use Array::Diff;

    my @old = ( 'a', 'b', 'c' );
    my @new = ( 'b', 'c', 'd' );
    my $diff = Array::Diff->diff( \@old, \@new );
    my $cnt = $diff->count;
    my @add = $diff->added;
    my @del = $diff->deleted;
    print "$cnt\n@add\n@del\n";
So if anyone has used this module before, your help is greatly appreciated.

Replies are listed 'Best First'.
Re: Using Array::Diff
by toolic (Bishop) on Dec 02, 2014 at 22:06 UTC
    The Array::Diff SYNOPSIS hints that added and deleted return array references, not arrays:
        use warnings;
        use strict;
        use Array::Diff;

        my @old = ( 'a', 'b', 'c' );
        my @new = ( 'b', 'c', 'd' );
        my $diff = Array::Diff->diff( \@old, \@new );
        my $cnt = $diff->count;
        my $add = $diff->added;
        my $del = $diff->deleted;
        print "$cnt\n@{$add}\n@{$del}\n";

        __END__
        2
        d
        a
      Thank you. That was easy, guess I've been staring at this for too long.
Re: Using Array::Diff
by Perlbotics (Archbishop) on Dec 02, 2014 at 22:08 UTC

    Array::Diff returns an array reference; see the modified example below:

        use warnings;
        use strict;
        use Array::Diff;
        use Data::Dumper;

        my @old = ( 'a', 'b', 'c' );
        my @new = ( 'b', 'c', 'd' );
        my $diff = Array::Diff->diff( \@old, \@new );
        my $cnt = $diff->count;
        my $add_ref = $diff->added;    # Array::Diff returns an array-reference
        my $del_ref = $diff->deleted;

        print "cnt=$cnt\n";
        print "add: ", Dumper( $add_ref ), "\n";
        print "del: ", Dumper( $del_ref ), "\n";
        print "add-list: ", @{$add_ref}, "\n";
        print "del-list: ", @{$del_ref}, "\n";

        __END__
        cnt=2
        add: $VAR1 = [
                  'd'
                ];

        del: $VAR1 = [
                  'a'
                ];

        add-list: d
        del-list: a

Re: Using Array::Diff
by james28909 (Deacon) on Dec 03, 2014 at 00:47 UTC
    I know this is not using Array::Diff, but it just /might/ be a different way to do this:
        use strict;
        use warnings;

        my @old = qw(one two three);
        my @new = qw(one foo two bar three baz);

        print '@old array' . "\n";
        print "$_\n" for (@old);
        print "\n" . '@new array' . "\n";
        print "$_\n" for (@new), "\n";

        foreach my $element (@new) {
            if ( $element ~~ @old ) {
                print "matched $element";
            }
            else {
                print "\ndidnt match $element...Adding $element to old array\n";
                push( @old, $element );
            }
        }

        print "\n" . '@old array' . "\n";
        print "$_\n" for (@old);
    I am unsure whether this is what you were looking for, but anything in the new array that is not found in the old array is added to the old array, and it will be skipped if it is found again. A counter could easily be added for each non-match as well.
      This does only one part of the job, because it works only one way: it detects elements that are in @new but not in @old, but not elements of @old that are missing from @new (i.e. deleted elements). Array::Diff does the check both ways.

      Also, if the lists are long, the performance is not very good because you essentially have two nested loops (the smart match against an array being an implicit loop), so the complexity is higher than using hashes (which Array::Diff does, if I remember correctly).
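
A minimal sketch of the hash-based approach mentioned above, checking both directions (items only in the new list and items only in the old list). The helper name `list_changes` is made up for illustration and is not part of Array::Diff. It ignores order and duplicates, so it is a set-style comparison and not a drop-in replacement for what Array::Diff computes.

```perl
use strict;
use warnings;

# Hypothetical helper (an assumption, not an Array::Diff function):
# order-insensitive, two-way comparison using hash lookups.
sub list_changes {
    my ( $old, $new ) = @_;

    # Build lookup tables so each membership test is a single hash access.
    my %in_old = map { $_ => 1 } @$old;
    my %in_new = map { $_ => 1 } @$new;

    my @added   = grep { !$in_old{$_} } @$new;    # in new, not in old
    my @deleted = grep { !$in_new{$_} } @$old;    # in old, not in new

    return ( \@added, \@deleted );
}

my @old = qw(a b c);
my @new = qw(b c d);

my ( $added, $deleted ) = list_changes( \@old, \@new );
print "added: @$added\n";
print "deleted: @$deleted\n";
```

Each list is walked a constant number of times, so the cost grows with the combined length of the lists instead of with their product.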

        Yes, I am learning/migrating some of my own scripts to use hashes as well. They are indeed much faster than arrays in the few tests I tried. Sometimes (or more like all the time, lol) I wish I could go through the perldocs and have everything just sink in, clear and understandable, the first time around.
Re: Using Array::Diff (purpose?)
by tye (Sage) on Dec 03, 2014 at 07:52 UTC

    FYI, I've long wondered at the purpose of Array::Diff and haven't found the documentation to be helpful on that point. But looking at it tonight I realized that the interface provided mostly doesn't make sense given the implementation.

    The interface that Array::Diff provides seems to be one that would make sense for a much different implementation. It seems to be offering what looks like a "set difference". That is, the interface looks like one for when the order of the items in each array does not matter. And the documentation does nothing to counter that impression (unless you go read the documentation for Algorithm::Diff in enough detail).

    I don't yet see any way to make use of the information that Array::Diff provides in general when order actually does matter, other than as some general, imprecise feedback that just gives a "feel" for how different the lists are or when all you care about is "is anything different?". I guess you could make use of the information if duplicates are not possible, though doing so would be awkward.

    And the reviews of Array-Diff indicate that I'm not alone in being confused. One reviewer appears to think the purpose is just "are they different at all?". Others are surprised that order matters. If you just want "are they different?" or if order doesn't matter, then you can determine that much more efficiently than Algorithm::Diff does. If order does matter, then Array::Diff is replacing 6 lines of code. If you just want the count, then you can do that with one line of code.

    Here is an example of what Array::Diff tells you:

        @one = qw< a b c d >;
        @two = qw< d a c b >;
        # deleted: b d
        # added:   d b

    If that is the type of information you want for such a case, could you tell me how you actually make use of that information? I am genuinely curious.

    - tye        
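
For the "are they different at all?" case mentioned above, here is a minimal sketch that needs no module. The helper name `lists_differ` is hypothetical, and the sketch assumes that every element is a defined string and that order matters.

```perl
use strict;
use warnings;

# Hypothetical helper (an assumption, not from any module):
# returns 1 if the lists differ in length or at any position, else 0.
sub lists_differ {
    my ( $x, $y ) = @_;

    # Different lengths means different lists; no need to look further.
    return 1 if @$x != @$y;

    # Same length: stop at the first position whose elements differ.
    for my $i ( 0 .. $#$x ) {
        return 1 if $x->[$i] ne $y->[$i];
    }
    return 0;
}

print lists_differ( [qw(a b c d)], [qw(d a c b)] ) ? "different\n" : "same\n";
print lists_differ( [qw(a b c)],   [qw(a b c)] )   ? "different\n" : "same\n";
```

The loop returns as soon as it finds a mismatch, so in the worst case it makes one pass over the lists.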

      Well, now that I know how to use this module, I use it quite often, mostly for managing our Cisco phone server. For instance, I back up the configuration file for our phone server if any changes have been applied since the last check. The output of Array::Diff is important because we forward our phone server every day, so with the added function of Array::Diff I can parse that information and skip backing it up. So it has proved to be very useful to me, anyway.

        Your description of how you use it sounds to me like order doesn't matter. Perhaps your arrays are always sorted so this aspect doesn't impact the end result (just impacts the efficiency of getting to the result)?

        Finding the set difference more efficiently is pretty easy:

            my @old = ...;
            my @new = ...;

            my %added;
            @added{@new} = ();
            delete @added{@old};
            my @added = keys %added;

            # or

            my %is_old = map { $_ => 1 } @old;
            my @added = grep ! $is_old{$_}, @new;

        - tye