in reply to Using Array::Diff

FYI, I've long wondered at the purpose of Array::Diff and haven't found the documentation to be helpful on that point. But looking at it tonight I realized that the interface provided mostly doesn't make sense given the implementation.

The interface that Array::Diff provides seems to be one that would make sense for a much different implementation. It seems to be offering what looks like a "set difference". That is, the interface looks like one for when the order of the items in each array does not matter. And the documentation does nothing to counter that impression (unless you go read the documentation for Algorithm::Diff in enough detail).

I don't yet see any way to make use of the information that Array::Diff provides in general when order actually does matter, other than as some general, imprecise feedback that just gives a "feel" for how different the lists are or when all you care about is "is anything different?". I guess you could make use of the information if duplicates are not possible, though doing so would be awkward.

And the reviews of Array-Diff indicate that I'm not alone in being confused. One reviewer appears to think the purpose is just "are they different at all?". Others are surprised that order matters. If you just want "are they different" or if order doesn't matter, then you can determine that much more efficiently than Algorithm::Diff does. If order does matter, then Array::Diff is replacing 6 lines of code. If you just want the count, then you can do that with one line of code.
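As a rough sketch of the cheaper checks (not anything Array::Diff provides), the two questions "are they different at all?" and "do they hold the same items regardless of order?" can each be answered in a single pass with core Perl. The sub names arrays_equal and same_items are made up for illustration, and the code assumes every item is a defined string:

```perl
use strict;
use warnings;

# Order matters: equal only if same length and same item at every index.
sub arrays_equal {
    my ($x, $y) = @_;
    return 0 if @$x != @$y;
    for my $i (0 .. $#$x) {
        return 0 if $x->[$i] ne $y->[$i];
    }
    return 1;
}

# Order does not matter, but duplicates do: compare per-item counts.
sub same_items {
    my ($x, $y) = @_;
    return 0 if @$x != @$y;
    my %count;
    $count{$_}++ for @$x;
    for my $item (@$y) {
        return 0 unless $count{$item};    # item missing or already used up
        $count{$item}--;
    }
    return 1;
}

my @one = qw< a b c d >;
my @two = qw< d a c b >;
print "same order: ", (arrays_equal(\@one, \@two) ? "yes" : "no"), "\n";    # same order: no
print "same items: ", (same_items(\@one, \@two)   ? "yes" : "no"), "\n";    # same items: yes
```

Both subs stop at the first mismatch and never compute a longest common subsequence, which is where the savings over a full diff would come from.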

Here is an example of what Array::Diff tells you:

    @one = qw< a b c d >;
    @two = qw< d a c b >;
    # deleted: b d
    # added:   d b

If that is the type of information you want for such a case, could you tell me how you actually make use of that information? I am genuinely curious.

- tye        

Re^2: Using Array::Diff (purpose?)
by edimusrex (Monk) on Dec 15, 2014 at 19:59 UTC
    Well, now that I know how to use this module, I use it quite often, mostly for managing our Cisco phone server. For instance, I back up the configuration file for our phone server if any changes have been applied since the last check. The output of Array::Diff is important because we forward our phone server every day, so with the added function of Array::Diff I can parse that information and skip backing it up. So it has proved to be very useful to me, anyway.

      Your description of how you use it sounds to me like order doesn't matter. Perhaps your arrays are always sorted so this aspect doesn't impact the end result (just impacts the efficiency of getting to the result)?

      Finding the set difference more efficiently is pretty easy:

          my @old = ...;
          my @new = ...;

          my %added;
          @added{@new} = ();
          delete @added{@old};
          my @added = keys %added;

          # or

          my %is_old = map { $_ => 1 } @old;
          my @added = grep ! $is_old{$_}, @new;
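      If both directions are wanted (what was added and what was removed), the same hash idea extends naturally. This is only a sketch: set_diff is a hypothetical helper name, the sample lists are invented, and it assumes order and duplicates are irrelevant and that items are defined strings:

```perl
use strict;
use warnings;

# Hypothetical helper: returns (\@added, \@removed) as a set difference,
# ignoring both the order of items and any duplicates.
sub set_diff {
    my ($old, $new) = @_;
    my %in_old = map { $_ => 1 } @$old;
    my %in_new = map { $_ => 1 } @$new;
    my @added   = grep { !$in_old{$_} } keys %in_new;
    my @removed = grep { !$in_new{$_} } keys %in_old;
    # Sort so the result does not depend on hash ordering.
    return ([sort @added], [sort @removed]);
}

my ($added, $removed) = set_diff([qw< a b c d >], [qw< a c e >]);
print "added: @$added\n";        # added: e
print "removed: @$removed\n";    # removed: b d
```

      For a pure reordering such as qw< a b c d > versus qw< d a c b >, both returned lists are empty, which is the set-difference answer.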

      - tye