in reply to Subtracting Lists

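my %results;
my @goodstuff = (1, 2, 3, 4, 5);
my @badstuff  = (1, 3);

# Count how many times each value occurs across both arrays.
for (@goodstuff) { $results{$_}++ }
for (@badstuff)  { $results{$_}++ }

# A count of 1 means the value was seen only once overall.
for (keys %results) {
    if ($results{$_} == 1) {
        print "Value: " . $_ . " was found in only one array\n";
    }
}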

I think that is right, but I just coded it off the top of my head, so no promises...

Re^2: Subtracting Lists
by ikegami (Patriarch) on Jun 02, 2008 at 00:54 UTC

    You're correct in saying your code finds elements which are in only one of the sets. But the OP asked for an implementation to find the difference of two sets. Your code only finds the difference when @badstuff is a subset of @goodstuff.
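
    To illustrate the point, here is a minimal sketch. The sample values (in particular the extra 6 in @badstuff) are made up for illustration and are not from the original question. It compares the counting approach with a plain difference once @badstuff is no longer a subset of @goodstuff:

```perl
use strict;
use warnings;

# Hypothetical sample data: 6 appears only in @badstuff.
my @goodstuff = (1, 2, 3, 4, 5);
my @badstuff  = (1, 3, 6);

# Counting approach: keep values seen exactly once across both arrays.
my %count;
$count{$_}++ for @goodstuff, @badstuff;
my @only_one = sort { $a <=> $b } grep { $count{$_} == 1 } keys %count;

# Difference (@goodstuff minus @badstuff) for comparison.
my %bad;
@bad{@badstuff} = ();
my @difference = grep { !exists $bad{$_} } @goodstuff;

print "in only one array: @only_one\n";    # 2 4 5 6
print "difference:        @difference\n";  # 2 4 5
```

    The first list is the symmetric difference, so it picks up the 6 that was never in @goodstuff.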

    A solution:

    my %result;
    @result{ @goodstuff } = ();
    delete @result{ @badstuff };
    my @result = keys %result;

      The OP did not state whether he wanted the results in the same order as the original array.

      Even without that requirement, if the original array contained duplicate values, a solution that collapsed the original into a hash would eliminate duplicates and, IMHO, provide a wrong result.
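
      A rough sketch of the concern, using made-up sample data with a duplicate and a non-sorted order. The grep version here is only one possible way to avoid collapsing the array, and is not claimed to be any particular poster's solution:

```perl
use strict;
use warnings;

# Hypothetical sample data: 2 appears twice, and 5 comes first.
my @goodstuff = (5, 1, 2, 2, 3, 4);
my @badstuff  = (1, 3);

# Hash-based difference: duplicates collapse and the original order is lost.
my %seen;
@seen{@goodstuff} = ();
delete @seen{@badstuff};
my @collapsed = sort { $a <=> $b } keys %seen;   # sorted only to make the output stable

# Filtering the original array: duplicates and order are preserved.
my %bad;
@bad{@badstuff} = ();
my @kept = grep { !exists $bad{$_} } @goodstuff;

print "collapsed: @collapsed\n";   # 2 4 5
print "kept:      @kept\n";        # 5 2 2 4
```

      Which of the two results is "right" depends on whether the input is meant to be treated as a set or as a list.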

      However, 3 good solutions which do not collapse the original array have been provided by others, below.

      I would suggest that educated_foo's (++) is the most portable, readable one with no dependencies.

           "A fanatic is one who redoubles his effort when he has forgotten his aim."—George Santayana

        I would suggest that educated_foo's (++) is the most portable, readable one with no dependencies.

        That isn't the OP's criterion ("I am trying to find the fastest (not necessarily most elegant) way"). I think both his and mine are similarly fast, however.

        I specifically avoided undef @h{}; for being cryptic.

        I did appreciate how it combines the extraction of the result into the operation itself, but it's less modular (i.e. it complicates things if you want to work with the resulting set).

        By definition, sets have no duplicates.

        Update: Hum, I guess he didn't use the word "set", but "difference" is an operation that produces a set.

Re^2: Subtracting Lists
by blazar (Canon) on Jun 02, 2008 at 16:54 UTC

    What if there are repetitions in either of the arrays? And the OP asked for an efficient solution: while I have not tested your algorithm, and it is potentially interesting for being linear in the sizes of the arrays, I believe that building a hash and then iterating over its keys will make it relatively slow for many reasonable data sets. (I don't think this would matter, but the OP is convinced it does...)
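
    A small sketch of the repetition problem, with sample values invented for illustration: a value repeated within a single array reaches a count of 2, so the == 1 test skips it even though it never appears in the other array.

```perl
use strict;
use warnings;

# Hypothetical sample data: 2 is repeated in @goodstuff, 3 in @badstuff.
my @goodstuff = (1, 2, 2, 4);
my @badstuff  = (1, 3, 3);

# Counting approach from the parent post.
my %results;
$results{$_}++ for @goodstuff, @badstuff;
my @reported = sort { $a <=> $b } grep { $results{$_} == 1 } keys %results;

print "reported: @reported\n";   # 4 (2 and 3 are missed)
```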

    --
    If you can't understand the incipit, then please check the IPB Campaign.