Subtracting Lists

JimSki has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
Re: Subtracting Lists by educated_foo (Vicar) on Jun 02, 2008 at 04:54 UTC
Hashes are your friends: `sub diff { my ($a, $b) = @_; my %b; undef @b{@$b}; grep !exists $b{$_}, @$a; }` [download]	[reply] [d/l]
Re: Subtracting Lists by pc88mxer (Vicar) on Jun 02, 2008 at 00:38 UTC
Here's a solution using `List::MoreUtils`: `use List::MoreUtils qw(any); @diff = map { my $x = $_; (any { $x eq $_ } @badstuff) ? () : $x; } @badstuff;` [download] Admittedly it isn't very efficient, but it was kinda fun to write.	[reply] [d/l] [select]
Re^2: Subtracting Lists by blazar (Canon) on Jun 02, 2008 at 17:14 UTC
I personally believe that `@y = map { condition($_) ? $_ : () } @x;` [download] is generally spelt `@y = grep { condition($_) } @x;` [download] Of course, you have the condition reversed, but then a `!` would do, and I doubt it would impose a considerable penalty in terms of efficience. `--` If you can't understand the incipit, then please check the IPB Campaign.	[reply] [d/l] [select]
Re: Subtracting Lists by Herkum (Parson) on Jun 02, 2008 at 00:38 UTC
`my %results; @goodstuff = (1,2,3,4,5); @badstuff = (1,3); for (@goodstuff) { $results{$_}++ } for (@badstuff} { $results{$_}++ } for (keys %results) { if ($results{$_} == 1) { print "Value: " . $_ . " was found in only one array\n"; } }` [download] I think that is right, I just coded it off the top of my head so no promises...	[reply] [d/l]
Re^2: Subtracting Lists by ikegami (Patriarch) on Jun 02, 2008 at 00:54 UTC
Your correct in saying your code finds elements which are only in one of the sets. But the OP asked for an implementation to find the difference of two sets. Your code only finds the difference when @badstuff is a subset of @goodstuff. A solution: `my %result; @result{ @goodstuff } = (); delete @result{ @badstuff }; my @result = keys %result;` [download]	[reply] [d/l]
Re^3: Subtracting Lists by NetWallah (Canon) on Jun 02, 2008 at 06:00 UTC
The O.P did not state if he wanted the results in the same order as the original array. Even without that requirement, if the original array contained duplicate values, a solution that collapsed the original into a hash would eliminate duplicates, and , IMHO, provide a wrong result. However, 3 good solutions which do not collapse the original array have been provided by others, below. I would suggest that educated_foo's (++) is the most portable,readable one with no dependencies. "A fanatic is one who redoubles his effort when he has forgotten his aim."—George Santayana	[reply]
Re^4: Subtracting Lists by ikegami (Patriarch) on Jun 02, 2008 at 06:12 UTC
Re^5: Subtracting Lists by educated_foo (Vicar) on Jun 02, 2008 at 16:38 UTC
Re^4: Subtracting Lists by ikegami (Patriarch) on Jun 02, 2008 at 06:03 UTC
Re^5: Subtracting Lists by NetWallah (Canon) on Jun 02, 2008 at 06:09 UTC
Re^5: Subtracting Lists by blazar (Canon) on Jun 02, 2008 at 17:03 UTC
Re^2: Subtracting Lists by blazar (Canon) on Jun 02, 2008 at 16:54 UTC
What if there are repetitions in either of the arrays? And the OP asked for a efficient solution: while I have not tested your algorithm, and it is potentially interesting for being linear in the sizes of the arrays, I believe that building a hash and then iterating over its keys will make it slow enough for many reasonable data sets. (I don't think this would matter, but the OP is convinced it does...) `--` ~~If you can't understand the incipit, then please check the IPB Campaign.~~	[reply] [d/l]
Re: Subtracting Lists by jettero (Monsignor) on Jun 02, 2008 at 00:43 UTC
Maybe overkill depending on what you're working on ... Set::Scalar is one of my favorites for problems like that (but bigger). -Paul	[reply]
Re: Subtracting Lists by poolpi (Hermit) on Jun 02, 2008 at 08:40 UTC
See Set::IntSpan From the doc : << Sets are stored internally in a run-length coded form. This provides for both compact storage and efficient computation. In particular, set operations can be performed directly on the encoded representation. >> #!/usr/bin/perl -w use strict; use Data::Dumper; use Set::IntSpan; my $good_run_list = '1-5'; my $bad_run_list = '1,3'; my ( $good_set, $bad_set, $x_set ); if ( valid Set::IntSpan $good_run_list and valid Set::IntSpan $bad_run +_list ) { $good_set = new Set::IntSpan $good_run_list; $bad_set = new Set::IntSpan $bad_run_list; } else { print "Please verify your run lists\n" } $x_set = $good_set ^ $bad_set; # Or $x_set = $good_set->X($bad_set); finite $x_set and my @elements = elements $x_set; print Dumper \@elements; __END__ Output: $VAR1 = [ 2, 4, 5 ]; [download] hth, PooLpi 'Ebry haffa hoe hab im tik a bush'. Jamaican proverb	[reply] [d/l]
Re: Subtracting Lists by blazar (Canon) on Jun 02, 2008 at 16:46 UTC
Hello, I'm relatively new to PERL and have a pretty basic question. I personally believe that the very fist thing you should know is that it's not PERL. Is GREP usable in this instance? It's certainly usable. But it all depends on the code you tell it to execute. One common technique to check whether and item belongs to an array used to involve another grep, but that would be inefficient, since it would iterate over all items of the bad list. Thus a better choice would be List::Util's `first()` which stops as early as an item is found. With 5.10, however, you have a specialized operator to do this kinda things: `~~`: `#!/usr/bin/perl use strict; use warnings; use 5.010; my @goodstuff=1..5; my @badstuff=(1,3); my @newstuff = grep !($_ ~~ @badstuff) => @goodstuff; say "@newstuff"; __END__` [download] Read more... (712 Bytes) `--` If you can't understand the incipit, then please check the IPB Campaign.	[reply] [d/l] [select]
Re: Subtracting Lists by Anonymous Monk on Jun 02, 2008 at 13:45 UTC
make a hash with the keys beeing elements from the first list. do the same for the other list. you end up with 2 hashes. now you can traverse the keys of one hash and see if it's in the other hash. that way would be much faster than the usual O(n^2) you'd get from doing traversal of the second list for every element of the first one.	[reply]