
I really just posted it as an interesting alternative. The method of marking the array directly was the main focus. I already said it'd run more slowly than some others.

It's actually not bad when the subset has 32 or so items or fewer, or if @a has lots of duplicates that happen to be in @b. There are no function calls in the tight inner loops to slow it down.

The grep is the biggest memory concern, and that's an implementation detail of the language. The original post asked for grep and a hash. I offered an array instead of a hash, which should save some memory by itself. I could splice @c (or @a) in the foreach, but perlsyn specifically warns against modifying an array while looping over it. I could pop off each element and push it back on only if it's defined. That seems like a lot of work in response to a request for a simple solution which could include grep, and I'm sure BrowserUk could figure that part out anyway. Mine's already not the easiest one here to understand.
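For what it's worth, the "take each element off and put it back only if it's defined" idea might look something like the sketch below. It assumes a shift from the front rather than a pop from the end, so that the original order survives one full rotation. The name compact_in_place is made up for illustration and is not from the thread, and I haven't measured whether this actually saves memory over grep.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): drop undef slots in place by
# rotating the array once, instead of building a second list with grep.
sub compact_in_place {
    my ($aref) = @_;
    my $count = @$aref;              # fix the number of passes up front
    for ( 1 .. $count ) {            # loops over a counter, not the array
        my $item = shift @$aref;     # take one element off the front
        push @$aref, $item if defined $item;    # keep only defined values
    }
    return $aref;
}

my @c = ( 42, undef, 43, undef, 44 );
compact_in_place( \@c );
print join( ', ', @c ), "\n";
```

Because the loop runs over a count rather than over @c itself, it sidesteps the perlsyn caveat about changing an array inside a foreach over that same array.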

If the memory use issue is due to thousands of small arrays quadrupling in size, then my solution could be useful. If the problem is that the actual production arrays are huge or that the subset arrays are fairly long, then it won't be.

That solution can also pretty easily be altered so that, by sorting only @b, duplicates within @b do not cause another loop through @a. It's not a giant optimization, but it could slow the growth in run time substantially if the typical data set has lots of duplicates in the subset array. I have no idea how prevalent duplicates within that array actually are.

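my @a = ( 42, 42, 43, 43, 43, 44, 45, 46, 41, -13 );
my @b = ( 43, 45, -13, 43 );
my @c = @a;

# Sort numerically, to match the == comparisons below, so that equal
# values in @b end up next to each other.
@b = sort { $a <=> $b } @b;

for my $i ( 0 .. $#b ) {
    # Skip a value of @b that was already handled on the previous pass.
    next if $i > 0 && $b[ $i ] == $b[ $i - 1 ];
    for my $j ( 0 .. $#c ) {
        next unless defined $c[ $j ];
        # Mark the first match as removed, then move on to the next value.
        $c[ $j ] = undef, last if $c[ $j ] == $b[ $i ];
    }
}
@c = grep { defined } @c;    # drop the marked slots
print join( ', ', @c ), "\n";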

BTW, why do you use a for loop to push the elements of @a onto @c? Why not use push @c, @a; instead? Is that a memory optimization peculiar to how perl handles push with a list or array argument internally?
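To be clear about which two forms I mean, here is a small sketch with made-up variable names. It only shows that both produce the same array contents; it says nothing about how perl allocates memory for either one.

```perl
use strict;
use warnings;

my @a = ( 42, 42, 43, 43, 43, 44, 45, 46, 41, -13 );

# One element per push call, via a statement-modifier for loop.
my @loop_copy;
push @loop_copy, $_ for @a;

# The whole array in a single push call.
my @list_copy;
push @list_copy, @a;

print "same contents\n" if "@loop_copy" eq "@list_copy";
```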

In reply to Re^3: Difference arrays. by mr_mischief
in thread Difference arrays. by BrowserUk
