Re: Non-destructive array processing
by pdcawley (Hermit) on Jan 20, 2003 at 21:59 UTC
|
I choose neither of the above.
my @array = 1..10;
my @ary_copy = (@array);
while (my @chunk = splice @ary_copy, 0, 2) {
print "Chunk: @chunk\n";
}
print "Original array is still intact! (@array)\n";
Straightforward, easy to understand and not overly clever. (But the closure trick is very clever. I wonder which is faster.)
|
|
Yes, the closure trick is clever, but I don't wonder which is faster. Aside from my assumption that it must be slower due to the function overhead in Perl, the obfuscation factor alone would cause me to eschew it. However, there's a more subtle problem at work that's going to kill many programmers. Since @_ aliases the argument list, the following two lines are equivalent:
my $r1 = sub { \@_ }->(@array);
my $r2 = \@array;
What that means is that any processing on $r1 is going to affect @array. The following snippet will clarify.
use Data::Dumper;
my @array = 1..10;
my $aref = sub {\@_}->(@array);
$_++ foreach @$aref;
print Dumper \@array;
my $r2 = \@array;
print \@array,"\n",$r2;
The way to get around that with a closure is to do this:
my $r = sub {my @a = @_; \@a}->(@array);
Clearly that's not going to be faster than simply copying the array.
Cheers,
Ovid
New address of my CGI Course.
Silence is Evil (feel free to copy and distribute widely - note copyright text)
|
|
Since @_ aliases the argument list, the following two lines are equivalent:
my $r1 = sub { \@_ }->(@array);
my $r2 = \@array;
No they're not :-)
The first is a reference to an array that has every element aliased to every element of @array.
The second is a reference to @array.
The "trick" wouldn't work otherwise, since changing $r2 will change @array. For example.
my @array = (1..10);
my $r1 = sub { \@_ }->(@array);
pop @$r1;
print "unchanged @array\n";
my $r2 = \@array;
pop @$r2;
print "changed @array\n";
gives us
unchanged 1 2 3 4 5 6 7 8 9 10
changed 1 2 3 4 5 6 7 8 9
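To make the distinction concrete, here is a minimal sketch (the name $aliases is illustrative, not from the posts above). It relies only on what the two snippets already show: the array behind \@_ is a separate container, but its elements are aliases to the elements of @array. Structural changes stay local; element writes go through.

```perl
use strict;
use warnings;

my @array   = (1 .. 5);
my $aliases = sub { \@_ }->(@array);

# Structural change: only the alias array shrinks.
pop @$aliases;

# Element change: the write goes through the alias into @array.
$aliases->[0] = 'changed';

print "array:   @array\n";      # changed 2 3 4 5
print "aliases: @$aliases\n";   # changed 2 3 4
```

So both posters are partly right: splice, pop and shift on the alias array are safe, while assigning to its elements is not.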
|
|
|
|
Heh. I'm so used to slinging objects around rather than simple scalars I just took it as read that it would be a shallow copy. Note that, in the original case that's not a problem because assigning to @chunk makes a copy of the value.
|
|
|
|
I'm so glad I read the other replies before posting mine. This is exactly what I would have posted.
The cleverness in both those examples from Juerd is OK for personal and even module code for CPAN, but IMO generally unusable within a work/production context. First off, they don't really look like they do what they do; second, they are confusing and error-prone. Whereas yours looks exactly like what it does. No maintenance programmer is going to get confused years after I've left the company.
++
--- demerphq
my friends call me, usually because I'm late....
|
|
No maintenance programmer is going to get confused years after I've left the company.
Now there's a motto to live by.
|
|
unusable within a work/production context
I wouldn't handle large data sets in production code. This is primarily for one-time hacks, but I wondered what other people would prefer.
No maintenance programmer is going to get confused years after I've left the company.
Note that if code like this ever goes into production, I do of course add proper comments, including a note that you shouldn't use @$r elsewhere (re your other post).
Juerd
- http://juerd.nl/
- spamcollector_perlmonks@juerd.nl (do not use).
|
|
|
|
my @ary_copy = (@array); straightforward, easy to understand
I agree, and it is exactly how I usually do this. Unfortunately, I ran into some huge data and couldn't copy without installing additional RAM :)
But the closure trick is very clever. I wonder which is faster
Wonder no more, unless my benchmark is wrong, of course :)
#!/usr/bin/perl -w
use strict;
use Benchmark qw(cmpthese);
our @array;
sub pdcawley {
my @copy = @array;
while (my @chunk = splice @copy, 0, 2) { }
}
sub juerd {
my $refs = sub { \@_ }->(@array);
while (my @chunk = splice @$refs, 0, 2) { }
}
sub bench {
printf "\n\e[1m%s\e[0m\n", shift;
cmpthese(-10, { pdcawley => \&pdcawley, juerd => \&juerd });
}
@array = (1) x 32767;
bench "Long array, tiny values";
@array = ("x" x 32) x 32767;
bench "Long array, small values";
@array = (1) x 32;
bench "Short array, tiny values";
@array = ("x" x 32) x 32;
bench "Short array, small values";
@array = ("x" x (2**20)) x 32;
bench "Short array, large values";
@array = ("x" x (8 * 2**20)) x 32;
bench "Short array, huge values";
(Note: stripped)
Long array, tiny values
pdcawley 26.1/s -- -17%
juerd 31.4/s 20% --
Long array, small values
pdcawley 12.9/s -- -38%
juerd 20.7/s 60% --
Short array, tiny values
pdcawley 32909/s -- -1%
juerd 33197/s 1% --
Short array, small values
pdcawley 19203/s -- -17%
juerd 23084/s 20% --
Short array, large values
pdcawley 1.83/s -- -53%
juerd 3.89/s 112% --
Short array, huge values
pdcawley 4.32 s/iter -- -53%
juerd 2.04 s/iter 112% --
I'd like to test it with an array of 32 elements of 20 MB each, but the copy doesn't fit in memory.
Anyhow, it seems that using the array of aliases is much more efficient than using a copy, especially with large data sets.
Juerd
- http://juerd.nl/
- spamcollector_perlmonks@juerd.nl (do not use).
|
|
Crumbs. Clarity really costs in some cases, doesn't it?
I normally fight shy of commenting code if I can possibly help it, generally preferring to sweat over making the code as clear as possible, but if I found myself having to use that trick then I'd definitely fence it around with comments.
Re: Non-destructive array processing
by jmcnamara (Monsignor) on Jan 20, 2003 at 22:19 UTC
|
I like this a little better:
for my $i (0 .. $#array/2) {
my @chunk = @array[2*$i, 2*$i + 1];
print "Chunk: @chunk\n";
}
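If the chunk size is not fixed at 2, the same index arithmetic generalises. A rough sketch follows, with $size and @chunks as illustrative names that are not in the post above. It clamps the last slice so that a short final chunk holds no undef elements.

```perl
use strict;
use warnings;

my @array = 1 .. 7;
my $size  = 3;      # illustrative chunk size
my @chunks;         # collected only so the result can be inspected

my $start = 0;
while ($start < @array) {
    my $end = $start + $size - 1;
    $end = $#array if $end > $#array;   # clamp the final, shorter chunk
    my @chunk = @array[$start .. $end];
    push @chunks, "@chunk";
    print "Chunk: @chunk\n";            # 1 2 3, then 4 5 6, then 7
    $start += $size;
}
```

Like the slice version above, this never touches @array itself, so no copy is needed.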
--
John.
Re: Non-destructive array processing
by BrowserUk (Patriarch) on Jan 20, 2003 at 22:54 UTC
|
Another alternative, which I think I prefer. Best thing is you don't get "Use of uninitialized value in join or string at ..." if the array size isn't an exact multiple of the chunk size.
sub getIter (\@$;$) {
my ($ref, $size, $next) = @_;
$next ||= 0;
return sub {
$next = 0, return () unless $next <= $#$ref;
my $start = $next;
$next = $next+$size <= $#$ref ? $next+$size-1 : $#$ref;
@$ref[ $start .. $next++ ]
}
}
my $iter = getIter( @array, 2 );
while( my @chunk = $iter->() ) {
print "Chunk: @chunk\n";
}
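As an illustration of the optional third argument, here is a small sketch that reuses getIter as posted and passes a starting index. The array contents and the offset 3 are arbitrary choices for the example. One behaviour worth noticing: once exhausted, the iterator rewinds to index 0, not to the offset it was created with.

```perl
use strict;
use warnings;

# getIter as posted above, unchanged apart from layout.
sub getIter (\@$;$) {
    my ($ref, $size, $next) = @_;
    $next ||= 0;
    return sub {
        $next = 0, return () unless $next <= $#$ref;
        my $start = $next;
        $next = $next + $size <= $#$ref ? $next + $size - 1 : $#$ref;
        @$ref[ $start .. $next++ ];
    };
}

my @array = 1 .. 7;
my $iter  = getIter(@array, 2, 3);   # start at index 3

my @seen;
while (my @chunk = $iter->()) {
    push @seen, "@chunk";
    print "Chunk: @chunk\n";         # "4 5", then "6 7"
}

# After exhaustion the iterator starts over from index 0.
my @first = $iter->();
print "After reset: @first\n";       # "1 2"
```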
Examine what is said, not who speaks.
The 7th Rule of perl club is -- pearl clubs are easily damaged. Use a diamond club instead.
|
|
In what context would you use the $next variable? It looks to be useful only for setting a first value to return. (Sorta like "Skip the first N values" thingy...)
Is that what it is?
------ We are the carpenters and bricklayers of the Information Age. Don't go borrowing trouble. For programmers, this means Worry only about what you need to implement.
|
|
Best thing is you don't get "Use of uninitialized value in join or string at ..." if the array size isn't an exact multiple of the chunk size.
I *want* those warnings. If the array length is not an exact multiple of the chunk size, something is wrong, and I would like to be informed. In production code, I'd let it croak if @array % 2 even before looping.
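A minimal sketch of that up-front check, assuming Carp's croak and a hypothetical helper named pairs_of (neither the name nor the error text comes from the thread):

```perl
use strict;
use warnings;
use Carp qw(croak);

# pairs_of is a hypothetical helper name used only for this example.
sub pairs_of {
    my @items = @_;    # work on a copy, leave the caller's array alone
    croak "Odd number of elements: " . scalar(@items) if @items % 2;
    my @pairs;
    while (my @chunk = splice @items, 0, 2) {
        push @pairs, [@chunk];
    }
    return @pairs;
}

print "Chunk: @$_\n" for pairs_of(1 .. 4);   # "1 2", then "3 4"

# An odd-length list is rejected before any chunk is processed.
my $ok = eval { pairs_of(1 .. 5); 1 };
print "Rejected: $@" unless $ok;
```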
Juerd
- http://juerd.nl/
- spamcollector_perlmonks@juerd.nl (do not use).
Re: Non-destructive array processing
by elusion (Curate) on Jan 20, 2003 at 23:53 UTC
|
I prefer Perl 6.
my @array = 1..10;
for @array -> $one, $two {
my @chunk = ($one, $two);
print "Chunk: @chunk\n";
}
However, if I need to use Perl 5, I'd pick BrowserUK's method, or something like it. It appears a little overkill for the example, but the example is a bit contrived.
elusion : http://matt.diephouse.com
|
|
I prefer Perl 6.
So do I, so do I...
Wouldn't for @array -> @chunk[0, 1], &foo work? With @chunk predeclared, of course. (Not that I ever need the chunk as an array. I use it only to store whatever splice returns. I love the Perl 6 syntax.)
Juerd
- http://juerd.nl/
- spamcollector_perlmonks@juerd.nl (do not use).
(jeffa) Re: Non-destructive array processing
by jeffa (Bishop) on Jan 21, 2003 at 02:43 UTC
|
I think the Perl 6 solution looks the best,
but if i had to pick one of your original two, it would be
the former. I think the predicate for the
while loop looks like Perl, while the predicate for the
for loop looks like C. I do love the way you copy
the array, though ... that's sure to make some ears bleed.
;) But ... is there any benefit in doing so? Isn't
my @r = @array; just as effective, or am i
missing a scalability issue here?
I have grown to dislike the C-style for(;;) over
the years, so much that i like to sometimes substitute
a bit of speed for evilness such as:
my @array = 1..10;
for my $i (grep $_%2, 0..$#array) {
my @chunk = @array[$i - 1, $i];
print "Chunk: @chunk\n";
}
print "Original array is still intact! (@array)\n";
However, my wise uncle would remind me that your second
snippet is the best because it is simple and brute force.
It has more potential to have a wider audience of
programmers understand how it ticks than the first snippet
does. I still like to make ears bleed, tho ... ;)
jeffa
L-LL-L--L-LL-L--L-LL-L--
-R--R-RR-R--R-RR-R--R-RR
B--B--B--B--B--B--B--B--
H---H---H---H---H---H---
(the triplet paradiddle with high-hat)
Re: Non-destructive array processing
by runrig (Abbot) on Jan 20, 2003 at 23:34 UTC
|
Nice trick. I like it as a sort of answer to LISP-like linked lists without having to resort to 2-element array references:
my @array = qw(a b c d);
my $r = sub { \@_ }->(@array);
my $a = cdr($r);
print "arr: @array\n";
print "a: @$a\n";
print "r: @$r\n";
$array[2]="hello";
print "arr: @array\n";
print "a: @$a\n";
print "r: @$r\n";
sub cdr {
my $r = shift;
my $a = sub { \@_ }->(@$r);
shift @$a;
$a;
}
Re: Non-destructive array processing
by Gilimanjaro (Hermit) on Jan 21, 2003 at 11:02 UTC
|
Isn't this what local is for? Something like the following:
my @array = 1..10;
{
local @array = @array;
while (my @chunk = splice @array, 0, 2) {
print "Chunk: @chunk\n";
}
}
print "Original array is still intact! (@array)\n";
|
|
local *array = \@array;
This will work with lexically scoped variables too.
Makeshifts last the longest.
|
|
But that will simply alias the dynamic @array to the lexical @array, so I don't see what you've achieved by doing this.
One problem with this that you probably didn't foresee is that lexicals are resolved before dynamic variables. Example:
my @foo = 1..4;
local *foo = ['a'..'d'];
print @foo; # 1234
The problem is solved through our() since that creates an aliased lexical:
my @foo = 1..4;
our @foo = 'a'..'d';
print @foo; # abcd
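For completeness, a small sketch showing that the localized package array still exists and is reachable by its fully qualified name; only the unqualified @foo resolves to the lexical. The name @main::foo assumes the code runs in package main, and $lex and $pkg are illustrative variables.

```perl
use strict;
use warnings;

my @foo = 1 .. 4;                        # lexical @foo
my ($lex, $pkg);
{
    local *main::foo = [ 'a' .. 'd' ];   # package @main::foo, for this block only
    $lex = join ' ', @foo;               # unqualified name: the lexical wins
    $pkg = join ' ', @main::foo;         # qualified name: the package array
}
print "lexical: $lex\n";                 # 1 2 3 4
print "package: $pkg\n";                 # a b c d
```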
ihb
|
|
my @array = 1..10;
{
my @array = @array;
while (my @chunk = splice @array, 0, 2) {
print "Chunk: @chunk\n";
}
}
print "Original array is still intact! (@array)\n";
Note though that modifications of an @array element (i.e. via $array[$n]) will disappear when the scope is left, because the inner @array is a separate variable holding copies of the values. The idea of Juerd's routine was that the elements would be aliased but the array different.
Hope I've helped,
ihb
Re: Non-destructive array processing
by jdporter (Paladin) on Jan 21, 2003 at 22:28 UTC
|
Hey, that's a neat trick! But how about:
my @array = 1..10;
sub {
while ( my @chunk = splice @_, 0, 2 ) {
print "Chunk: @chunk\n";
}
}->( @array );
print "Original array is still intact! (@array)\n";
jdporter The 6th Rule of Perl Club is -- There is no Rule #6.
Re: Non-destructive array processing
by Aristotle (Chancellor) on Jan 22, 2003 at 16:16 UTC
|
Life is not a JAPH contest.
sub make_get_pairs {
my $alias = \@_;
sub { splice @$alias, 0, 2 }
}
my @foo = 1..10;
my $get_foo_pair = make_get_pairs(@foo);
while (my @chunk = $get_foo_pair->()) {
print "Chunk: @chunk\n";
}
print "\@foo is still intact: @foo\n";
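The same idea extends to other chunk sizes. A sketch, assuming a hypothetical make_chunker that takes the size as its first argument (this is an illustrative variation, not code from the post above):

```perl
use strict;
use warnings;

# make_chunker is a hypothetical generalisation of make_get_pairs.
sub make_chunker {
    my $size  = shift;   # the chunk size comes first...
    my $alias = \@_;     # ...and the rest of @_ still aliases the caller's list
    return sub { splice @$alias, 0, $size };
}

my @foo = 1 .. 7;
my $next_chunk = make_chunker(3, @foo);

my @chunks;
while (my @chunk = $next_chunk->()) {
    push @chunks, "@chunk";
    print "Chunk: @chunk\n";             # 1 2 3, then 4 5 6, then 7
}
print "\@foo is still intact: @foo\n";
```

As with the original, the closure holds the only reference to the alias array, so nothing else can accidentally write through it.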
Makeshifts last the longest.