I hate to ask this, but are you sure it's the sort line that's the
culprit, or could some other manipulation be causing the out-of-memory
problems? I've done lots of sort-and-assigns just like you're doing,
and even on very large arrays of hashrefs (circa 100k elements) the
overhead from the sort is never more than a few hundred kilobytes.
Now, to digress (or possibly not), there is one behavior peculiar to
sorting arrays of references that I don't understand (and perhaps this
-- or a variant -- is what's biting you)...
# for @foo with 100,000 elements, this sort eats 12k of memory
@foo = sort { $a->{bar} cmp $b->{bar} } @foo;
# but for the same @foo, this sort eats 90M!
@foo = sort @foo;
@foo = sort { $a cmp $b } @foo; # equivalent
As far as I can tell, this "bloat" happens when you try to sort
any list of references with the default comparison operator. (I'm
running Perl 5.6.1 on Linux.) It doesn't happen just because you compare
two references inside a sort block...
# requires scads of memory
@array_of_refs = sort { $a cmp $b } @array_of_refs;
# doesn't
@array_of_simple_scalars = sort { \$a cmp \$b }
@array_of_simple_scalars;
I would think that the default sort on @array_of_refs would be doing a lexical comparison on the "stringified" ref. But apparently, that's not the case. Even attempts to force "stringification" inside the sort block (while still referring to the ref itself) don't fix the problem...
# scads
@array_of_stringrefs = sort { ('a: '.$a) cmp ('b: '.$b) }
@array_of_stringrefs;
# scadless
@array_of_stringrefs = sort { ('a: '.$$a) cmp ('b: '.$$b) }
@array_of_stringrefs;
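For anyone who wants to reproduce the comparison, here is a rough sketch of one way to measure the growth. It assumes Linux, since it reads VmRSS from /proc/self/status, and rss_kb and rss_growth are helper names made up for illustration. Resident-set size is only a coarse proxy for what the sort allocates, so treat the numbers as indicative rather than exact.

```perl
use strict;
use warnings;

# Hypothetical helper: current resident set size in kB, read from
# /proc/self/status. Linux only; returns undef if the file is missing.
sub rss_kb {
    open my $fh, '<', '/proc/self/status' or return undef;
    while (my $line = <$fh>) {
        return $1 if $line =~ /^VmRSS:\s+(\d+)\s+kB/;
    }
    return undef;
}

# Hypothetical helper: run a code ref and return the change in RSS (kB),
# or undef if RSS could not be read.
sub rss_growth {
    my ($code) = @_;
    my $before = rss_kb();
    $code->();
    my $after = rss_kb();
    return (defined $before && defined $after) ? $after - $before : undef;
}

# Build a test array of 100,000 hashrefs, as in the examples above.
my @array_of_refs = map { +{ bar => "key$_" } } 1 .. 100_000;

# Sort on a field inside each hashref.
my $by_key = rss_growth(sub {
    @array_of_refs = sort { $a->{bar} cmp $b->{bar} } @array_of_refs;
});

# Sort on the references themselves (compared as strings).
my $by_ref = rss_growth(sub {
    @array_of_refs = sort { $a cmp $b } @array_of_refs;
});

printf "sort on key field:  %s kB\n", defined $by_key ? $by_key : 'n/a';
printf "sort on references: %s kB\n", defined $by_ref ? $by_ref : 'n/a';
```

Running the same script under different perl versions should show whether the difference is specific to 5.6.1.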
Curiouser and curiouser. Can anyone shed any light on what might be going on here?
Kwin