Structure slogging: do my homework! :)

clintp has asked for the wisdom of the Perl Monks concerning the following question:

I was wandering through some code on the project, getting ready to move on (documenting, cleaning up debugging routines, etc.), and ran across a FIXME note.

This is a memory optimization problem, and at the time I wasn't terribly inspired to fix it. I'm still not, but thought this might provide some mental breakfast for someone at PerlMonks. Difficulty level: intermediate.

The code below can serve as a working model -- it does work -- it just needs to be optimized. Enjoy!

#!c:\perl\bin\perl.exe -w

# Nothing interesting here, just setup for the example.
use strict;
{ local $/; $_=<DATA>; }
my %columns=%{eval $_};


# Story:
# The arrayrefs in %columns represent an entire record.
# So in the example below "John, Jay, Aurora" is an entire
# record.  We produce @si, which tells us the order in which
# the records should actually be.  This sort
# is actually a lot more complicated than this but the
# net result is that @si contains a list of "row" numbers
# in the proper order.
# Don't mess with this, PerlMonks.
my @si=sort { $columns{field1}->[$a] cmp
              $columns{field1}->[$b] }
              (0..@{$columns{field1}}-1);


# The part I'm feeling lazy about is here.  I want to
# rebuild the structure so that %columns has the
# arrayrefs arranged in the proper order.  (Bill, Bob,
# John, Sue; then Raye, Apple, Jay, Shell; then Clio,
# Calumet, Aurora, Elgin) It works now, but isn't terribly
# efficient since %columns is enormous in Real Life.

# i.e. do this without a temp hash.  :)  Extra points
# for cleverness.  Keep it self-contained: no fair
# monkeying with the sort above.  You have @si and %columns
# to play with.

my %foo;
foreach(@si) {
        for my $k (keys %columns) {
                push(@{$foo{$k}}, $columns{$k}->[$_]);
        }
}
%columns=%foo;
undef %foo;

__DATA__
{
  field1 => [
        'John','Sue','Bill','Bob'],
  field2 => [
        'Jay','Shell','Raye','Apple'],
  field3 => [
        'Aurora','Elgin','Clio','Calumet'],
}

Replies are listed 'Best First'.
Re: Structure slogging: do my homework! :) by jmcnamara (Monsignor) on Nov 01, 2001 at 20:45 UTC
This eliminates the temp `%foo` so it should reduce the memory overhead. It uses an array slice: `for my $k (keys %columns) { @{$columns{$k}} = @{$columns{$k}}[@si]; }` -- John.
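To make the slice approach concrete, here is a minimal self-contained sketch using the sample data from the original post. The inline hash stands in for the `__DATA__`/`eval` setup, and the final `print` is only there to show the result; neither is part of the reply above. Note that the slice still builds a temporary list for one column at a time, so the saving is presumably "one column" of extra memory rather than a whole second hash.

```perl
use strict;
use warnings;

# Sample data copied from the original post's __DATA__ section.
my %columns = (
    field1 => [ 'John',   'Sue',   'Bill', 'Bob'     ],
    field2 => [ 'Jay',    'Shell', 'Raye', 'Apple'   ],
    field3 => [ 'Aurora', 'Elgin', 'Clio', 'Calumet' ],
);

# Row order, as in the original post: sort the row indices by field1.
my @si = sort { $columns{field1}[$a] cmp $columns{field1}[$b] }
         0 .. $#{ $columns{field1} };

# Reorder each column with an array slice; no second hash is built.
for my $k (keys %columns) {
    @{ $columns{$k} } = @{ $columns{$k} }[@si];
}

# Show the reordered columns.
print "$_: @{ $columns{$_} }\n" for sort keys %columns;
```

With the sample data this prints:

    field1: Bill Bob John Sue
    field2: Raye Apple Jay Shell
    field3: Clio Calumet Aurora Elgin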
(jeffa) Re: Structure slogging: do my homework! :) by jeffa (Bishop) on Nov 01, 2001 at 20:16 UTC
Trying to squeeze that last drop out, eh? :) This probably is not as efficient as it gets, but you can always get rid of temp vars by replacing one of your for loops with a map: `foreach my $k (keys %columns) { $columns{$k} = [ map { $columns{$k}->[$_] } @si ] }`

Ok, it looks nicer . . . let's check some benchmarks...

UPDATE!!! use jmcnamara's solution - it's faster:

Benchmark: timing 30000 iterations of clintp, jeffa, jmcnamara...
    clintp:  5 wallclock secs ( 4.29 usr +  0.10 sys =  4.39 CPU) @  6833.71/s (n=30000)
     jeffa:  3 wallclock secs ( 3.12 usr +  0.01 sys =  3.13 CPU) @  9584.66/s (n=30000)
 jmcnamara:  2 wallclock secs ( 2.18 usr +  0.01 sys =  2.19 CPU) @ 13698.63/s (n=30000)

jeffa
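The benchmark script itself was not posted, so the following is only a guess at how such a comparison could be set up with the core `Benchmark` module. The sub names (`clintp`, `jeffa`, `jmcnamara`) and the `fresh` helper are hypothetical, chosen to match the labels in the output above. Each run works on a fresh copy of the sample data so the three candidates start from the same state.

```perl
use strict;
use warnings;
use Benchmark qw(timethese);

# Sample data from the original post; @si is the row order it produces.
my %data = (
    field1 => [ 'John',   'Sue',   'Bill', 'Bob'     ],
    field2 => [ 'Jay',    'Shell', 'Raye', 'Apple'   ],
    field3 => [ 'Aurora', 'Elgin', 'Clio', 'Calumet' ],
);
my @si = sort { $data{field1}[$a] cmp $data{field1}[$b] }
         0 .. $#{ $data{field1} };

# Hypothetical helper: copy the columns so every run starts unsorted.
sub fresh {
    my %copy;
    $copy{$_} = [ @{ $data{$_} } ] for keys %data;
    return \%copy;
}

# Original approach: build a temp hash, then copy it back.
sub clintp {
    my ($c, $si) = @_;
    my %foo;
    for my $i (@$si) {
        for my $k (keys %$c) {
            push @{ $foo{$k} }, $c->{$k}[$i];
        }
    }
    %$c = %foo;
    return $c;
}

# map version: replace each column with a newly built array ref.
sub jeffa {
    my ($c, $si) = @_;
    for my $k (keys %$c) {
        $c->{$k} = [ map { $c->{$k}[$_] } @$si ];
    }
    return $c;
}

# Slice version: assign the reordered slice back into the same array.
sub jmcnamara {
    my ($c, $si) = @_;
    for my $k (keys %$c) {
        @{ $c->{$k} } = @{ $c->{$k} }[@$si];
    }
    return $c;
}

timethese(30_000, {
    clintp    => sub { clintp(fresh(), \@si) },
    jeffa     => sub { jeffa(fresh(), \@si) },
    jmcnamara => sub { jmcnamara(fresh(), \@si) },
});
```

Keep in mind that `Benchmark` reports time only. The original question was about memory, so timings like these say nothing direct about peak memory use on an enormous `%columns`.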
Re: (jeffa) Re: Structure slogging: do my homework! :) by clintp (Curate) on Nov 01, 2001 at 20:33 UTC
15 minutes. And my buddy thought it'd take longer than that for a solution to come out. :) That's nice.
Instead of @{$columns{field1}}-1 by Fletch (Bishop) on Nov 01, 2001 at 20:55 UTC
ITYM `my @si= ... (0..$#{$columns{field1}});` </nit>
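For anyone unsure what the nit is about, here is a small sketch of the two spellings side by side. `$#{$ref}` is the last index of the referenced array, while `@{$ref}-1` is the element count minus one; both give the same range. The array is copied from `field1` in the original post, and the variable names are made up for illustration.

```perl
use strict;
use warnings;

my $ref = [ 'John', 'Sue', 'Bill', 'Bob' ];

# Element count minus one, as written in the original post.
my @by_count = (0 .. @{$ref} - 1);

# Last index, as Fletch suggests.
my @by_last = (0 .. $#{$ref});

print "@by_count\n";   # 0 1 2 3
print "@by_last\n";    # 0 1 2 3
```

The two also agree for an empty array, where both ranges become `0 .. -1` and yield an empty list.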