http://qs1969.pair.com?node_id=385023


in reply to Re^2: Benchmark, -s versus schwartzian (vs hash)
in thread Benchmark, -s versus schwartzian

Now that I think of it again, I see the flaw in my argument.

I thought that creating the hash or creating the arrays shouldn't matter much in the overall time, since it is done far fewer times than the lookups. The flaw (which may already be apparent to you) is that the lookups are not done that much more often than the array building: only log(n) times more often if there are n elements.

The calculation yields these costs: you do the expensive computation (-s here) n times in either case, plus:

Schwartzian:
Allocating n small arrays, plus n*log(n) array lookups. This means about O(n*log(n)) time, supposing you have a fast malloc.
Strange:
Creating a hash of n elements, plus n*log(n) hash lookups. This means about O(n*log(n)*log(n)) time.
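
For reference, the two baselines the benchmark names Ordinary and Schwartzian presumably look roughly like this (a sketch reconstructing them from the thread; the @all and @results names and the exact bodies are assumptions, and the hash variants themselves appear further down):

    # Naive sort: -s is re-evaluated inside the comparator,
    # i.e. on the order of n*log(n) stat() calls.
    sub sort_ordinary {
        @results = sort { -s $a <=> -s $b } @all;
    }

    # Schwartzian transform: one small [name, size] array per element,
    # so -s runs only n times, plus n*log(n) array lookups.
    sub sort_schwartzian {
        @results = map  { $_->[0] }
                   sort { $a->[1] <=> $b->[1] }
                   map  { [ $_, -s $_ ] } @all;
    }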

What I'd like to know now is whether my variant of the Schwartzian transform is generally faster than the original one, or whether that happens only in this case. It's only a 15% gain here (Update: even less with the larger data set), so it might not mean anything.

Yet one more thing. I've just done some more benchmarking. I've added some other variants in creating the hash:

sub sort_new {
    my %h;
    @h{@all} = map { -s $_ } @all;      # hash slice assignment
    @results = sort { $h{$a} <=> $h{$b} } @all;
}

sub sort_new2 {
    my %h = map { $_, -s $_ } @all;     # build the hash from key/value pairs
    @results = sort { $h{$a} <=> $h{$b} } @all;
}

sub sort_new3 {
    my %h;
    keys %h = @all;                     # pre-size the hash (@all in scalar context)
    @h{@all} = map { -s $_ } @all;
    @results = sort { $h{$a} <=> $h{$b} } @all;
}

sub sort_new4 {
    my %h;
    keys %h = @all;                     # pre-size the hash
    %h = map { $_, -s $_ } @all;
    @results = sort { $h{$a} <=> $h{$b} } @all;
}
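
For completeness, the driver is presumably something along these lines (a sketch: the glob pattern, the argument handling, and the sort_ordinary/sort_schwartzian names from the sketch above are assumptions; only the sort_new* subs and the 250 iterations come from the post):

    use strict;
    use warnings;
    use Benchmark qw(timethese);

    # Collect the file names once; every variant sorts the same list.
    my $dir = shift || '/bin';
    our @all = glob "$dir/*";
    our @results;

    printf "%s contains %d files\n", $dir, scalar @all;

    timethese(250, {
        Ordinary    => \&sort_ordinary,
        Schwartzian => \&sort_schwartzian,
        Strange     => \&sort_new,
        Strange2    => \&sort_new2,
        Strange3    => \&sort_new3,
        Strange4    => \&sort_new4,
    });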
I am somewhat surprised at the results; I'd have thought that method 2 would be faster than method 1, but no:
/bin/ contains 172 files
Benchmark: timing 250 iterations of Ordinary, Schwartzian, Strange, Strange2, Strange3, Strange4...
   Ordinary:  4 wallclock secs ( 1.14 usr +  2.05 sys =  3.19 CPU) @  78.37/s (n=250)
Schwartzian:  2 wallclock secs ( 1.68 usr +  0.18 sys =  1.86 CPU) @ 134.41/s (n=250)
    Strange:  1 wallclock secs ( 1.18 usr +  0.14 sys =  1.32 CPU) @ 189.39/s (n=250)
   Strange2:  2 wallclock secs ( 1.32 usr +  0.16 sys =  1.48 CPU) @ 168.92/s (n=250)
   Strange3:  1 wallclock secs ( 1.14 usr +  0.20 sys =  1.34 CPU) @ 186.57/s (n=250)
   Strange4:  2 wallclock secs ( 1.53 usr +  0.14 sys =  1.67 CPU) @ 149.70/s (n=250)

/usr/bin/ contains 1397 files
Benchmark: timing 250 iterations of Ordinary, Schwartzian, Strange, Strange2, Strange3, Strange4...
   Ordinary: 43 wallclock secs (15.27 usr + 26.38 sys = 41.65 CPU) @  6.00/s (n=250)
Schwartzian: 20 wallclock secs (17.37 usr +  2.32 sys = 19.69 CPU) @ 12.70/s (n=250)
    Strange: 17 wallclock secs (14.22 usr +  1.88 sys = 16.10 CPU) @ 15.53/s (n=250)
   Strange2: 18 wallclock secs (15.75 usr +  1.91 sys = 17.66 CPU) @ 14.16/s (n=250)
   Strange3: 16 wallclock secs (14.01 usr +  2.05 sys = 16.06 CPU) @ 15.57/s (n=250)
   Strange4: 19 wallclock secs (17.04 usr +  2.08 sys = 19.12 CPU) @ 13.08/s (n=250)

/usr/share/man/man3 contains 3859 files   # but 2000+ of these are less than 100 bytes long
Benchmark: timing 250 iterations of Ordinary, Schwartzian, Strange, Strange2, Strange3, Strange4...
   Ordinary: 150 wallclock secs (49.90 usr + 98.38 sys = 148.28 CPU) @ 1.69/s (n=250)
Schwartzian:  65 wallclock secs (55.72 usr +  8.43 sys =  64.15 CPU) @ 3.90/s (n=250)
    Strange:  61 wallclock secs (53.28 usr +  7.06 sys =  60.34 CPU) @ 4.14/s (n=250)
   Strange2:  66 wallclock secs (57.68 usr +  7.76 sys =  65.44 CPU) @ 3.82/s (n=250)
   Strange3:  60 wallclock secs (53.29 usr +  7.03 sys =  60.32 CPU) @ 4.14/s (n=250)
   Strange4:  71 wallclock secs (62.48 usr +  7.83 sys =  70.31 CPU) @ 3.56/s (n=250)

Re^4: Benchmark, -s versus schwartzian (vs hash)
by Aristotle (Chancellor) on Aug 23, 2004 at 11:04 UTC

      The node you've mentioned actually uses an even faster form of the Schwartzian transform that I'd completely forgotten about.

      Thus, I've added it to my benchmark too:

      sub sort_arr {
          my @h = map -s, @all;                 # one flat array of sizes, indexed like @all
          @results = @all[ sort { $h[$a] <=> $h[$b] } 0 .. @h-1 ];
      }

      The results are amazing: this form is about 1.5 times as fast as any of the other ones.

        Yes, because it uses an array, and only one array.

        What's even more important to me is that it's also decidedly easier on memory. Every single array in Perl has a minimum overhead of about 100 bytes, without even counting the individual scalars inside it (or the one holding the reference to it). When you're trying to sort a couple dozen million elements with a Schwartzian transform, as I've had to, that adds up alarmingly. In comparison, this method is basically free in terms of memory cost.
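
        To put a rough number on that, something like Devel::Size can compare the intermediate structures of the two approaches (a sketch, assuming Devel::Size is installed; the directory is only illustrative):

            use strict;
            use warnings;
            use Devel::Size qw(total_size);

            my @all = glob "/bin/*";

            # Schwartzian transform's intermediate structure:
            # one anonymous [name, size] array per file.
            my @pairs = map { [ $_, -s $_ ] } @all;

            # The index-sorting variant's only extra structure:
            # a single flat array of sizes.
            my @sizes = map { -s $_ } @all;

            printf "per-file pairs: %d bytes, flat size array: %d bytes\n",
                total_size(\@pairs), total_size(\@sizes);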

        Nitpick: you should really say 0 .. $#h when that's what you mean, as is the case here.

        Makeshifts last the longest.