in reply to Re^2: better union of sets algorithm?
in thread better union of sets algorithm?
You can do slightly better than what you just suggested for a mergesort of k lists (when k is not super-small).Yes, but it's much harder to code off the top of my head. :)
What I fail to understand is why you say that perrin's hash algorithm takes O(n log n) time. No, a hash dereferencing operation takes amortized constant time, and you are just doing a dereference for each element in each list. That's a linear time algorithm.My understanding has always been that hash operations over n items take lg n time, since in general a hash will divide up n items into n / lg n buckets, then do a linear search through the appropriate bucket.
Some experiments seem to bear this out: as I increase the number of elements in a hash, the number of operations per second I can do decreases. This wouldn't happen if hash operations took constant time.
#!/usr/bin/perl use warnings; use strict; use Time::HiRes qw(time); my %h; foreach my $size (100_000, 200_000, 400_000, 800_000, 1_600_000, 3_200 +_000) { my %h; my($start,$end); # Test insert times $start = time; foreach my $i (1..$size) { $h{$i} = $i; } $end = time; my $insert_time = $end - $start; # Now test lookup times $start = time; foreach my $i (1..$size) { $h{$i} == $i or die "what the hell?"; } $end = time; my $lookup_time = $end - $start; printf "%10d: %5.3f inserts/sec, %5.3f lookups/sec\n", $size, $size/$insert_time, $size/$lookup_time; }
100000: 446368.754 inserts/sec, 1050330.051 lookups/sec 200000: 424977.633 inserts/sec, 821176.759 lookups/sec 400000: 398651.394 inserts/sec, 805369.896 lookups/sec 800000: 399463.754 inserts/sec, 787853.768 lookups/sec 1600000: 390381.775 inserts/sec, 766870.015 lookups/sec 3200000: 377860.715 inserts/sec, 720909.739 lookups/sec
Then again, my recollections of algorithmic amortizations are hazy at best, so feel free to convince me that I'm wrong. :)
|
|---|
| Replies are listed 'Best First'. | |
|---|---|
|
Re^4: better union of sets algorithm?
by demerphq (Chancellor) on Mar 11, 2005 at 08:25 UTC | |
by Anonymous Monk on Mar 11, 2005 at 10:17 UTC | |
by demerphq (Chancellor) on Mar 11, 2005 at 10:30 UTC | |
|
Re^4: better union of sets algorithm?
by blokhead (Monsignor) on Mar 11, 2005 at 07:22 UTC | |
|
Re^4: better union of sets algorithm?
by blazar (Canon) on Mar 11, 2005 at 09:09 UTC | |
|
Re^4: better union of sets algorithm?
by Anonymous Monk on Mar 11, 2005 at 10:49 UTC |