Re: mapping lists
by tall_man (Parson) on Jan 24, 2003 at 20:50 UTC
|
Here's a quick example showing your problem can be done using a binary search, even though the "25" is not in the list.
Note: my version returns only elements actually in the list. If you want all the integers in between, just return $F[$lo_pos]..$F[$hi_pos].
Update: It now appears that the index positions are wanted instead. No problem, that's just $lo_pos..$hi_pos.
use strict;
use Search::Binary;
my $lastIndex;
sub reader {
my ($handle, $val, $index) = @_;
if (!defined $index) {
$lastIndex++;
} else {
$lastIndex = $index;
}
return ($val <=> $handle->[$lastIndex], $lastIndex);
}
my @F = (
1,
2,
4,
8,
16,
32..64
);
sub range{
my($lo, $hi) = @_;
my $lo_pos = binary_search(0,$#F,$lo,\&reader,\@F);
my $hi_pos = binary_search(0,$#F,$hi,\&reader,\@F);
$hi_pos = $#F if ($hi_pos > $#F);
return @F[$lo_pos..$hi_pos];
}
print join(',', range(25,35)), "\n";
print join(',', range(50,75)), "\n";
| [reply] [d/l] [select] |
|
|
my $lo_pos = binary_search(0,$#F,$lo,\&reader,\@F,2);
my $hi_pos = binary_search($lo_pos,$#F,$hi,\&reader,\@F,2);
I added a counter to the reader function and got the following output:
Did 6 probes for low
Did 6 probes for high
32,33,34,35
Did 6 probes for low
Did 7 probes for high
50,51,52,53,54,55,56,57,58,59,60,61,62,63,64
This is compared to a total of 78 probes the original way.
| [reply] [d/l] [select] |
Re: mapping lists
by thinker (Parson) on Jan 24, 2003 at 19:38 UTC
|
#!/usr/bin/perl -w
my(@F, @G);
@F = (
1,
2,
4,
8,
16,
32..64,
17
);
sub range{
my($lo, $hi) = @_;
@G = sort grep {$_ >= $lo and $_ <= $hi} @F;
return $G[0]..$G[$#G];
}
print join(',', range(25,35)), "\n";
print join(',', range(50,75)), "\n";
print join(',', range(12,35)), "\n";
this produces
32,33,34,35
50,51,52,53,54,55,56,57,58,59,60,61,62,63,64
16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35
hope this helps
thinker
Update Changed < to <= thanks belg4mit
Update2 Read the question properly, and return a range :-) | [reply] [d/l] [select] |
Re: mapping lists
by blokhead (Monsignor) on Jan 24, 2003 at 19:45 UTC
|
If the values in @F are very sparse, you should probably use a hash to make the reverse map instead of the array @G. If you had a case where @F = qw/1 3 5 1000000 2000000/;, you wouldn't want to reverse map that into an array, which would become 2 million elements large. Apart from that, it's just plain easier to write a reverse map into a hash: @G{@F} = ();
It's not clear to me from your description, but is the thing you really want the set intersection of the set @F with 25..35 and with 50..75? If this is the case, maybe one of the Set:: or Array:: modules might be worth inspecting (Set::IntRange?). Although I'm not familiar with what's out there, I would assume there would be some modules out there highly optimized for the intersection operation.
If you're not looking for a set intersection, I don't see any obvious optimizations here, other than the possible sparseness of the reverse map that I mentioned.
Update: I noticed your code always returns a range, even if some of the intermediaries in the range aren't in @F. thinker gives a very nice algorithm for set intersection with a range, but if you need to always return a range (even including elements not in @F), here's an adaptation:
sub range{
my($lo, $hi) = @_;
($lo, $hi) = (sort grep {$_ >= $lo and $_ <= $hi} @F)[0,-1];
return $lo..$hi;
}
Depending on the data, this might be better than your original code. While this has to process your entire set of values, if the values are extremely sparse, you may end up ++'ing $lo and --'ing $hi all day in your algorithm.
blokhead | [reply] [d/l] [select] |
|
|
I thought of using a hash, but how to handle a range wherein
the limits are not in the actual set was not obvious to me.
I guess I could use the same thing but use exists
en lieu of defined. But then again aren't hashes
notoriously suboptimal for numbers? (strings of a limited
set of characters)
Not interesections, trying to find the existing elements
within the defined range.
--
I'm not belgian but I play one on TV.
| [reply] |
Re: mapping lists
by tall_man (Parson) on Jan 24, 2003 at 19:52 UTC
|
If your lists are large, linear-search algorithms will be very slow. Your program and the one by thinker use linear searches. You can find points in a sorted list much faster with a binary search, which is O(log n). The Search::Binary module provides a generic implementation. (The learning curve for that module looks a bit steep, however.) | [reply] |
|
|
| [reply] |
|
|
use Search::Binary;
$pos = binary_search($min, $max, $val, $read, $handle, [$size]);
"binary_search" implements a generic binary search algorithm returning the position of the first record whose index value is greater than or equal to $val. | [reply] [d/l] |
Re: mapping lists
by BrowserUk (Patriarch) on Jan 24, 2003 at 23:32 UTC
|
#! perl -slw
use vars qw[$MAX $SIZE $LO $HI];
use strict;
$MAX ||= 100;
$SIZE ||= 10;
$LO ||= 25;
$HI ||= 50;
sub bsearch {
my ($ref, $val) = @_;
my ($lo, $hi) = (0, $#$ref);
while ($lo < $hi) {
my $i = ($lo+$hi)>>1;
my $bool = $val <=> $ref->[$i];
if ($bool < 0) { $hi = $i; }
elsif ($bool > 0) { $lo = $i + 1; }
else { return $i; }
}
return $hi;
}
sub range {
my ($ref, $lo, $hi) = @_;
my $lidx = bsearch($ref, $lo);
$lo = ($ref->[$lidx] > $lo and $lidx > 0) ? $ref->[--$lidx] : $
+ref->[$lidx];
my $hidx = bsearch($ref, $hi);
$hi = ($ref->[$hidx] < $hi and $hi < $#$ref) ? $ref->[$hidx++] : $
+ref->[$hidx];
$lo .. $hi;
}
my @F;
my ($start, $end) = (0,0);
#! Gen some test data
while ($start + $SIZE < $MAX) {
$start += int rand($SIZE);
$end = int $start + rand($SIZE);
push @F, $start .. $end;
$start = $end;
}
#! display it
print "@F";
#! and a range from it encompassing two (possibly absent values)
print "@{[ range(\@F, $LO, $HI) ]}";
__DATA__
C:\test>229675 -MAX=50 -SIZE=5 -LO=15 -HI=35
2 3 4 5 5 6 7 8 9 12 13 17 18 19 20 21 21 22 23 24 25 29 30 31 33 34 3
+6 37 40 41 42 43 44 45 46 47
13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 3
+6
Examine what is said, not who speaks.
The 7th Rule of perl club is -- pearl clubs are easily damaged. Use a diamond club instead. | [reply] [d/l] |
Re: mapping lists
by jdporter (Paladin) on Jan 24, 2003 at 21:07 UTC
|
| [reply] [d/l] [select] |
|
|
| [reply] |
Re: mapping lists
by l2kashe (Deacon) on Jan 24, 2003 at 19:54 UTC
|
Dont know if its faster, cause I didnt benchmark, but I like hashes a little better for this type of work, and for me its a little cleaner to write the same thing like so
%data = map { $_ => 1 } @F;
sub range{
my(@ret);
for ($_[0] .. $_[1]) {
push(@ret, $_) if ($data{$_});
}
return @ret;
}
Which produces the same output as your code... Critique any one?
Edit: I guess it would make more sense to initialize the data into a hash, as opposed to the array and then using map.
Also thinking about it I would probably pass a third arg being the ref to the data hash, I wanted to work on.
Edit_2: Doh that shoulda been defined in the loop :P.. o well.
/* And the Creator, against his better judgement, wrote man.c */ | [reply] [d/l] |
UPDATE: mapping lists
by belg4mit (Prior) on Jan 24, 2003 at 22:12 UTC
|
Doh! Sorry to have led so many on a wild goose
chase (thanks for all the replies though). But
I was so wrapped up in solving the problem of
the range limits not being part of the set that
I left out a HUGE set of the problem. So, consider
that a warmup.
Given the original @F(values are unique, and
are probably sorted) return the indices
of @F whose values fall within the supplied range.
UPDATE: Although I suppose all that would be
required then is to use one of these search
techniques (which?) and reverse map the values
of @F to the indices. Does this added step give
one of the techniques an advatage over the others?
--
I'm not belgian but I play one on TV.
| [reply] |
|
|
The values are probably sorted? Oh-oh.
If you aren't guaranteed they are sorted, you can't use a binary search. You'd have to do a linear pass to test if they are really sorted, and sorting them yourself isn't worth it unless you are going to do lots of range checks on the same data.
Here is a simple linear method that requires no side data structures.
use strict;
my @F = (
1,
2,
4,
8,
32..64,
26
);
sub range{
my($lo, $hi) = @_;
return grep {$F[$_] >= $lo and $F[$_] <= $hi} 0..$#F;
}
print join(',', range(25,35)), "\n";
This is basically what thinker and blokhead did except that my grep is on the indices instead of the values of @F.
| [reply] [d/l] |
|
|
I was thinking along the lines of:
#!/usr/bin/perl -w
use strict;
use Quantum::Superpositions;
my (@F, @G, @range);
@F = (1, 2, 4, 8, 16, 32..64);
my ($i, $j);
for($i=0; $i<scalar @F; $i++){
$G[$j++] = $i if (any(@range) == any($F[$i]));
}
print join(',', @G), "\n";
| [reply] [d/l] |