
If not too many numbers cluster around a few points, an alternative is to first group the numbers into overlapping buckets, each spanning less than 3 * $delta ($delta = 0.01 in your case), and then process only the buckets holding more than one number to find the numbers at most $delta apart.
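
To illustrate the grouping step, here is a minimal sketch, assuming each number is filed under its own bucket key and under the two adjacent keys, so that two numbers at most $delta apart always share at least one key. The helper `bucket_keys` is hypothetical, and it uses `floor` from POSIX where the code in this post uses `int` (which truncates toward zero), so the keys can differ for values near zero.

```perl
use strict;
use warnings;
use POSIX qw(floor);

my $delta = 0.01;

# Hypothetical helper (not part of the original post): the three bucket
# keys a number is filed under, i.e. its own bucket and both neighbors.
sub bucket_keys {
    my ($x) = @_;
    my $k = floor($x / $delta);
    return ($k - 1, $k, $k + 1);
}

# 1.019 and 1.021 are only 0.002 apart, but they lie on either side of
# the bucket boundary at 1.02.
for my $x (1.019, 1.021) {
    print "$x -> ", join(", ", bucket_keys($x)), "\n";
}
```

The two numbers fall into different "own" buckets (101 and 102), yet both are filed under keys 101 and 102, so the pair is still found when those buckets are processed.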

The following code makes a single pass through the input array at the beginning and then sorts only small arrays ("small" as long as the assumption made at the beginning holds).

#maximum distance we are looking for
$delta = 0.01;
#test array
@a = (1.02, 1.03, 6.01, 9, 1.04, 1.011, 1.025, 1.01, 0.005, -0.002);

#"discretize": file each number under its own bucket (the value scaled
#by 1/$delta and truncated to an integer) and under the two adjacent buckets
for (@a) {
  push @{$h{int($_/$delta)}}, $_;
  push @{$h{int($_/$delta-1)}}, $_;
  push @{$h{int($_/$delta+1)}}, $_;
}

#handle clusters
for (keys %h) {
  #a bucket with more than one entry may contain a close pair
  #(its entries span less than 3 * $delta); otherwise ignore it
  if (@{$h{$_}} > 1) {
    #sort numerically; the default sort compares as strings
    @sorted = sort { $a <=> $b } @{$h{$_}};
    for (0..@sorted-2) {
      $r = $sorted[$_];
      $s = $sorted[$_+1];
      #keep neighboring pairs that are at most $delta apart; since
      #we do not need to process the numbers further, we store
      #each pair as a string key for the final output
      $near{"$r, $s"} = 1 if ($r < $s && $s <= $r + $delta);
    }
  }
}

print "Not further than $delta apart are the following pairs:\n";
print "$_\n" for (keys %near);

Output:

Not further than 0.01 apart are the following pairs:
-0.002, 0.005
1.02, 1.025
1.025, 1.03
1.011, 1.02
1.03, 1.04
1.01, 1.011
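
For reuse under `use strict`, the same idea can be packaged as a subroutine. This is only a sketch, not the code from the post: the name `near_pairs` and its argument order are assumptions, it uses `floor` instead of `int`, and it returns the pairs sorted as strings rather than in hash order.

```perl
use strict;
use warnings;
use POSIX qw(floor);

# Hypothetical helper (not part of the original post): returns the pairs
# of sorted neighbors in @$numbers that are at most $delta apart, each
# pair formatted as the string "r, s".
sub near_pairs {
    my ($delta, $numbers) = @_;

    # Single pass: file every number under its own bucket and both neighbors.
    my %bucket;
    for my $x (@$numbers) {
        my $k = floor($x / $delta);
        push @{ $bucket{ $k + $_ } }, $x for -1, 0, 1;
    }

    # Only buckets with at least two entries can contain a close pair.
    my %near;
    for my $members (values %bucket) {
        next if @$members < 2;
        my @sorted = sort { $a <=> $b } @$members;   # numeric sort
        for my $i (0 .. $#sorted - 1) {
            my ($r, $s) = @sorted[$i, $i + 1];
            $near{"$r, $s"} = 1 if $r < $s && $s <= $r + $delta;
        }
    }
    return sort keys %near;
}

my @pairs = near_pairs(0.01,
    [1.02, 1.03, 6.01, 9, 1.04, 1.011, 1.025, 1.01, 0.005, -0.002]);
print "$_\n" for @pairs;
```

For the test array this prints the same six pairs as listed above, only in a fixed order.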

Update: I tested this using the Benchmark module and a fixed array of 300000 numbers randomly distributed between 0 and 300; the part that determines the near numbers took about 8 seconds on my reasonably modern machine.

The benchmark test starts like this:

use strict;
use warnings;
use Benchmark;
#maximum distance we are looking for
my $delta = 0.01;
#test array
my @a;
for (1..300000) {
  push @a, rand() * 300;
}

my ($r, $s);

timethis ( 10 => 
  sub { ...

Update: Improved the description; it gave the impression that we first look for all numbers in the array not more than 2 * $delta apart, which is not the case.

In reply to Re: fastest way to compare numerical strings? by jds17
in thread fastest way to compare numerical strings? by Anonymous Monk
