Re^3: Clustering Numbers with Overlapping Members

OK, if you want this behaviour, then the following code - slightly modified from my previous posting - should work.

use warnings;
use strict;

my @nlist = (0,0,1,2,3,3,4,5,6,8,8,10);
my @key_list = ('A'..'Z');
my $tolerance = 1;
my %hoa;

my %uniq;
@uniq{@nlist[1..$#nlist]} = ();

for my $centroid (sort {$a <=> $b} keys %uniq) {
    my $key = shift @key_list;
    $hoa{$key} = [grep in_range($centroid, $_), @nlist ];
}

print "$_ => [@{$hoa{$_}}]\n" for sort keys %hoa;

sub in_range
{
    my ($centroid, $testnum) = @_;
    return abs($centroid - $testnum) <= $tolerance;
}

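The idea is to iterate over the unique centroids, skipping the element at index 0 of @nlist, and to extract from the original array every number that is 'in_range' of each centroid. Note that if the first value occurs again later in the list (as 0 does here), it is still used as a centroid.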
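For anyone who wants to try this on other lists, here is a minimal sketch of the same idea wrapped in a subroutine. The name cluster_by_centroid and the choice to pass the tolerance as the first argument are my own assumptions, not part of the code above; the logic is otherwise unchanged.

```perl
use strict;
use warnings;

# Hypothetical helper (not in the original post): same logic as above,
# but the tolerance is passed in instead of read from a file-scoped variable.
sub cluster_by_centroid {
    my ($tolerance, @nlist) = @_;

    # Candidate centroids: every value except the one at index 0.
    # A duplicate of the first value later in the list still counts.
    my %uniq;
    @uniq{ @nlist[ 1 .. $#nlist ] } = ();

    # Only 26 labels are available, so this sketch assumes at most 26 centroids.
    my @keys = ('A' .. 'Z');
    my %hoa;
    for my $centroid (sort { $a <=> $b } keys %uniq) {
        my $key = shift @keys;
        # Keep every number within $tolerance of the current centroid.
        $hoa{$key} = [ grep { abs($centroid - $_) <= $tolerance } @nlist ];
    }
    return %hoa;
}

my %hoa = cluster_by_centroid(1, 0, 0, 1, 2, 3, 3, 4, 5, 6, 8, 8, 10);
print "$_ => [@{$hoa{$_}}]\n" for sort keys %hoa;
```

With the sample list the first cluster comes out as A => [0 0 1] and the last as I => [10], because the second 0 keeps 0 among the centroids even though index 0 is skipped.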

Update: Small bugfix.

-- Hofmator


Replies are listed 'Best First'.
Re^4: Clustering Numbers with Overlapping Members by monkfan (Curate) on Aug 07, 2006 at 14:35 UTC
Hofmator,

One more small thing; I hope you won't mind taking a look. I should add that when the very first element has no neighbour, it forms a cluster of its own. In other words, we ignore the first element only when it has a neighbour within the tolerance. For example:

my @nlist     = ( 2,4,5,6,7 );
my $tolerance = 1;

We would like to have:

A => [2]
B => [4 5]
C => [4 5 6]
D => [5 6 7]
E => [6 7]

How can I modify your code to accommodate this?

Update: I think I got it.

#my @nlist = (0,0,1,2,3,3,4,5,6,8,8,10);
#my @nlist = (0,1,2,3,4,5,6,8,10);
my @nlist = (2,4,5,6,7);

my @key_list = ('A'..'Z');
my $tolerance = 1;
my %hoa;

my %uniq;

# Check if first element has a neighbour
if ( felem_has_nbr( $nlist[0], $nlist[1], $tolerance ) ) {
    # First element has a neighbour: skip it as a centroid
    @uniq{ @nlist[ 1 .. $#nlist ] } = ();
}
else {
    # First element is isolated: it forms its own cluster
    @uniq{ @nlist[ 0 .. $#nlist ] } = ();
}

for my $centroid ( sort { $a <=> $b } keys %uniq ) {
    my $key = shift @key_list;
    $hoa{$key} = [ grep in_range( $centroid, $_ ), @nlist ];
}

print "$_ => [@{$hoa{$_}}]\n" for sort keys %hoa;

sub in_range {
    my ( $centroid, $testnum ) = @_;
    return abs( $centroid - $testnum ) <= $tolerance;
}

# Returns 1 if the first element is within $tl of the second, else 0
sub felem_has_nbr {
    my ( $felem, $sec_in_arr, $tl ) = @_;
    return abs( $felem - $sec_in_arr ) <= $tl ? 1 : 0;
}

Regards,
Edward
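One edge case worth a look: the check above reads $nlist[1], which is undefined for a one-element list and would trigger a warning under "use warnings". Below is a rough sketch of one way to guard against that. The name centroid_candidates is hypothetical, and like the code above it assumes the list is sorted, so that the nearest neighbour of the first element is the second one.

```perl
use strict;
use warnings;

# Hypothetical variant of the check above: decide whether index 0 is dropped
# from the centroid candidates, without reading $nlist[1] when it is missing.
sub centroid_candidates {
    my ($tolerance, @nlist) = @_;
    return () unless @nlist;

    # The first element is skipped only if a second element exists
    # and lies within the tolerance of the first (assumes sorted input).
    my $skip_first = @nlist > 1
        && abs($nlist[0] - $nlist[1]) <= $tolerance;

    my %uniq;
    @uniq{ @nlist[ ($skip_first ? 1 : 0) .. $#nlist ] } = ();

    # Call this in list context; it returns the candidates in numeric order.
    return sort { $a <=> $b } keys %uniq;
}

print join(' ', centroid_candidates(1, 2, 4, 5, 6, 7)), "\n";   # first element kept
print join(' ', centroid_candidates(1, 7)), "\n";               # single element, no warning
```

The returned list can then be fed to the same "for my $centroid" loop as before.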
