in reply to 2D binning

much more efficient would be to divide the coordinates length by the bin side and round that off and concatenate those together and they form a pretty good hash key(and you don't need to compute minimum or maximum any more). but here's a solution following your line of thought:
use strict; use warnings; use List::AllUtils qw/minmax/; my @points = ( [3.70722,0.54617,], [3.67342,0.64892,], [3.75017,0.85075,], [3.82831,0.83506,], [3.71852,0.79251,], [3.80706,0.70426,], [3.91839,0.94094,], [3.90726,0.81722,], [3.89007,0.76107,], [3.98006,0.62477,], ); my $bins; my $binside = 0.1; my ($xmin,$xmax) = minmax( map { $_->[0] } @points ); my ($ymin,$ymax) = minmax( map { $_->[1] } @points ); for( my $xbin=$xmin ; $xbin<=$xmax ; $xbin+= $binside) { for(my $ybin=$ymin ; $ybin<=$ymax ; $ybin+= $binside) { my $p = $points[0]; if( $p->[0] >= $xbin && $p->[0] < $xbin + $binside && $p->[1] >= $ybin && $p->[1] < $ybin + $binside ) { $bins->{$xbin.$ybin} = [] unless($bins->{$xbin.$ybin}); push @{ $bins->{$xbin.$ybin} } , (shift @points); }; } } print scalar(keys(%$bins)); 4
this sounds a lot like the bin packing problem.

Replies are listed 'Best First'.
Re^2: 2D binning
by austinby (Initiate) on Feb 19, 2010 at 13:32 UTC
    Thanks a lot for your input monks,
    I'll test all your ideas and send updates.
    Austin-
      Hi perlmonks,
      I tested the code sent by spx2 and BrowserUk after installing the List::AllUtils and Data::Dump modules.
      I get "4" as output from spx2's code.
      #!/usr/bin/perl -slw use strict; use warnings; use List::AllUtils qw/minmax/; my @points = ( [3.70722,0.54617,], [3.67342,0.64892,], [3.75017,0.85075,], [3.82831,0.83506,], [3.71852,0.79251,], [3.80706,0.70426,], [3.91839,0.94094,], [3.90726,0.81722,], [3.89007,0.76107,], [3.98006,0.62477,], ); my $bins; my $binside = 0.1; my ($xmin,$xmax) = minmax( map { $_->[0] } @points ); my ($ymin,$ymax) = minmax( map { $_->[1] } @points ); for( my $xbin=$xmin ; $xbin<=$xmax ; $xbin+= $binside) { for(my $ybin=$ymin ; $ybin<=$ymax ; $ybin+= $binside) { my $p = $points[0]; if( $p->[0] >= $xbin && $p->[0] < $xbin + $binside && $p->[1] >= $ybin && $p->[1] < $ybin + $binside ) { $bins->{$xbin.$ybin} = [] unless($bins->{$xbin.$ybin}); push @{ $bins->{$xbin.$ybin} } , (shift @points); }; } } print scalar(keys(%$bins));
      Not sure what the "4" means.

      I get a "Global symbol "$Y" requires explicit package name at line " error with BrowserUk's code. When I change the positions of $Y and $X, I get a "Global symbol "$X" requires explicit package name at line "
      #!/usr/bin/perl -slw use strict; use List::Util qw[ sum ]; use Data::Dump qw[ pp ]; our $Y //= 10; our $X //= 10; my @xs = map -3+rand 6, 1 .. $X; my @ys = map -3+rand 6, 1 .. $Y; my %bins; for my $x ( @xs ) { my $xc = ( int( $x * 10 ) + 0.5 ) / 10; for my $y ( @ys ) { my $yc = ( int( $y * 10 ) + 0.5 ) / 10; ++$bins{ "$xc:$yc" }; } } pp \%bins; printf "Buckets used: %d, total values %d \n", scalar keys %bins, sum values %bins;
      There's a script online that does exactly what I'm trying to do (http://homepages.ulb.ac.be/~dgonze/SCRIPTS/PERL/histogram-2d.pl), but requires user-defined inputs. I'd like to do this such that all the user needs is the input file.

        You must be using a pre-5.10 version of Perl. Try:

        #!/usr/bin/perl -slw use strict; use Data::Dump qw[ pp ]; our $Y ||= 10; ## changed from //= for pre-5.10 our $X ||= 10; my @xs = map -3+rand 6, 1 .. $X; my @ys = map -3+rand 6, 1 .. $Y; my %bins; for my $x ( @xs ) { my $xc = ( int( $x * 10 ) + 0.5 ) / 10; for my $y ( @ys ) { my $yc = ( int( $y * 10 ) + 0.5 ) / 10; ++$bins{ "$xc:$yc" }; } } pp \%bins; printf "Buckets used: %d, total values %d \n", scalar keys %bins, sum values %bins;

        You can vary the number of datapoints generated by using a command line like:

        perl -s theScript.pl -X=100 -Y=100

        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        "Science is about questioning the status quo. Questioning authority".
        In the absence of evidence, opinion is indistinguishable from prejudice.
        4 is the number of bins. do you need the contents of the bins ? also , the code at the link you indicated is ancient Perl4 code, you don't want to use that.