comment on

I've put together some not very elegant code that seems to do pretty much what you want.

Concerning overlaps, as brought up by jdporter, your test data actually contains such an instance. Consider the sequence:

( 5900, 5990, 6000, 6402, 6450 )

If $interval is 500, how should this sequence be dealt with? For the purposes of my code, I interpreted your words "find all the groups that are within $interval of each other" literally, splitting this sequence into two "clusters", namely:

( 5900, 5990, 6000 ) and ( 5990, 6000, 6402, 6450 )

But the alternative solution offered in your follow-up post would be a lot easier to code. ;)

The code also deals with numbers appearing in multiple lists.

This is the Data::Dump output for the first roman numeral:

(
  "I",
  [
    [1, [4, 10, 200], 3, [0, 95], 2, [75, 350]],
    [1, [6000], 3, [5990], 2, [5900]],
    [1, [6000], 3, [5990, 6450], 2, [6402]],
    [1, [7000, 7100], 2, [7110]],
  ],
)
[download]

Oh, and by the way, you shouldn't use $a and $b as variable names, or strictures (you do use strict; don't you?) will complain when you do a sort:

Can't use "my $a" in sort comparison at path/to/prog.pl line 30.

Anyway, here's the program:

use strict;
use warnings;
use Data::Dump 'dump';

my $INTERVAL = 500;

# Don't know how you get your indices, so:
my ( $x, $y, $z ) = ( 1, 2, 3 );
my @indices = ( $x, $y, $z );

my %orig = (
  I => {
    $x => [ 4, 10, 200, 6000, 7000, 7100 ],
    $y => [ 75, 350, 5900, 6402, 7110 ],
    $z => [ 0, 95, 5990, 6450 ] },
  II => {
    $x => [ 0, 20, 501, 1000 ],
    $y => [ 22, 23, 24 ],
    $z => [ 22, 23, 500 ] },
);

my %outhash;

foreach my $key ( keys %orig ) {
  my ( @merged, @graded, @prev );

  foreach my $a_ref ( @indices ) {
    push @merged, map { $_ . '.' . $a_ref } @{ $orig{$key}{$a_ref} };
  }
  @merged = sort { $a <=> $b } @merged;

  foreach my $i ( 0 .. $#merged ) {
    my @cluster;
    foreach my $j ( $i .. $#merged ) {
      last if $merged[$j] - $merged[$i] > $INTERVAL;
      push @cluster, $merged[$j];
    }
    if ( not subset_of( \@prev, \@cluster ) ) {
      push @graded, [ @cluster ];
    }
    @prev = @cluster;
  }

  foreach my $a_ref ( @graded ) {
    my %HoA;
    for ( @{ $a_ref } ) {
      my ( $int, $idx ) = split /\./;
      push @{ $HoA{$idx} }, $int;
    }
    push @{ $outhash{$key} }, [ %HoA ];
  }
}

dump( %outhash );

sub subset_of {
  my %first = map { $_ => undef } @{ $_[0] };
  my @second = @{ $_[1] };
  for ( @second ) {
    return 0 unless exists $first{$_};
  }
  return 1;
}
[download]

In reply to Re: taking a subset of HoHoA, retaining relative structure? by Not_a_Number
in thread taking a subset of HoHoA, retaining relative structure? by mdunnbass

Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!

Titles consisting of a single word are discouraged, and in most cases are disallowed outright.

Read Where should I post X? if you're not absolutely sure you're posting in the right place.

Please read these before you post! —

Posts may use any of the Perl Monks Approved HTML tags:

a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr

You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)

	For:		Use:
	&		`&`
	<		`<`
	>		`>`
	[		`[`
	]		`]`

Link using PerlMonks shortcuts! What shortcuts can I use for linking?

See Writeup Formatting Tips and other pages linked from there for more info.