jperlq has asked for the wisdom of the Perl Monks concerning the following question:

Hello,
Perl sorting, here is the task.

11_22   A
22_45   B
22_33   C
25_28   D
47_49   E
...

I want to group the A,B,C,D such that the ranges listed in the first column are grouped together so B contains the set of B->{C, D}, C->{D}, A->{} etc.

I am looking for a way to sort and group by ranges.

Replies are listed 'Best First'.
Re: sorting and grouping by
by Joost (Canon) on Mar 13, 2008 at 00:43 UTC
    Since you haven't defined HOW you want to sort the ranges I will ignore it. I'll also ignore how to read that data.
    #!perl -w use strict; use Data::Dumper 'Dumper'; my %ranges = ( a => [11,22], b => [22,45], # etc ); my %r2 = %ranges; # need copy to reentrantly iterate over %ranges # perl sucks sometimes. my %results; while (my ($name,$range) = each %ranges) { while (my ($name2,$range2) = each %r2) { if ($name ne $name2 and $range->[0] <= $range2->[0] && $range->[1 +] >= $range2->[0]) { push @{$results{$name}},$name2; } } $results{$name} ||= []; } print Dumper(\%results);
      Thank you
      The copying of the hash and while inside a while worked.
Re: sorting and grouping by
by NetWallah (Canon) on Mar 13, 2008 at 01:12 UTC
    Here is an OO approach - makes the code a little more readable
    #!/usr/bin/perl -w use strict; use Data::Dumper 'Dumper'; { package Range; sub new{ my $class = shift; return bless {map { $_=> shift} qw[NAME START FIN]}, $class; } sub contains{ my ($self, $other) = @_; return $self->{START} <= $other->{START} && $self->{FIN} >= $other->{FIN} ; } sub add_subrange{ my ($self, $other) = @_; push @{$self->{SUBRANGES}}, $other; } 1 } #--- Main code --- my @ranges = ( new Range(qw[A 11 22]), new Range(qw[B 22 45]), new Range(qw[C 22 33]), new Range(qw[D 25 28]), new Range(qw[E 47 49]), ); for my $r1 (@ranges){ for my $r2(@ranges){ next if $r1 == $r2; $r1->add_subrange($r2) if $r1->contains($r2); } print Dumper \$r1; }
    Result:
    $VAR1 = \bless( { 'NAME' => 'A', 'START' => 11, 'FIN' => 22 }, 'Range' ); $VAR1 = \bless( { 'NAME' => 'B', 'START' => 22, 'SUBRANGES' => [ bless( { 'NAME' => 'C', 'START' => 22, 'FIN' => 33 }, 'Range' ), bless( { 'NAME' => 'D', 'START' => 25, 'FIN' => 28 }, 'Range' ) ], 'FIN' => 45 }, 'Range' ); $VAR1 = \bless( { 'NAME' => 'C', 'START' => 22, 'SUBRANGES' => [ bless( { 'NAME' => 'D', 'START' => 25, 'FIN' => 28 }, 'Range' ) ], 'FIN' => 33 }, 'Range' ); $VAR1 = \bless( { 'NAME' => 'D', 'START' => 25, 'FIN' => 28 }, 'Range' ); $VAR1 = \bless( { 'NAME' => 'E', 'START' => 47, 'FIN' => 49 }, 'Range' );

         "As you get older three things happen. The first is your memory goes, and I can't remember the other two... " - Sir Norman Wisdom

Re: sorting and grouping by
by apl (Monsignor) on Mar 13, 2008 at 01:14 UTC
    You haven't defined what you mean by sorting, or grouping. You haven't explained what -> means relative to set theory. Your definition of what comprises a set is inconsistant. (A is empty?)

    Could you provide a little more detail please?

      First, in my defense, it looks like Joost had some confusion about what you meant as well.

      Second, A intersects B at 22, while B intersects C at ( 22 .. 33 ), and wholly contains D. The statement

      B contains the set of B->{C, D}, C->{D}, A->{} etc.
      seems to differentiate '->' from 'contains', as well as using two different ways of defining a set ( '{}', and 'the set').

      In any event, I'm glad you got the help you needed.

      -> seems to mean "has for subsets", since B->{C, D} seems to mean "C" and "D" are wholly contained in "B".
    A reply falls below the community's threshold of quality. You may see it by logging in.
Re: sorting and grouping by
by rg0now (Chaplain) on Mar 13, 2008 at 21:25 UTC
    Here is some (admittedly, very inmature) Haskell to solve this problem.
    type Data = (Int, Int, String) bottom (b, _, _) = b top (_, t, _) = t ide (_, _, i) = i contains :: Data -> Data -> Bool x `contains` y = bottom x <= bottom y && top x >= top y rangeSort1 :: [Data] -> [(Data, [Data])] rangeSort1 r = map (\x -> (x, (filter (contains x) (filter (/=x) r)))) + r ---- test :: [Data] test = [ (11, 22, "A") , (22, 45, "B") , (22, 33, "C") , (25, 28, "D") , (47, 49, "E") ]
    And here is the corresponding Perl, or at least, this is the closest I could get to this 'declarational' style in Perl:
    sub bottom { shift->[0] }; sub top { shift->[1] }; sub ide { shift->[2] }; sub contains { bottom($_[0]) <= bottom($_[1]) && top($_[0]) >= top($_[ +1]) }; sub rangeSort1 { map { my $x = $_; [$x, [grep {contains($x, $_)} (grep { ide($_) ne ide($x) } @_) ] ] } @_ }; my $test = [ [11, 22, "A"] , [22, 45, "B"] , [22, 33, "C"] , [25, 28, "D"] , [47, 49, "E"] ];
    If there's a Haskell/Perl guru out there with some free time, could you please improve on this?

    Update: slightly improved Haskell:

    rangeSort1 :: [Data] -> [(Data, [Data])] rangeSort1 r = mapMaybe (\x -> let l = filter (contains x) . filter (/ +=x) in case l r of [] -> Nothing otherwise -> Just (x, l r) ) r