andreas1234567 has asked for the wisdom of the Perl Monks concerning the following question:

I'm playing with Data::PowerSet. It creates all subsets of a list of elements. Starting with a set @a = 'A' .. 'E', I want to keep a subset @b of @a if and only if the elements of @b are adjacent in @a. In the example below, I want to keep the values marked with stars only. Is there a module like List::Util or List::MoreUtils that can help?
use strict; use warnings; my @a = 'A' .. 'E'; use Data::PowerSet 'powerset'; my $sets = powerset({ min => 3 }, @a); foreach my $b (@$sets) { print join('', @{$b}); } __END__ ABCDE * BCDE * ACDE CDE * ABDE BDE ADE ABCE BCE ACE ABE ABCD * BCD * ACD ABD ABC *
In this example the elements of @a are single letters, but I see no reason why they can't be multi-letter strings.
--
No matter how great and destructive your problems may seem now, remember, you've probably only seen the tip of them. [1]

Replies are listed 'Best First'.
Re: Subsets and adjacent values
by grinder (Bishop) on May 30, 2008 at 13:59 UTC

    As much as I am chuffed that you're using a module I wrote, I'm afraid there's a much easier way of doing this. Just set down a marker at the beginnning of the list, move the right marker down as far as the shortest length of subset you want, and then start pushing out out the subsets as you move to the end of the string. When you get to the end, move the beginning marker one to right, and repeat while you can continue to squeeze out a subset. Then stop.

    Programmatically, this gives:

    my @set = ('A' .. 'Z', 0 .. 3); my $min = 3; my @result; for my $begin (0 .. $#set-$min+1) { for my $end ($begin+$min-1 .. $#set) { push @result, "@set[$begin .. $end]"; } } { local $" = ''; # interpolate an array without spaces print "$_\n" for @result; }

    This will be really, really fast. If you try your approach on a powerset of 30 elements your program will still be running after the heat-death of the universe.

    update: oops, BrowkerUK beat me to it. That'll teach me to get side-tracked into a conversation in real life and leave the form in preview mode. Still, it's interesting to compare the two techniques: he chose start postition and length, I chose start and end positions. Pick your poison :)

    • another intruder with the mooring in the heart of the Perl

      Actually, I tried to patch Data::PowerSet with an option to create adjacent output only:
      $ diff -ubB PowerSet.pm.org PowerSet.pm --- PowerSet.pm.org 2008-05-30 11:22:31.000000000 +0200 +++ PowerSet.pm 2008-05-30 12:46:32.000000000 +0200 @@ -9,7 +9,7 @@ use Exporter; use vars qw/$VERSION @ISA @EXPORT_OK/; -$VERSION = '0.05'; +$VERSION = '0.06'; @ISA = ('Exporter'); =head1 NAME @@ -18,8 +18,8 @@ =head1 VERSION -This document describes version 0.05 of Data::PowerSet, released -2008-05-13. +This document describes version 0.06 of Data::PowerSet, released +2008-05-30. =head1 SYNOPSIS @@ -161,6 +161,22 @@ When this attribute is used, the C<next()> method will return a scalar rather than a reference to an array. +=item B<adjacent> + +When this attribute is used, the returned list will contain adjacent +elements only. E.g. + + my $ps = Data::PowerSet->new( {min=>3, adjacent=>1}, 2, 3, 5, 8, 11 + ); + +will produce + + 2, 3, 5, 8, 11 + 2, 3, 5, 8 + 3, 5, 8, 11 + 2, 3, 5 + 3, 5, 8 + 5, 8, 11 + =back =cut @@ -195,6 +211,13 @@ ($args{min}, $args{max}) = ($args{max}, $args{min}) if $args{max} < $args{min}; + + $args{adjacent} = + exists $args{adjacent} + ? $args{adjacent} + : 0 + ; + return bless \%args, $class; } @@ -217,9 +240,18 @@ return undef unless $self->{current} >= 0; my $mask = $self->{current}--; my $offset = 0; + my $last = -1; @set = (); while( $mask ) { + if ($self->{adjacent}) { + if ($mask & 1 && (($offset-1 == $last) || sca +lar(@set) == 0)) { + $last = $offset; + push @set, $self->{data}[$offset]; + } + } else { push @set, $self->{data}[$offset] if $mask & 1; + } + $mask >>= 1; ++$offset; }
      However, it produces one element twice, and I can't figure out why.
      --
      No matter how great and destructive your problems may seem now, remember, you've probably only seen the tip of them. [1]
Re: Subsets and adjacent values
by BrowserUk (Patriarch) on May 30, 2008 at 12:00 UTC

    It would be far better to just produce the sets you want than to produce more than you want and then throw some away. The sets you are after are quite easily produced iteratively with a couple of loops:

    After the fog cleared:

    #! perl -slw use strict; sub adjSets{ my $min = shift; map { my $start = $_; map { [ @_[ $start .. $start+$_ ] ] } $min-1 .. $#_-$start; } 0 .. $#_ - $min; } sub adjSets2{ my $min = shift; my @rv; for my $start ( 0 .. $#_ - $min ) { for my $len ( $min-1 .. $#_-$start ) { push @rv, [ @_[ $start .. $start+$len ] ]; } } return @rv; } print "@$_" for adjSets( 2, 'A'..'E' );; __END__ c:\test>689189 A B A B C A B C D A B C D E B C B C D B C D E C D C D E

    Update: Ignore below. T'was too early in my day, so just downvote this. Comments unnecessary :)

    Early morning brainfart


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
      Thanks a lot. However, it seems like it misses the last iteration (but that is easily fixed):
      < } 0 .. $#_ - $min; --- > } 0 .. $#_ - $min + 1;
      --
      No matter how great and destructive your problems may seem now, remember, you've probably only seen the tip of them. [1]
Re: Subsets and adjacent values
by pc88mxer (Vicar) on May 30, 2008 at 14:53 UTC
    I want to keep a subset @b of @a if and only if the elements of @b are adjacent in @a.
    This is a equivalent to saying that @b is a slice of @a - at least that what it seems like from your example. As grinder pointed out, to generate slices you just need to pick a starting point and a length.

    However, I can think of another possible interpretation of 'adjacent' which allows the sequences BA and CBA, etc.

Re: Subsets and adjacent values
by blazar (Canon) on May 30, 2008 at 15:32 UTC
    I'm playing with Data::PowerSet. It creates all subsets of a list of elements.

    I personally believe that before proceeding you should clarify your terms, especially those I highligheted above, since Mathematics is based upon precise definitions. You start talking about sets, and subsets, then mix them arbitrarily with lists. But sets and lists are different mathematical objects, albeit relatively similar ones. For sets, you cannot speak of "adjacency" while for lists you can.

    Now, unless I'm grossly mistaken, it seems to me that your problem may be translated into an equivalent but simpler one, namely that of finding all of the substrings of a given string, perhaps with a constraint on their lenght, as per your example. HTH.

    --
    If you can't understand the incipit, then please check the IPB Campaign.
      There is a little confusion of terms going on here. Mathematically a set is an un-ordered collection. However, the powerset() routine in Data::PowerSet returns arrays which are ordered.

      What powerset() is really doing is producing all the sub-sequences (with gaps between elements allowed) of the input sequence. If your input sequence contains distinct elements, then you can think of it as being a set and think of all of the generated sub-sequences as subsets. That is, there is a correspondence between a sequence of distinct elements and the set consisting of those elements.

      So, the powerset() routine can be used to produce either 1) all the sub-sequences of a given sequence or 2) all the subsets of a given set.

        I personally believe all this is evident and obvious, but... I can't understand why you're telling me! I knew. Still stands the point I tried to make to the OP: namely, that he should carefully choose the frameset for the problem. He reasoned in terms of sets: that these are "implemented" (please bear with me being slightly sloppy wrt terminology) with lists in the referenced module was considered a side effect to be used afterwards, which suggested him an overkill of an algorithm, while if he had resoned in terms of lists in the first place, he would have probably chosen a saner one to start with.

        --
        If you can't understand the incipit, then please check the IPB Campaign.
      You start talking about sets, and subsets, then mix them arbitrarily with lists. But sets and lists are different mathematical objects, albeit relatively similar ones.
      Good point. As the OP I humbly apologize and take full responsibility for the confusion that followed the initial post.
      --
      No matter how great and destructive your problems may seem now, remember, you've probably only seen the tip of them. [1]