Beefy Boxes and Bandwidth Generously Provided by pair Networks
Don't ask to ask, just ask
 
PerlMonks  

reduce like iterators

by LanX (Saint)
on Jan 03, 2011 at 17:57 UTC ( [id://880228]=perlquestion: print w/replies, xml ) Need Help??

LanX has asked for the wisdom of the Perl Monks concerning the following question:

Hi

First I thougt about writing a longer meditation about iterators and semipredicate problem, but maybe a simple seek for perl wisdom is a better start.

In gnu.emacs.help someone asked

How would you code this simple list compression problem in Ruby:

1.08 (**) Eliminate consecutive duplicates of list elements. If a list contains repeated elements they should be replaced with a single copy of the element. The order of the elements should not be changed.

Example:

?- compress([a,a,a,a,b,c,c,a,a,d,e,e,e,e],X). X = [a,b,c,a,d,e]

my first idea to solve it in perl was grep

DB<1> my $p; print grep { $p = $_ if $_ ne $p } (a,a,a,a,b,c,c, +a,a,d,e,e,e,e); abcade

but this approach fails for false list elements like e.g "0"!

A nice exmaple where a grep like map is better than the plain grep

DB<2> my $p; print map { $p ne $_ ? $p = $_ : () } (a,a,a,a,b,c,c,a +,a,d,e,e,e,e) abcade

But beside the need to lexicalize $p in the outer scope ...

using state wouldn't be more elegant :

DB<3> use feature "state"; print map { state $p; $p ne $_ ? $p = $_ : + () } (a,a,a,a,b,c,c,a,a,d,e,e,e,e) abcade

... it's still buggy if the compressed list was starting with undef values.

Thats a variation of the old Semipredicate problem:

"One can't find an initial value which is not also potentially a part of the iterated list!"

So I had a look into List::Util and the only appropriate approach was reduce

The first call will be with $a and $b set to the first two elements of the list, subsequent calls will be done by setting $a to the result of the previous call and $b to the next element in the list.

Looks perfect, but reduce {} LIST only returns a scalar and List::MoreUtils doesn't seem to provide anything better¹.

(Sure I could use $a as an array-ref for accumulation, not very elegant...)

Temporary conclusion:

There seem to be a lack of iterators allowing to compare successive elements

And now I'm wondering about the best design ...

could a special variable like $^PRE generally help augmenting existing iterators, without the need to define and appropiately name a whole new family of reduce-like iterators?

( IMHO too many specialized functions like in LISP are hard to remember.)

something like

print grep { $_ ne $^PRE } (a,a,a,a,b,c,c,a,a,d,e,e,e,e); abcade

Thoughts?

Cheers Rolf

¹) couldn't find much more on CPAN...

Replies are listed 'Best First'.
Re: reduce like iterators
by ELISHEVA (Prior) on Jan 03, 2011 at 18:27 UTC

    This is fairly easy to implement and wrap in a sub, even with the undefs. Once the sub is written you are back to having a one-liner. What would you like that this wouldn't do?

    use strict; use warnings; sub compress { my $x; map { if (!defined($x)) { defined($_) ? ($x = $_) : () } else { defined($_) && ($x eq $_) ? () : ($x = $_) } } @_; } my @aData=(qw(a a a a b c c a a d e e e e), undef, undef, qw(f g g)); my @aCompressed = compress @aData; print "compressed: @aCompressed\n"; # outputs a b c a d e undef f g

    Or if you want to play golf (though others I'm sure can do better)

    sub compress { my$x;map{defined($x)?(defined($_)&&($x eq$_)?():($x=$_) +):defined($_)?($x=$_):()}@_; }

    Update: added golf

    Update:: fixed mistake - undefs were strings.

      Sure, I also did a function like this in the emacs help group

      But I was asking about designing a generic iterator solution, opening a variety of applications and augmenting readablity

      already a reduce_list {BLOCK} LIST variant of reduce could be sufficient in this case

       reduce_list { $a ne $b } qw(a a a a b c c a a d e e e e undef undef f g g)

      another application would be Haskells group iterator which partitions into successive groups:

      group [1, 1, 1, 1, 2, 3, 3, 1, 1, 4, 5, 5, 5, 5] ==> [[1,1,1,1], [2], [3,3], [1,1], [4], [5,5,5,5]]

      Simpling allowing access to the previous iteration easily would help implementing this with part from List::MoreUtils

       part { $a eq $b } LIST

      or

       part { $^PRE eq $_ } LIST

      Without the need to invent and name a whole new group of iterators.

      Cheers Rolf

        Your reduce_list is named similarly to reduce, but bears no resemblance since it doesn't to reduce at all.

        • reduce allows an arbitrary state to be passed from one pass to another. Your reduce_list doesn't.

        • reduce can return any value, not just the input. reduce_list can return at most one scalar, and it can only be the input. (That's not very "listy"!)

        Because of those reasons, reduce is a general purpose function. (It can implement any other function in List::Util.) Your reduce_list is just grep with access to the last element.

        This accounts for the differences with what I suggested it should look like.

        Perl also lets you use parameters to roll your own block syntax. Here is a slightly different definition that would allow you to do arbitrary operations on the first element of each run in a list:

        Or if you want to be able to work with the current run value ($b) and the previous run value $a, you could do something like this:

        There is almost no limit to what you could create. For example, if you wanted to be able to group items, you could additionally track the number of items in each run and set $_ to number of items in the current run:

        Would that do what you want?

        Update: added yet another block iterator, this time one that could be used for generating groups as in the Haskell function above.

        Update: put in readmore tags

        Update: fixed 3rd example so it prints count of current (not previous) run.

Re: reduce like iterators
by BrowserUk (Patriarch) on Jan 03, 2011 at 18:49 UTC

    Compress() is easily defined in terms of a generic adjacent pairing routine:

    #! perl -slw use strict; sub adjacentPairs (&@) { my $code = shift; map { local( $a, $b ) = @_[ $_-1, $_ ]; $code->(); } 1 .. @_; } print adjacentPairs{ defined $b && $a eq $b ? () : $a } qw[a a a b c c a a d e e e e]; __END__ C:\test>compress abcade

    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
Re: reduce like iterators
by JavaFan (Canon) on Jan 03, 2011 at 19:05 UTC
    You have to be careful with any state solution. They'll work fine until you go through the same code path again (due to a loop, the sub it's in called again, etc). Being a state variable, it'll retain whatever was in it last. So you have to be careful you aren't comparing the last element of the first list with the first element of the second list.
    @cleaned = @uncleaned[0, grep {@uncleaned[$i] ne @uncleaned[$i-1]} 1.. +$#uncleaded];
    It needs some additional hackery to deal with undefs correctly.
      argh!!!

      I missed that:

      DB<3> use feature "state"; map { state $p; print $p;$p=$_} 1..3 12 DB<5> use feature "state"; for (1..3) {map { state $p; print $p;$p=$ +_} 1..3} 12312312

      thanks a lot! :)

      Cheers Rolf

        This particular problem is solved in Perl 6 by always cloning all blocks as closures the same way, so state is reset in loops as well as in map. (In fact, the for loop is defined in terms of map in Perl 6. So for loops return a list of values just like map does.) Perl 5 could probably move in the direction of fewer special-cased blocks that aren't true closures; this would have many subtle benefits. Not everything in Perl 6 can be borrowed back, but I suspect this is one of them.
Re: reduce like iterators
by moritz (Cardinal) on Jan 03, 2011 at 19:20 UTC
    I couldn't resist writing a "golfed" Perl 6 version, heavily inspired by functional programming techniques:
    sub compress(*@a) { flat @a Zxx 1, (@a Zne @a[1..*-1]) } say compress(<a a a a b c c a a d e e e e>).perl;

    The *-1 is a workaround for a rakudo bug, in a perfect Perl 6 implementation 1..* would work too.

Re: reduce like iterators
by NERDVANA (Deacon) on Jan 10, 2024 at 03:00 UTC
    This probably runs faster than anything with map or grep, and isn't much longer than the fully golfed ones.
    sub compress { my @x; !@x || $_ ne $x[-1] and push @x, $_ for @{$_[0]}; \@x }
Re: reduce like iterators
by tybalt89 (Monsignor) on Jan 10, 2024 at 21:49 UTC

    It's much simpler than the "semipredicate problem" because all you have to do is set the previous value to "not the first value".

    #!/usr/bin/perl use strict; # https://perlmonks.org/?node_id=880228 use warnings; sub squeeze { no warnings; my $prev = not $_[0]; return grep +($_ ne $prev, $prev = $_)[0], @_; }
Re: reduce like iterators
by furry_marmot (Pilgrim) on Jan 04, 2011 at 01:16 UTC

    Hi LanX,

    map{} will pass on undef's, but grep{} needs undef to tell whether something passed the filter. But you can stack any number of maps and greps together, which is how I came up with a workaround for it, using two maps and a grep.

    The first map{}, reading from the bottom up, simply marks duplicates in-a-row (as opposed to any dupes at all) by changing them to 'aardvark'. Undef's and zeroes go through without problem. The second map{} changes actual undefined elements to the text 'undef'. Change these as suits your algorithm. The grep{} is where the duplicates/aardvarks are finally removed.

    It seems to work, accommodating undefined values, while still being simple enough to use in a one-liner, and uses array refs so it doesn't pass whole lists back and forth.

    use strict; my @orig = ( qw(a a b b c 0 c d d u u), (undef, undef, 'blink', 'blink'), qw(0 0 v v w w a a a b b b c c c) ); my @list; nodupes (\@orig, \@list); print join ' ', @list, "\n"; sub nodupes { my ($ar1, $ar2) = @_; my $p; push @$ar2, grep{$_ ne 'aardvark'} map{defined $_ ? $_ : 'undef'} map { $p ne $_ ? $p = $_ : 'aardvark' } @$ar1; } __END__ Prints --> a b c 0 c d u undef blink 0 v w a b c
    And the one-liner -- actually broken up for easier viewing:
    perl -e "@list = grep{$_ ne 'aardvark'} map{defined $_ ? $_ : 'undef'} map{$p ne $_ ? $p = $_ : 'aardvark'} (qw(a a b b c 0 c u u), (undef, undef), qw(0 0 v v w w)); print join ' ', @list;" Prints --> a b c 0 c u undef 0 v w

    I hope you find this interesting/useful.

    --marmot

    UPDATE: I was thinking about this some more, and realized I had made it waaaay too complicated. The nodupes() below pretty much is a one-liner, and accomodates 0, "0", and undef just fine. And nodupes() can be used inline with other maps and greps. In the end, the grep was the only thing needed.

    use strict; sub nodupes { my $p; return grep{ $_ ne '~~' } map { $p ne $_ ? $p = $_ : '~~' } @_; } my @orig = ( qw(a a b b c 0 c d d u u), (undef, undef, 'blink', 'blink'), qw(0 0 v v "0" "0" "0" w w a a a b b b c c c) ); my @new = nodupes @orig; print join ' ', @new, "\n"; __END__ Prints --> a b c 0 c d u blink 0 v "0" w a b c ^^ There's an undef between these two spaces.
      Any reason not to just do:
      my $p; my @new = grep { $p = $_ or 1 if $p ne $_ } @orig;
      ?
        Ummm....because I didn't think of that? :-) Thanks! You just expanded my understanding of what you can do with grep.

        --marmot

        Try @orig starting with undef.

        Also see the OP for the "semi-predicate problem" discussion.

        If this seems too theoretical for you, consider the practical task to do of a run-length encoding of sparsely set arrays. (undef is a real value)

        Cheers Rolf

Re: reduce like iterators
by Anonymous Monk on Jan 29, 2024 at 09:59 UTC
      > Since somebody revived this really old node

      well you did

      > mention there's a recent ticket

      created by you

      > Also found a ticket in the queue of List::MoreUtils, but the author claims the functionality already exists in List::Util ?!

      He's the maintainer, I think the authors abandoned it. And he most likely misunderstood the question.

      I already contacted him privately regarding the scalar behavior of slide, his answer was kind of indicating that he would pass on the maintainership if asked.

      Cheers Rolf
      (addicted to the Perl Programming Language :)
      see Wikisyntax for the Monastery

Re: reduce like iterators
by talexb (Chancellor) on Jan 11, 2024 at 14:34 UTC
    #!/usr/bin/perl # 2024-0111: From the node: "Eliminate consecutive duplicates of list # elements. If a list contains repeated elements they should be # replaced with a single copy of the element. The order of the # elements should not be changed." use strict; use warnings; { my @orig = qw/a a a a b c c a a d e e e e/; my @correct = qw/a b c a d e/; my @soln; for ( push ( @soln, shift @orig ); @orig; ) { my $val = shift @orig; if ( $soln[-1] ne $val ) { push ( @soln, $val ); } } print "Correct solution is @correct;\n"; print "My solution is @soln.\n"; }

    Because my background (before Perl) is in C, that automatically looks like a for loop problem to me. In other words, you need to prime the pump so that you're not forced into doing something Really Clever for the first element.

    And I could have written

    push ( @soln, $val ) if ( $soln[-1] ne $val );
    to make it more Perl-ish, but I'm not a fan of the postfix syntax. My OldSchool brain wants to see an if statement at the beginning of the line.

    And, yes, this assumes that the list is string values (or stringable values), and does not deal with the undef value. That reminds me of a Google interview that I had a while back ("What about this ridiculous limitation to the solution?" "And how about this even more insane exception?").

    Alex / talexb / Toronto

    Thanks PJ. We owe you so much. Groklaw -- RIP -- 2003 to 2013.

      push ( @soln, $val ) if ( $soln[-1] ne $val );

      The brackets around the arguments to push are unnecessary, even in the full-blown if-block you used. With postfix-if you can lose the other brackets too:

      push @soln, $val if $soln[-1] ne $val;

      I know some folks like the extra brackets but I find removing them aids clarity.


      🦛

        Sure, but then you're left with

          push @soln, $val if ( $soln[-1] ne $val );
        and with the comma suggesting do this thing, then do that other thing, it could read like this:
          push @soln # and then .. $val if ( $soln[-1] ne $val );
        Huh. That can't be right.

        You could also use indentation to make it clearer ..

          push @soln, $val if ( $soln[-1] ne $val );

        Anyway, it's a matter of taste, and my preferences comes from writing C in the 80's.

        Alex / talexb / Toronto

        Thanks PJ. We owe you so much. Groklaw -- RIP -- 2003 to 2013.

Re: reduce like iterators
by LanX (Saint) on Jan 15, 2024 at 17:56 UTC
    Hello younger me :)

    > and List::MoreUtils doesn't seem to provide anything better

    This changed in the meantime, slide was added, but there are still issues.

    The worst one is that it only returns scalars instead of lists like map does, which means we have to add an extra grep to filter undefined values :/

    use v5.12; use warnings; use List::MoreUtils qw/slide/; use Data::Dump; my @x= split',', q(a,a,a,a,b,c,c,a,a,d,e,e,e,e); dd grep defined, slide { $a ne $b ? $b : () } "", @x;

    Cheers Rolf
    (addicted to the Perl Programming Language :)
    see Wikisyntax for the Monastery

      > The worst one is that it only returns scalars instead of lists like map does, which means we have to add an extra grep to filter undefined values :/

      That's most likely a bug in the XS implementation of slide.

      I looked into the Pure Perl implementation in List::MoreUtils::PP, and it looked fine, it's literally using map

      And a test reveals, that it really works like it should.

      use v5.12; use warnings; use Data::Dumper; BEGIN { $ENV{LIST_MOREUTILS_PP} = 1 }; # enforce PP version, comment f +or XS use List::MoreUtils qw/slide/; my @x= split',', q(a,a,a,a,b,c,c,a,a,d,e,e,e,e); print Dumper slide { $a ne $b ? $b : () } "", @x;

        PP version

        $VAR1 = 'a'; $VAR2 = 'b'; $VAR3 = 'c'; $VAR4 = 'a'; $VAR5 = 'd'; $VAR6 = 'e';

        XS version

        $VAR1 = 'a'; $VAR2 = undef; $VAR3 = undef; $VAR4 = undef; $VAR5 = 'b'; $VAR6 = 'c'; $VAR7 = undef; $VAR8 = 'a'; $VAR9 = undef; $VAR10 = 'd'; $VAR11 = 'e'; $VAR12 = undef; $VAR13 = undef; $VAR14 = undef;

      Cheers Rolf
      (addicted to the Perl Programming Language :)
      see Wikisyntax for the Monastery

        > That's most likely a bug in the XS implementation of slide.

        The word "bug" should be a link to the actual bug report.

        map{substr$_->[0],$_->[1]||0,1}[\*||{},3],[[]],[ref qr-1,-,-1],[{}],[sub{}^*ARGV,3]
Re: reduce like iterators
by ikegami (Patriarch) on Jan 03, 2011 at 18:51 UTC
    my @b = uniq @a; my %seen; my @b = grep !$seen{$_}++, @a;

    But if you really want to use reduce with just one expression, it's definitely possible.

    my @b = @{( reduce { push @{$a->[1]}, $b if !$a->[0]{$b}++ } [ {}, [] +], @a; $a )->[1]};
      read the example again, it's not about uniq but identifying sequences!

      update:

      DB<1> use List::MoreUtils "uniq"; print uniq qw(a a a a b c c a a d +e e e e) abcde

      Cheers Rolf

        Oops, adjusted:

        my @b = @{ reduce { push @$a, $b if !@$a || $b ne $a->[-1]; $a } [], @ +a };

        If one were to make a list version of reduce, the callback would need access to three variables: The list (say $_), the state (say $a), the current value (say $b). The problem could be solved as follows:

        my @b = list_reduce { push @$_, $b if !@$_ || $b ne $_->[-1]; undef } +undef, @a; my @b = list_reduce { push @$_, $b if $a || $b ne $_->[-1]; 0 } 1, @a; my @b = list_reduce { push @$_, $b if defined($b) && $b ne $a; $b } un +def, @a; my @b = list_reduce { push @$_, grep defined && $_ ne $a, $b; $b } und +ef, @a;

        (@a may not start with undef for the last two to work properly.)

        Update: Added everything after the first block of code.

Re: reduce like iterators
by Anonymous Monk on Jan 03, 2011 at 19:43 UTC

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlquestion [id://880228]
Approved by ELISHEVA
Front-paged by Arunbear
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others goofing around in the Monastery: (5)
As of 2024-04-18 23:01 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found