in reply to The Virtue of Laziness

Replying to myself for a summary, hope that's ok.

Preface: I'm not trying to solve any real-world problems. Please don't take these thoughts too seriously.

The questions asked in this thread led me to a clearer understanding of what I was actually asking for. It is not only about "how to exit a loop as quickly as possible", as it might appear from my original post; that is really only half of the question. Consider subs that naturally should "short circuit" when applied to a list:

When considering the performance of such implementations, we obviously should not blame them for the effort needed to place their arguments onto the stack. Therefore the arrays should be passed by reference to the implementing sub. This is where the first misunderstanding stems from: the term "lazy deref" in my OP implies that an array ref is needed to be efficient at all.
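To illustrate the stack-cost point above, here is a minimal sketch (sub names are mine, not from the OP) of the same short-circuit check written twice: once taking a flat list, once taking a reference.

```perl
use strict;
use warnings;

# Flat-list version: every element is copied onto the argument stack
# before the sub even starts, so an early exit saves nothing up front.
sub first_zero_list {
    for (@_) { return 1 if !$_; }
    return 0;
}

# Reference version: only one scalar (the ref) is passed; the loop can
# bail out after inspecting a single element.
sub first_zero_ref {
    my ($aref) = @_;
    for (@$aref) { return 1 if !$_; }
    return 0;
}

my @data = (0, (1) x 1_000_000);
first_zero_list(@data);   # pays for 1_000_001 stack copies regardless
first_zero_ref(\@data);   # passes a single reference, then exits early
```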

Next, I'd like to extend the scenario to a more realistic one. It's not about dying as fast as you can; it's about efficiently processing a (large) list while at the same time being able to terminate the processing as early as possible. Let's take a look at all in its simplest form: are all values of a list/array non-zero?

The question I should have asked in the beginning:

How can we implement an array-processing sub that is fast when the whole array has to be processed, and fast too if there is an early short circuit?

Some questions from this thread:

Let's focus on the all example. Implementations that come to mind are all of List::Util, List::Gen and a simple loop. These behave very differently in the two situations.

Here is a comparison of these implementations, along with an alternative to the map/grep/map approach.

Hopefully I got rid of the non-standard features that were mindlessly used in my OP, as noted by Danny in Re: The Virtue of Laziness.

#!/usr/bin/perl
use strict;
use warnings;
use feature 'say';

use List::Util ();
use List::Gen qw(:modify :filter);
use Benchmark 'cmpthese';

for my $first (0, 1) {
    my $list = [$first, (1) x (1_000_000)];
    say "first: $first";
    cmpthese(0, {
        lu  => sub { all_lu($list) },
        for => sub { all_for($list) },
        lg  => sub { all_gen($list) },
    });
    say '';
}

say "chaining:";
my @list = (1 .. 10_000);

# Lookahead (default) would generate one more value
local $List::Gen::LOOKAHEAD = 0;

cmpthese(0, {
    mgm => sub {
        (map {$_->[0]} grep {!($_->[1] % 10)} map {[$_, tf($_)]} @list)[0]
    },
    gfg => sub {
        (gen {$_->[0]} filter {!($_->[1] % 10)} gen {[$_, tf($_)]} \@list)->[0]
    },
});

# artificially "heavy" transformation
sub tf {
    select(undef, undef, undef, 0.000001);
    $_[0] + 1;
}

# short-circuit "all"
sub all_lu {
    List::Util::all {$_} @{shift()};
}

# short-circuit for-loop
sub all_for {
    for (@{shift()}) {
        return 0 unless $_;
    }
    1;
}

# short-circuit lazy generator
sub all_gen {
    !List::Gen(shift)->do(sub {&last(1) unless $_});
}

__DATA__
first: 0
             Rate       lu     lg    for
lu          396/s       --   -99%  -100%
lg        27574/s    6869%     --  -100%
for    10156692/s 2567064% 36734%     --

first: 1
      Rate    lg  for   lu
lg  4.62/s    -- -90% -93%
for 48.5/s  949%   -- -31%
lu  70.6/s 1427%  46%   --

chaining:
      Rate    mgm   gfg
mgm 12.5/s     -- -100%
gfg 3790/s 30320%    --

List::Util would certainly win both races if it operated on array refs.
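List::Util has no array-ref interface today, so to make the claim concrete, here is a hypothetical pure-Perl sketch of what such an interface could look like (the name all_ref and its signature are my invention, not part of List::Util):

```perl
use strict;
use warnings;

# Hypothetical: an "all" that takes a code ref and an array reference,
# so the caller never flattens the array onto the argument stack.
# (List::Util itself offers no such API; this is only an illustration.)
sub all_ref {
    my ($code, $aref) = @_;
    for (@$aref) {
        return 0 unless $code->($_);
    }
    return 1;
}

my @big = (0, (1) x 1_000_000);
# Returns immediately after the first element, and the call itself
# only passed two scalars: a code ref and an array ref.
my $ok = all_ref(sub { $_[0] }, \@big);
```

The real List::Util::all is XS and would presumably still beat this in the "first: 1" (full scan) case; the sketch only shows how the stack cost disappears.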

All in all, I don't see the need for a new feature anymore. The overhead of putting a large array onto the stack can be avoided with array references, and chainable lazy generators are available via List::Gen.

Everything would be fine if List::Gen were maintained. The latest release dates back to 2011, and its unit tests have been failing since perl v5.19. As far as I can see, changes in perl broke the tests themselves, not the tested functionality. I was able to fix all but three tests easily.

Greetings,
🐻

$gryYup$d0ylprbpriprrYpkJl2xyl~rzg??P~5lp2hyl0p$

Re^2: The Virtue of Laziness
by hv (Prior) on May 12, 2024 at 23:49 UTC

    Thanks, that looks like a much more compelling benchmark. I can well imagine Paul Evans seeing the first two benchmarks and being inspired to make List::Util::all (and related functions) competitive with for for the "first: 0" case. But hopefully not before completing the perl release that's due soon. :)

    I believe the special handling of for (@array) is switched in by the peephole optimizer; I'm not sure whether that means it would need a new op for each of the relevant List::Util functions or whether adding one just for reduce() would be sufficient.

    I'd recommend posting that benchmark (without the "chaining" section) as part of a feature request on GitHub, something like "make List::Util functions as fast as 'for (@array)'".

    The "chaining" example is another matter: I think solving that would need something a lot closer to the generalized "lazy evaluation" I mentioned before. It may be worth asking about that too, but I'd suggest doing so in a separate ticket to avoid muddying the waters.