The Virtue of Laziness

Replies are listed 'Best First'.
Re: The Virtue of Laziness by hv (Prior) on May 05, 2024 at 13:48 UTC
Perl is a stack-based language: in the general case, the arguments to some function are put on the stack, and the function knows to consume them from there, and replace them with its return value. The job of `map` is to take a list of items and replace it with another list of items, and the way the implementation works means it can mostly reuse the space of the input list. Thus the optimization in `for` when iterating over a single explicit array is less relevant for `map`. `map {die} @array` clearly isn't a very useful thing to do, do you have a more realistic example of something you'd like to do with a `map` that you feel could be made faster? Wouldn't it be nice to have some kind of "explicit lazy dereferencing" in Perl? If you want lazy evaluation in general, I think it's unlikely you'll ever get that in Perl. If you just want something specifically for mapping over a literal array, that's almost certainly doable - but it would add complexity to the perl core as well as to the language people have to learn, so the benefits would have to justify those costs. LanX wrote: I'm sceptical [p5p] have resource left for that. There is no pool of resources to be assigned, rather there is a loose collection of individuals working on things they find interesting enough to spend their time on. So having a use case that someone finds compelling is the starting point.	[reply] [d/l] [select]
Re^2: The Virtue of Laziness by jo37 (Curate) on May 05, 2024 at 14:55 UTC
...do you have a more realistic example of something you'd like to do with a map that you feel could be made faster? As LanX already proposed, it is the set of functions in `List::Util` et al. doing short circuit that could benefit from such optimization. I didn't want to bring modules into the game when I can describe the behaviour without such. Greetings, 🐻 `$gryYup$d0ylprbpriprrYpkJl2xyl~rzg??P~5lp2hyl0p$`	[reply] [d/l]
Re^2: The Virtue of Laziness by LanX (Saint) on May 05, 2024 at 14:33 UTC
> do you have a more realistic example of something you'd like to do with a map that you feel could be made faster? Well `first {} LIST` from List::Util , is a specialized grep which stops processing after success. And not needing to preallocate a huge list with 1e6 entries would be faster. Cheers Rolf _{(addicted to the Perl Programming Language :) see Wikisyntax for the Monastery}	[reply] [d/l]
Re^2: The Virtue of Laziness by LanX (Saint) on May 05, 2024 at 15:07 UTC
> If you want lazy evaluation in general, I think it's unlikely you'll ever get that in Perl. Not sure what is meant with "in general" Changing all map and grep to lazy would most certainly cause problems with old code which relies on the time the elements of the list are evaluated. One would need new commands like lmap (lazy) or igrep (iterator) to implement this. I did a POC once with iterators, problem was the performance of those iterators after the third loop level was abysmal. And there are dragons concerning evaluation timing. Not sure if the model in Haskell or Python should be considered canonic. Cheers Rolf _{(addicted to the Perl Programming Language :) see Wikisyntax for the Monastery}	[reply]
Re^2: The Virtue of Laziness by jo37 (Curate) on May 12, 2024 at 21:51 UTC
See Re: The Virtue of Laziness Greetings, 🐻 `$gryYup$d0ylprbpriprrYpkJl2xyl~rzg??P~5lp2hyl0p$`	[reply]
Re: The Virtue of Laziness by LanX (Saint) on May 04, 2024 at 22:28 UTC
Why do you think that dereferencing is the issue here? Is map faster if you use a literal array @ar? Update I doubt it. `map` expects a list as input, and those lists can be chained, like in the Schwarzian Transform `map {} grep {} map {} @list` To allow those constructs to be lazy, they would have to stop returning lists. IIRC... the reason why `foreach` can be efficient is that it has special magic if the THING inside brackets is one array (starts with @) or a range operator. In that case it'll switch to iteration by index. That's actually something the earliest versions of perl5 couldn't do. So it's not real lazyness but just an optimized edge case. I doubt that a `for ("a",@ar,"z") {}` is still that performant, because the heuristics might not detect anymore the array in between Short version This perceived lazyness depends on heuristics, which are hard to extend and normally won't bring much benefit. Cheers Rolf _{(addicted to the Perl Programming Language :) see Wikisyntax for the Monastery}	[reply] [d/l] [select]
Re^2: The Virtue of Laziness by jo37 (Curate) on May 05, 2024 at 07:14 UTC
I doubt that a `for ("a",@ar,"z") {}` is still that performant, because the heuristics might not detect anymore the array in between LanX, you are right: `cmpthese(0, { map => sub {map_die(\@arr)}, for => sub {for_die(\@arr)}, }); sub map_die ($ar) { eval {map {die} 0, @$ar}; } sub for_die ($ar) { eval { for (0, @$ar) { die; } } } __DATA__ Rate for map for 1390/s -- -1% map 1404/s 1% --` [download] Greetings, 🐻 `$gryYup$d0ylprbpriprrYpkJl2xyl~rzg??P~5lp2hyl0p$`	[reply] [d/l] [select]
Re^3: The Virtue of Laziness by LanX (Saint) on May 05, 2024 at 10:14 UTC
Well, you could ask p5p if it's possible to catch the edge cases of iterating over a following array variable `map {} @array` `grep {} @array` But, I'm sceptical they have resource left for that. Cheers Rolf _{(addicted to the Perl Programming Language :) see Wikisyntax for the Monastery}	[reply] [d/l] [select]
Re^2: The Virtue of Laziness by jo37 (Curate) on May 12, 2024 at 21:46 UTC
See Re: The Virtue of Laziness Greetings, 🐻 `$gryYup$d0ylprbpriprrYpkJl2xyl~rzg??P~5lp2hyl0p$`	[reply]
Re: The Virtue of Laziness by Danny (Chaplain) on May 05, 2024 at 19:54 UTC
It might be useful to include `use feature 'signatures';` in such examples that use signatures. I've never used them so it wasn't obvious to me what was going on.	[reply] [d/l]
Re^2: The Virtue of Laziness by jo37 (Curate) on May 06, 2024 at 08:33 UTC
Updated the original node, though this comes too late for you. Greetings, 🐻 `$gryYup$d0ylprbpriprrYpkJl2xyl~rzg??P~5lp2hyl0p$`	[reply]
Re: The Virtue of Laziness by jo37 (Curate) on May 12, 2024 at 21:43 UTC
Replying to myself for a summary, hope that's ok. Preface: I'm not trying to solve any real world problems. Please don't take these thoughts too serious. The questions asked in this thread lead me to a clearer understanding of my own demand. It is not only about "How to exit a loop as quickly as possible" as it might appear in my original post as this is actually only half of the question. Looking at subs that naturally shall "short circuit" when applied to a list: `first` `all` `none` When considering the performance of these implementations within a sub, obviously we should not blame them for the effort needed to place its arguments onto the stack. Therefore the arrays shall be passed by reference to the implementing sub. This is where the first misunderstanding stems from: The term "lazy deref" in my OP implies the need for an array ref to be efficient at all. Next, I'd like to extend the scenario to a more realistic one. It's not about dying as fast as you can. It's about efficiently process a (large) list while at the same time being able to terminate the process as quickly as possible. Let's take a look on `all` in its most simple form: are all values of a list/array non-zero? The question I should have asked in the beginning: How can we implement an array processing sub that is fast when the whole array has to be processed and* is fast too, if there is an early short circuit?* Some questions from this thread: LanX asked in Re: The Virtue of Laziness: Why do you think that dereferencing is the issue here? It is not the issue, it's my presumption. Again LanX's objection: map expects a list as input, and those lists can be chained, like in the Schwarzian Transform `map {} grep {} map {} @list` To allow those constructs to be lazy, they would have to stop returning lists. That's exactly what `List::Gen` is for. The same chain using `List::Gen` becomes lazy: `gen {} filter {} gen {} \@list` Though it is is monstrous, outdated and generally slow, it works! hv asked in Re: The Virtue of Laziness: do you have a more realistic example of something you'd like to do with a map that you feel could be made faster? Yes: `all` et. al. Let's focus on the `all` example. Implementations that come to mind are all of List::Util, List::Gen and a simple loop. These behave very different in both situations. Here is a comparison of these implementations, along with an alternative to the `map grep map` approach. Hopefully I got rid of non-standard features that where mindlessly used in my OP as noted by Danny in Re: The Virtue of Laziness. #!/usr/bin/perl use strict; use warnings; use feature 'say'; use List::Util (); use List::Gen qw(:modify :filter); use Benchmark 'cmpthese'; for my $first (0, 1) { my $list = [$first, (1) x (1_000_000)]; say "first: $first"; cmpthese(0, { lu => sub {all_lu($list)}, for => sub {all_for($list)}, lg => sub {all_gen($list)}, }); say ''; }; say "chaining:"; my @list = (1 .. 10_000); # Lookahead (default) would generate one more value local $List::Gen::LOOKAHEAD = 0; cmpthese(0, { mgm => sub { (map {$_->[0]} grep {!($_->[1] % 10)} map {[$_, tf($_)]} @ +list)[0] }, gfg => sub { (gen {$_->[0]} filter {!($_->[1] % 10)} gen {[$_, tf($_)]} + \@list)->[0] }, }); # artificially "heavy" transformation. sub tf { select(undef, undef, undef, 0.000001); $_[0] + 1; } # short-circuit "all" sub all_lu { List::Util::all {$_} @{shift()}; } # short-circuit for-loop sub all_for { for (@{shift()}) { return 0 unless $_; } 1; } # short-circuit lazy generator sub all_gen { !List::Gen(shift)->do(sub {&last(1) unless $_}); } __DATA__ first: 0 Rate lu lg for lu 396/s -- -99% -100% lg 27574/s 6869% -- -100% for 10156692/s 2567064% 36734% -- first: 1 Rate lg for lu lg 4.62/s -- -90% -93% for 48.5/s 949% -- -31% lu 70.6/s 1427% 46% -- chaining: Rate mgm gfg mgm 12.5/s -- -100% gfg 3790/s 30320% -- [download] `List::Util` would certainly win both races if it operated on array refs. All in all I don't see the need for a new feature anymore. The overhead of putting a large array onto the stack can be avoided by array references and chainable lazy generators are available via `List::Gen`. Everything would be fine, if `List::Gen` would be maintained. The latest release dates back to 2011 and unit tests are failing since perl v5.19. As far as I can see, changes in perl lead to the failing tests themselves, not the tested functionality. I was able to fix all but three tests easily. Greetings, 🐻 `$gryYup$d0ylprbpriprrYpkJl2xyl~rzg??P~5lp2hyl0p$`	[reply] [d/l] [select]
Re^2: The Virtue of Laziness by hv (Prior) on May 12, 2024 at 23:49 UTC
Thanks, that looks like a much more compelling benchmark. I can well imagine Paul Evans seeing the first two benchmarks and being inspired to make `List::Util::all` (and related functions) competitive with `for` for the "first: 0" case. But hopefully not before completing the perl release that's due soon. :) I believe the special handling of `for (@array)` is switched in by the peephole optimizer; I'm not sure whether that means it would need a new op for each of the relevant `List::Util` functions or whether adding one just for `reduce()` would be sufficient. I'd recommend posting that benchmark (without the "chaining" section) as part of a feature request something like "make List::Util functions as fast as 'for (@array)'" in github. The "chaining" example is another matter: I think solving that would need something a lot closer to the generalized "lazy evaluation" I mentioned before. It may be worth asking about that too, but I'd suggest doing so in a separate ticket to avoid muddying the waters.	[reply] [d/l] [select]

Update

Short version