nudge has asked for the wisdom of the Perl Monks concerning the following question:

Dear Brethren,

Let us imagine for a moment that we wished to find the number of even numbers in the range 1 .. 1_000_000_000, then following code should work...

my $n = grep { $_ % 2 == 0 } (1 .. 1_000_000_000);

However my machine reflects upon this and responds thusly:

panic: realloc at ./inf line 6.

While perl is correctly lazy in it's evaluation of a ranage, it get very industrious when you wish to pass it to grep, map and friends.

While there are a number of CPAN modules that work around this with there own lazy versions of map and grep, does anyone know why perl's own functions have never been changed to the appropriately slothful?

NB: The question is not how, that is trivial. The question is why is perl still the way it is?

--
Fools ignore complexity. Pragmatists suffer it.
Some can avoid it. Geniuses remove it.

Replies are listed 'Best First'.
Re: Making perl's map and grep more large list friendly
by ikegami (Patriarch) on Jun 20, 2010 at 08:00 UTC

    While perl is correctly lazy in it's evaluation of a ranage, it get very industrious when you wish to pass it to grep, map and friends.

    Completely backwards. Ranges aren't lazy. There's nothing special about how grep, map and friends interact with ranges.

    does anyone know why perl's own functions have never been changed to the appropriately slothful?

    The parser could replace grep BLOCK RANGE with something equivalent that doesn't involve a range. This is what it does for for (RANGE).

    I'm guessing it's simply a question that noone has taken the time to do it.

Re: Making perl's map and grep more large list friendly
by JavaFan (Canon) on Jun 20, 2010 at 13:41 UTC
    While perl is correctly lazy in it's evaluation of a ranage

    It's not. There's one special case though, and that's if the list of a for is just a range. This optimization was introduced in 5.005.

    The question is why is perl still the way it is?
    Because you haven't written a patch for it yet.

    That's generally the way how Perl has been evolved for the last decade, or even longer. There isn't a commission that sits down and says "hmmm, what nice things for our users shall we do for this release?". Instead, people find things bothersome. Or find features lacking. If it itches them enough, they write a patch.

    So, let's bounce back the question. Why haven't you written a patch to optimize this yet? Then you have the answer.

      Because you haven't written a patch for it yet.

      At my age one learns to look before one leaps. It may be that it is an intentional design decision, not simply an oversight. I would seem foolish to me to pour a significant volume of time rooting around in a perl's monstrous code base finding the bits to patch, patching them, only then to be told "no, no, we cant do that because of this".

      PS: After spending most of last night rooting around in perl's intestinal tract, seem no closer to tacking down where map and grep are implemented. Anyone know off the top of their shaved heads?.

        I would seem foolish to me to pour a significant volume of time rooting around in a perl's monstrous code base finding the bits to patch, patching them, only then to be told "no, no, we cant do that because of this".

        You could announce your intentions to the committers on the p5p mailing list or on their IRC channel.

        no closer to tacking down where map and grep are implemented. Anyone know off the top of their shaved heads?.

        $ perl -MO=Concise -e'grep { f() } $x..$y;' c <@> leave[1 ref] vKP/REFC ->(end) 1 <0> enter ->2 2 <;> nextstate(main 2 -e:1) v:{ ->3 8 <|> grepwhile(other->9)[t7] vK ->c 7 <@> grepstart K* ->8 3 <0> pushmark s ->4 - <1> null lK/1 ->4 - <1> null sK/1 ->8 - <@> scope sK ->8 - <0> ex-nextstate v ->9 b <1> entersub[t2] sKS/TARG,1 ->- - <1> ex-list sK ->b 9 <0> pushmark s ->a - <1> ex-rv2cv sK ->- a <#> gv[*f] s/EARLYCV ->b - <1> null lKM/1 ->7 6 <1> flop lKM ->7 e <1> flip[t6] lK ->7 4 <|> range(other->5)[t5] lK/1 ->d - <1> ex-rv2sv sK/1 ->e d <#> gvsv[*x] s ->e - <1> ex-rv2sv sK/1 ->6 5 <#> gvsv[*y] s ->6 -e syntax OK

        So pp_grepstart and pp_grepwhile (in pp*.c) for grep. Similarly, map is pp_mapstart and pp_mapwhile.

        Keep in mind that the list is already flattened before grep and map see it. First, you'd need to create new ops or use the "S"tacked flag like Perl does for for. (It doesn't appear to be used by the relevant ops.)

        $ perl -MO=Concise -e'1 for $x..$y;' ... 8 <{> enteriter(next->9 last->c redo->9) lKS/8 ->a ... ^ | Stacked: Is a counting loop $ perl -MO=Concise -e'1 for (),$x..$y;' ... 9 <{> enteriter(next->a last->d redo->a) lK/8 ->b ... ^ | Not Stacked: Is a foreach loop

        Then, you'd need to alter the compiler to produce the right opcode tree when a range is provided.

        Update: s/Special/Stacked/

        At my age one learns to look before one leaps. It may be that it is an intentional design decision, not simply an oversight.
        It's neither an intentional design decision, nor an oversight. In Perl, in general, arguments are evaluated before the sub (or op) is called. So, it's logical that the list is expanded first. It's only later that the case of foreach (x .. y) (but not foreach (x .. y, z) or foreach (z, x .. y)) got optimized as that was a common case, and a lot of memory (and hence, time) could be saved. My guess is that the author who implemented this optimization didn't think the case for grep or map was as pressing - one would expect the result (in list context) of the grep and map to be long lists as well.

        Note that I wasn't trying to make a snide remark. People having itches is the main driving force behind Perl.

        "no, no, we cant do that because of this".
        If your patch doesn't break backwards compatability, and doesn't punish (slowdown, more memory usage, etc) code that doesn't benefit from your patch, there's a good chance it gets accepted. Even if it does contain bugs, isn't perfectly written or causes a problem on some platform you don't have access to, there's a good chance someone else cleans up your effort -- even if it takes a while. Contributions, even if not perfect, are welcomed.
Re: Making perl's map and grep more large list friendly
by kejohm (Hermit) on Jun 20, 2010 at 07:47 UTC

    The best way I can think of would be to use an ordinary loop construct. Behold:

    my $n; foreach(1 .. 1_000_000_000){ $n++ if $_ % 2 == 0; }

    There are probably others ways (this is Perl after all :).

    Update: Sorry, misread the question.