Roy Johnson has asked for the wisdom of the Perl Monks concerning the following question:

If I had answers, this might make a good meditation. Since I have mostly questions, it's a SoPW.

It has become fairly popular to chain calls to map, kind of like piping commands together. When I see those constructs, I wonder whether Perl is doing anything clever to avoid unnecessary copying, or if each call to map really does construct a new list, which the next map goes through to construct yet another list. How much memory overhead are we talking about?

After some pondering, it occurred to me that a construct like

map { BLOCK2; (return_list2) } map { BLOCK1; (return_list1) } @input;
is functionally equivalent to
map { BLOCK1; local $_ = $_; map { BLOCK2; (return_list2) } (return_list1) } @input
but the BLOCK2 map is called once per return_list1, rather than once on a list that is @input times as big. The actual BLOCK2 code gets executed the same number of times either way. (A concrete sketch of the two forms follows the questions.) So the questions are:
  1. Does perl do anything clever regarding memory management in chained maps?
  2. Is my equivalence correct? Dangerous? Beneficial?
  3. If beneficial, could perl optimize chained maps into nested maps internally?
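
To make the equivalence concrete, here is a minimal runnable sketch (the doubling and scaling blocks are my own illustration, not part of the question itself):

my @input = (1 .. 5);

# Chained: the inner map builds its whole output list before the
# outer map sees any of it.
my @chained = map { $_ * 10 } map { ($_, $_ + 1) } @input;

# Nested: each two-element list from BLOCK1 is handed to BLOCK2
# immediately, so no full-size intermediate list exists at once.
my @nested = map { my $x = $_; map { $_ * 10 } ($x, $x + 1) } @input;

# Both produce (10, 20, 20, 30, 30, 40, 40, 50, 50, 60).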

We're not really tightening our belts, it just feels that way because we're getting fatter.

Re: map: chaining vs. nesting
by diotalevi (Canon) on Jun 18, 2004 at 14:10 UTC

    1. Nothing specific, beyond a general recognition that some copies can be avoided. A map's result is pushed onto the stack (assuming list context) for the subsequent op to do something with. It is a copy, but it is marked as something that can be recycled. Stuff that recognizes that (like map itself) can avoid making extra copies. I think @array = map( EXPR, map( EXPR, LIST ) ) copies the values from LIST once and then, as far as possible, re-uses those copies for the next map, which hands them to the aassign op; that op also notices that the input is recyclable and steals it instead of making another copy. (A quick way to benchmark the two forms is sketched after these answers.)

    2. Not terribly useful. I use chained maps where the output from one is not the same size as its input; the subsequent maps deal with these entirely different lists. I suppose that work could be moved into a single nested map, but that just makes the code harder to read.

    3. See #1. Something of this fashion already happens, in a more general sense.
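
    To check #1 empirically on your own perl, something like the following would do. This is only a sketch with an illustrative workload (not from the thread), and cmpthese measures CPU time rather than memory, so the space question needs a separate measurement.

    use strict;
    use warnings;
    use Benchmark qw(cmpthese);

    my @input = (1 .. 10_000);

    cmpthese(-2, {
        # Chained: the inner map builds its full output list first.
        chained => sub {
            my @out = map { $_ * 10 } map { ($_, $_ + 1) } @input;
        },
        # Nested: each small inner list is consumed as soon as it is made.
        nested => sub {
            my @out = map { my $x = $_; map { $_ * 10 } ($x, $x + 1) } @input;
        },
    });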

      Regarding #2: The fact that map output size differs from input size doesn't mean there's no benefit to nesting vs. chaining. The benefit I expected was in memory overhead (from not having so many copies resident at once) more than in reduction of operations (from copying multiple times). I agree with your point about nesting being harder to read; that's why I suggested it as an optimization.

      Putting the idea in better terms, I'm suggesting that map, in an ideal world where perl development was effortless, should behave like (1..100000): when its output is assigned, the range generates the whole list; but when its output is used iteratively (at least, in a foreach loop), it simply generates one output value at a time, as needed. map, grep, and ranges could all behave that way when feeding map, grep, or foreach.
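
      perl5 won't do this for map natively, but a closure can fake the behaviour. A minimal hand-rolled sketch (the names lazy_range and lazy_map are my own invention, not an existing API):

      sub lazy_range {
          my ($from, $to) = @_;
          return sub { $from <= $to ? $from++ : undef };
      }

      sub lazy_map {
          my ($code, $iter) = @_;
          my @buffer;    # holds extra values when $code returns a list
          return sub {
              until (@buffer) {
                  defined(my $x = $iter->()) or return;
                  local $_ = $x;
                  push @buffer, $code->();
              }
              return shift @buffer;
          };
      }

      # Only one value is resident at a time; no 100000-element list is built.
      my $it = lazy_map(sub { $_ * $_ }, lazy_range(1, 100_000));
      while (defined(my $v = $it->())) {
          last if $v > 100;    # stop early; the rest is never generated
      }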


      We're not really tightening our belts, it just feels that way because we're getting fatter.
        Lazy generation is a feature you can specify in perl6. Not yet so for perl5.
Re: map: chaining vs. nesting
by hardburn (Abbot) on Jun 18, 2004 at 14:30 UTC

    When I see those constructs, I wonder whether Perl is doing anything clever to avoid unnecessary copying

    No, a list is generated at the end of each map. So the Schwartzian Transform (ST), for example:

    my @sorted_data = map  { $_->[1] }
                      sort { $a->[0] <=> $b->[0] }
                      map  { [ do_something($_), $_ ] }
                      @unsorted_data;

    takes about three times as much memory as @unsorted_data originally used. It also happens to be efficient, because the sort step compares precomputed keys instead of calling do_something() once per comparison. CPU/memory tradeoffs generally favor the CPU--in fact, Perl itself is one gigantic CPU/memory tradeoff.

    Further meditation: LISP and other old interpreted languages were not well received when they were still young. Their detractors claimed they took too much computing power to handle the virtual machines they required. Today, a lot of very popular languages run under VMs (Perl, Java, Python, C#/.NET, etc.), and often run without significantly cutting into the resources available on a modern system. So the old reasoning should be considered moot.

    However, there is a second reason why LISP and other functional languages are called inefficient: the memory required to handle the massively-chained operations that functional programming styles encourage. With significant amounts of data, these can easily overwhelm the RAM of even a modern system.

    ----
    send money to your kernel via the boot loader.. This and more wisdom available from Markov Hardburn.

      Not quite. There's a check in map to see whether the input is a TEMP, and if so it is re-used for the output. The list assignment operator also notices that the input is stealable and steals the scalars away to be the kept-for-real copies in @sorted_data. So only one copy is made.

      I'd have written this one as a GRT, or with tye's fast, flexible, stable sort, instead.
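
      For reference, a GRT along those lines might look like this. It is only a sketch, assuming do_something() returns a non-negative integer key that fits in 32 bits and that @unsorted_data holds plain strings:

      # Pack the key onto the front of each string so that plain string
      # comparison sorts correctly, then strip the key back off.
      my @sorted_data = map  { substr $_, 4 }
                        sort
                        map  { pack('N', do_something($_)) . $_ }
                        @unsorted_data;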

      They are only inefficient to a point. If, in Lisp|Perl|C|whatever, you write a recursive function that runs a few thousand times, then you're going to suck up some RAM and the stack-push/pop CPU time. LISP, Scheme, ML, and their compadres are pretty well designed to minimize this. But, if you write a tail-recursive function, you're fine; overhead is exactly the same as for iteration.

      So it's all about how you implement your algorithms. Any language can be inefficient - m/(a|b+)*/ being a simple example I can remember being horrible in Perl, thanks to catastrophic backtracking. Besides, though most style guides for functional languages don't encourage it, they almost all have iteration constructs, even if they're not popular. I know LISP does; I think Scheme and ML do.

        Indeed. I spent a year working rather intensively on both ML and Haskell (which are structurally similar but have some crucial differences) and one of the first things I learned is that while it's tempting to implement a factorial like this:

        fact 0 = 1
        fact 1 = 1
        fact n | n < 0     = error "Negative factorial!"
               | otherwise = n * fact (n - 1)

        your life will be vastly improved if you implement it like this:

        fact 0 = 1
        fact 1 = 1
        fact n | n < 0     = error "Negative factorial!"
               | otherwise = g 1 n
          where g x 1  = x
                g x n' = g (x * n') (n' - 1)

        because the latter uses tail recursion and the former doesn't.

        If, in Lisp|Perl|C|whatever, you write a recursive function that runs a few thousand times, then you're going to suck up some RAM and the stack-push/pop CPU time. LISP, Scheme, ML, and their compadres are pretty well designed to minimize this. But, if you write a tail-recursive function, you're fine; overhead is exactly the same as for iteration.

        Tail recursion can be converted directly into iteration; this is part of the Scheme standard, I believe, and most implementations of other functional languages support it. If you're unlucky enough to have an implementation that doesn't, though, tail recursion will still suck up stack space for function calls.
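
        perl5 is one such implementation: it does not eliminate tail calls on its own, but goto &sub re-uses the current call frame, so you can get the constant-stack behaviour by hand. A sketch, not from the thread:

        # Accumulator-passing factorial; the goto makes each "call"
        # replace the current frame instead of pushing a new one.
        sub fact {
            my ($n, $acc) = @_;
            $acc = 1 unless defined $acc;
            return $acc if $n <= 1;
            @_ = ($n - 1, $n * $acc);
            goto &fact;
        }

        print fact(10), "\n";    # prints 3628800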

        --
        F o x t r o t U n i f o r m
        Found a typo in this node? /msg me
        % man 3 strfry

Re: map: chaining vs. nesting
by ihb (Deacon) on Jun 19, 2004 at 20:37 UTC

    This node may or may not be of interest in regard to your overhead question.

    ihb