Memory efficiency, anonymous vs named's vs local subroutines

thanos1983 has asked for the wisdom of the Perl Monks concerning the following question:

Re: Memory efficiency, anonymous vs named's vs local subroutines by GrandFather (Saint) on Jul 18, 2015 at 00:42 UTC
The memory efficiency you should be concerned about is "how can I best remember or figure out how this code works". When computers, even the one in your pocket that you call a phone, have multiple gigabytes of memory, shaving a few bytes here and there is almost always a waste of time. Consider instead how easy it is to understand the intent of the code, how robust the code is against programming errors, and how easy it will be to maintain the code in the future. In all of those cases a named function is a big win because the name can (and should) convey intent.

I'm unsure what you mean by a "local subroutine", as all named subroutines in Perl are scoped within the current package. Nesting subroutines makes no difference to their availability to other code.

Premature optimization is the root of all job security
Re^2: Memory efficiency, anonymous vs named's vs local subroutines by thanos1983 (Parson) on Jul 18, 2015 at 18:52 UTC
Hello GrandFather,

First of all, thank you for your time and effort in reading and replying to my question. You are right, I am always getting into these stupid details. I guess I was more curious to find out which kind of function to use when. Who cares about a few bytes? The important part, as you said, is first of all to be able to understand what the code does, and then all the rest. Thanks, I guess I will go with named routines, simple as that.

I call it a "local subroutine" because the author of the tutorial Creating Nested Functions called it that.

Seeking for Perl wisdom...on the process of learning...not there...yet!
Re: Memory efficiency, anonymous vs named's vs local subroutines by AnomalousMonk (Archbishop) on Jul 18, 2015 at 02:01 UTC
Further to GrandFather's reply: Not only is it difficult to figure out how such code works, even then it probably doesn't work the way you (thanos1983) think it does. Consider the "`Variable "$x" will not stay shared at ...`" warning(s) you get when you run such code (you are running your code with warnings enabled, right?):

```
Variable "%s" will not stay shared
    (W closure) An inner (nested) named subroutine is referencing a
    lexical variable defined in an outer named subroutine.

    When the inner subroutine is called, it will see the value of the
    outer subroutine's variable as it was before and during the first
    call to the outer subroutine; in this case, after the first call to
    the outer subroutine is complete, the inner and outer subroutines
    will no longer share a common value for the variable. In other
    words, the variable will no longer be shared.

    This problem can usually be solved by making the inner subroutine
    anonymous, using the "sub {}" syntax. When inner anonymous subs that
    reference variables in outer subroutines are created, they are
    automatically rebound to the current values of such variables.
```

(See perldiag.) Anonymous subroutines allow proper closures to be formed and maintained.

WRT `local`-ized subroutine names (or anything else, for that matter), remember that a localized thingy is visible within the scope of all subroutines subsequently invoked within the localizing scope. I.e., its localization is dynamic. A lexical variable (e.g., one holding a code reference) is visible only within the scope of the block in which it is defined, i.e., its scope is, well, lexical. To me, lexical scope is much to be preferred over dynamic scope unless you have a very clear (and well documented) reason for choosing the latter.

As to the difference in memory usage between `local`-ized and lexical subroutines: I must admit I've done no research or experimentation, but I doubt there is any significant difference.

Give a man a fish: `<%-(-(-(-<`
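A minimal sketch of the behaviour that warning describes, assuming a plain perl 5 run with warnings enabled. The sub names `outer_named`, `inner_named` and `outer_anon` are invented for illustration and do not come from the thread:

```perl
use strict;
use warnings;

# Named inner sub: it is compiled once and bound to the $x of the FIRST
# call to outer_named(). Perl emits the warning
#   Variable "$x" will not stay shared
# for this code at compile time.
sub outer_named {
    my $x = shift;
    sub inner_named { return $x }
    return inner_named();
}

# Anonymous inner sub: a fresh closure over the current $x is created
# every time outer_anon() runs.
sub outer_anon {
    my $x = shift;
    my $inner = sub { return $x };
    return $inner->();
}

my @named = map { outer_named($_) } 1 .. 3;
my @anon  = map { outer_anon($_) } 1 .. 3;

print "named: @named\n";    # named: 1 1 1
print "anon:  @anon\n";     # anon:  1 2 3
```

The named version keeps returning the value from the first call, which is exactly the "no longer shared" situation perldiag describes; the anonymous version tracks the current value.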
Re^2: Memory efficiency, anonymous vs named's vs local subroutines by thanos1983 (Parson) on Jul 18, 2015 at 19:05 UTC
Hello AnomalousMonk,

Thank you for your time and effort. You are right: no matter which tutorial or article I read, they all say the same thing. When you nest functions, the chance of messing up your code is really high. You assume that your code does one thing, but in reality it does ten others in the background. I will stick to simple named functions, called from the main script, for every simple implementation.
Re^3: Memory efficiency, anonymous vs named's vs local subroutines by AnomalousMonk (Archbishop) on Jul 18, 2015 at 21:14 UTC
"I will stick in simple named functions..."

Simplicity is always a good goal and a good yardstick for judging your code. But I don't want to discourage you from using anonymous lexical subroutines: they work, and they work the way you think they work! In fact, Dominus wrote a whole book (freely available here, and highly recommended!) that's essentially just a zillion ways to use anonymous lexical subroutines.
Re^3: Memory efficiency, anonymous vs named's vs local subroutines by Laurent_R (Canon) on Jul 19, 2015 at 10:05 UTC
"I will stick in simple named functions"

For usual plain-vanilla subroutines, e.g. pieces of code that you want to use several times and/or call from different places in your program, and that take zero, one or more input parameters and return one or more values, named subroutines are certainly simpler, easier to understand and easier to maintain. That's what I would use in such cases.

There are, however, some slightly more advanced techniques that use anonymous subs and give you a lot of power, just as references to anonymous arrays and hashes allow you to build nested data structures (arrays of arrays, arrays of hashes, etc.) that would be very tedious or in some cases next to impossible to build with named arrays or named hashes.

I would also recommend to every reader the Higher Order Perl book by Mark Jason Dominus, mentioned above by AnomalousMonk; in my view it is the best CS book I've read in the last ten years. It is available online on the author's site. Just one word of warning, though: it is not for pure beginners. You need at least an intermediate level of Perl to really take advantage of it (although the first 2 or 3 chapters are probably accessible to "advanced beginners").

Among the things that anonymous functions make possible or, at least, usually make much easier or more generic:

- Callback functions (subs passed as a parameter to another sub)
- Dispatch tables (arrays or hashes of code refs that define program behavior depending on a certain parameter)
- Function generators and function factories (functions that return other functions depending on the input parameters)
- Iterators (functions returning values, usually one at a time, on demand)
- Closures (functions that keep part of their run-time environment alive)
- More generally, higher-order functions and abstract generic functions, and quite a few other things that I can't explain in just one line here.

In very strict terms, probably none of the techniques above absolutely requires anonymous subs, but only using anonymous subs will really unleash their full power.
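To make a few of the items in that list concrete, here is a small sketch. Everything in it (`%dispatch`, `make_adder`, `make_counter`, `apply_to_each`) is a hypothetical name chosen for illustration, not code from the book or from this thread:

```perl
use strict;
use warnings;

# Dispatch table: a hash of code refs keyed by operation name.
my %dispatch = (
    add => sub { $_[0] + $_[1] },
    mul => sub { $_[0] * $_[1] },
);

# Function factory: each call returns a new closure over its own $n.
sub make_adder {
    my $n = shift;
    return sub { return $_[0] + $n };
}

# Iterator: a closure that hands out one value per call, then undef.
sub make_counter {
    my ($from, $to) = @_;
    return sub { return $from <= $to ? $from++ : undef };
}

# Callback: apply_to_each() receives a code ref as its first argument.
sub apply_to_each {
    my ($callback, @items) = @_;
    return map { $callback->($_) } @items;
}

my $sum      = $dispatch{add}->(2, 3);
my $product  = $dispatch{mul}->(2, 3);
my @plus_ten = apply_to_each(make_adder(10), 1, 2, 3);

my $next = make_counter(1, 3);
my @seen;
while (defined(my $value = $next->())) {
    push @seen, $value;
}

print "$sum $product | @plus_ten | @seen\n";    # 5 6 | 11 12 13 | 1 2 3
```

Each of these could be written with named subs and `\&name`, but the factory and the iterator rely on a new closure being created per call, which is where anonymous subs earn their keep.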
Re: Memory efficiency, anonymous vs named's vs local subroutines (anon < named ) by BrowserUk (Patriarch) on Jul 18, 2015 at 11:19 UTC
Anonymous subroutines use substantially less memory than named subroutines.

For the following simple subs (where nnnnnn is a number between 0 .. 1e6):

```perl
sub Fnnnnnn {
    my( $a, $b, $c ) = @_;
    my $x = nnnnnn;
    return $a * $b - $c;
}

$f[ nnnnnn ] = sub {
    my( $a, $b, $c ) = @_;
    my $x = nnnnnn;
    return $a * $b - $c;
};
```

The anonymous sub uses ~3k per sub whereas the named version uses ~4.5k per sub (64-bit perl 5.18).

Update: specious conclusion removed. (The eval was failing silently and assigning undef to the array.)

With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday' Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error. "Science is about questioning the status quo. Questioning authority". In the absence of evidence, opinion is indistinguishable from prejudice. I'm with torvalds on this Agile (and TDD) debunked I told'em LLVM was the way to go. But did they listen!
Re^2: Memory efficiency, anonymous vs named's vs local subroutines (anon < named ) by thanos1983 (Parson) on Jul 18, 2015 at 19:38 UTC
Hello BrowserUk,

Thank you for your time and effort in reading and replying to my question. Nice, thanks, that is the information I was looking for. Still, I guess I will stay with simple named functions in my scripts from now on, unless it is necessary to work differently (a rare case).
Re^2: Memory efficiency, anonymous vs named's vs local subroutines (anon < named ) by locked_user sundialsvc4 (Abbot) on Jul 18, 2015 at 13:58 UTC
I’m sorry, BrowserUK, but I really don’t quite understand the last sentence in that post. There is no `eval` statement in what remains, and therefore I really don’t know if you meant to retract the assertion that you made in the first sentence of your post. I also don’t quite see the point of your code-sample since it does not appear (as it stands now) to be either complete or runnable. Ergo, “huh?” I implicitly assume, always, that “you know whereof you speak,” and especially that you ordinarily work in resource-intensive applications, but I do not grok what you are trying to say here. Maybe it’s the consequence of previous-versions and edits that I never saw? Please clarify, preferably in a reply to, vs. a rewrite of, the above post. Thank you.
Re^3: Memory efficiency, anonymous vs named's vs local subroutines (anon < named ) by Anonymous Monk on Jul 18, 2015 at 14:06 UTC
You don't even mark updates to your own nodes (example), so you don't get to complain to others.
Re: Memory efficiency, anonymous vs named's vs local subroutines by Laurent_R (Canon) on Jul 18, 2015 at 09:51 UTC
With today's computers, I do not see any reason to worry about the few bytes or even a few kB that you are going to save with one way of creating your subs compared to another. I think your main aim should be to make your code clear to read, easy to understand and easy to maintain; don't worry about a few kilobytes of memory footprint.

If it came to processing a huge file (say 10 GB, or perhaps even 1 GB), then it would really make sense to think, before choosing the algorithm, about whether it will store the whole file in memory or whether it will gently iterate over lines or chunks of the file. That's a totally different context, however. But when choosing between the various ways of implementing your subs (named subs, code refs or anonymous subs, closures, class methods, etc.), don't worry about memory usage; it is really irrelevant in most cases.

One slight warning, though: watch out for possible memory leaks and similar problems in your implementation (circular references, perhaps deep recursion, etc.), especially if your program is going to do really a lot of work or run for a fairly long time. And even then, a memory leak is not necessarily dramatic if you can ascertain that it will be only a few kilobytes and are sure that the program will always complete quickly enough that the leak never becomes a problem.
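As an illustration of the circular-reference leak mentioned above, here is a hedged sketch. The `Node` class and the `%destroyed` bookkeeping hash are made up for this example, and `weaken` from the core module Scalar::Util is shown as one commonly used remedy, not as something prescribed in this thread:

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken);

my %destroyed;    # records which objects have actually been freed

package Node;     # hypothetical class, for illustration only
sub new     { my ($class, $name) = @_; return bless { name => $name }, $class }
sub DESTROY { $destroyed{ $_[0]{name} } = 1 }

package main;

{
    # Cycle: parent -> child -> parent. Neither reference count can
    # drop to zero, so both objects survive the end of this block.
    my $parent = Node->new('leaky_parent');
    my $child  = Node->new('leaky_child');
    $parent->{child} = $child;
    $child->{parent} = $parent;
}

{
    # Same structure, but the back-reference is weakened, so it no
    # longer keeps the parent alive.
    my $parent = Node->new('ok_parent');
    my $child  = Node->new('ok_child');
    $parent->{child} = $child;
    $child->{parent} = $parent;
    weaken( $child->{parent} );
}

print "leaky pair freed: ", ( $destroyed{leaky_parent} ? "yes" : "no" ), "\n";    # no
print "weak pair freed:  ", ( $destroyed{ok_parent}    ? "yes" : "no" ), "\n";    # yes
```

In a short-lived script the leaked pair is simply reclaimed when the process exits, which matches the point that a small leak is not necessarily dramatic.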
Re^2: Memory efficiency, anonymous vs named's vs local subroutines by thanos1983 (Parson) on Jul 18, 2015 at 19:33 UTC
Hello Laurent_R,

Thank you for your time and effort in reading and replying to my question. You are right, maintenance comes top of the list. A few bytes or even kilobytes will not make any difference with the resources that we have today. Memory leaks are something that I should be aware of and careful about. My way forward will be to use named functions, which are easy to understand and to implement: small pieces of code that build up into something bigger.
Re: Memory efficiency, anonymous vs named's vs local subroutines by shmem (Chancellor) on Jul 18, 2015 at 17:22 UTC
Anonymous subs are cheaper in terms of usage (invoking a named sub includes a round-trip to the symbol table) and memory.

Named subs:

```perl
use strict; use warnings;
my $begin;
BEGIN { chop( $begin = `ps -o vsz= $$` ) }
eval "sub F_$_ { my( \$a, \$b, \$c ) = \@_; my \$x = $_; return \$a * \$b - (\$x + \$c); }" for 1..1e4;
END {
    chop( my $end = `ps -o vsz= $$` );
    print "$end - $begin = ", $end - $begin, "\n";
}
__END__
60924 - 22228 = 38696
```

Anonymous subs:

```perl
use strict; use warnings;
my $begin;
BEGIN { chop( $begin = `ps -o vsz= $$` ) }
my @ary;
$ary[$_] = eval "sub { my( \$a, \$b, \$c ) = \@_; my \$x = $_; return \$a * \$b - (\$x + \$c); }" for 1..1e4;
END {
    chop( my $end = `ps -o vsz= $$` );
    print "$end - $begin = ", $end - $begin, "\n";
}
__END__
53356 - 22228 = 31128
```

That makes ca. 3.1kB per anonsub, 3.9kB per named sub. Slightly different numbers than BrowserUk's above, but also a slightly different architecture and version of perl:

perl 5, version 14, subversion 2 (v5.14.2) built for x86_64-linux-gnu-thread-multi

Update: The ca. 750 extra bytes for named subs may well be just the cost of allocating a GLOB for the subroutine in the symbol table, which comes with SCALAR, HASH, ARRAY, CODE and FILEHANDLE slots. Currently I don't recall whether they are autovivified as needed or allocated in one go.

perl -le'print map{pack c,($-++?1:13)+ord}split//,ESEL'
Re^2: Memory efficiency, anonymous vs named's vs local subroutines by BrowserUk (Patriarch) on Jul 18, 2015 at 18:14 UTC
"The ca. 750 extra bytes for named subs may well be just the cost of allocating a GLOB for the subroutine in the symbol table,"

Running the code below to create a million STASH aliases to a single sub:

```perl
sub F123456{ my( $a, $b, $c )= @_; my $x = 123456; return $a * $b - $c; };;
*{"X$_"} = \&F123456 for 0 .. 1e6;;
```

results in an increase in the process size of almost exactly 300MB, thus 300 bytes per STASH entry.

Some of the extra space may be down to unreused (but reusable) space allocated and freed during the doubling of the STASH hash as it grows. I can't think of any way to isolate that.
Re^3: Memory efficiency, anonymous vs named's vs local subroutines by shmem (Chancellor) on Jul 19, 2015 at 07:21 UTC
"Some of the extra space may be down to unreused (but reusable) space allocated and freed during the doubling of the STASH hash as it grows."

Devel::Peek shows that stash entries created by coderef assignment to a GLOB are missing the MAGIC part of a GV:

```perl
use Devel::Peek;

sub foo { print "in foo\n"; }
*bar = \&foo;
print "----\nfoo:\n----\n";
Dump(*foo);
print "----\nbar\n----\n";
Dump(*bar);
__END__
----
foo:
----
SV = PVGV(0x1765790) at 0x1731550
  REFCNT = 3
  FLAGS = (RMG,MULTI,IN_PAD)
  MAGIC = 0x1732e80
    MG_VIRTUAL = &PL_vtbl_backref
    MG_TYPE = PERL_MAGIC_backref(<)
    MG_OBJ = 0x1725238
  NAME = "foo"
  NAMELEN = 3
  GvSTASH = 0x1705800 "main"
  GP = 0x1732870
    SV = 0x0
    REFCNT = 1
    IO = 0x0
    FORM = 0x0
    AV = 0x0
    HV = 0x0
    CV = 0x1725238
    CVGEN = 0x0
    LINE = 4
    FILE = "f.pl"
    FLAGS = 0xa
    EGV = 0x1731550 "foo"
----
bar
----
SV = PVGV(0x17657c0) at 0x1725208
  REFCNT = 3
  FLAGS = (MULTI,ASSUMECV,IN_PAD)
  NAME = "bar"
  NAMELEN = 3
  GvSTASH = 0x1705800 "main"
  GP = 0x1759a40
    SV = 0x0
    REFCNT = 1
    IO = 0x0
    FORM = 0x0
    AV = 0x0
    HV = 0x0
    CV = 0x1725238
    CVGEN = 0x0
    LINE = 5
    FILE = "f.pl"
    FLAGS = 0xe
    EGV = 0x1725208 "bar"
```

I don't think the sizes reported by Devel::Size are accurate, since the stash entry created by coderef assignment is reported as being bigger. Go figure.

```perl
use Devel::Size qw(size total_size);

sub foo { print "in foo\n"; }
print "size foo: ",size(*foo),"\n";
print "total_size foo: ",total_size(*foo),"\n";
print "size bar: ",size(*bar),"\n";
print "total_size bar: ",total_size(*bar),"\n";
print "difference: ",size(*foo) - size(*bar),"\n";
__END__
size foo: 7034
total_size foo: 7034
size bar: 7186
total_size bar: 7186
difference: -152
```
Re: Memory efficiency, anonymous vs named's vs local subroutines by Anonymous Monk on Jul 18, 2015 at 13:10 UTC
Just to demonstrate the effect of dynamic scoping that AnomalousMonk already described:

```perl
sub foo {
    local *bar = sub {
        print "other bar\n";
    };
    print "foo calling quz\n";
    quz();
}
sub bar {
    print "bar\n";
}
sub quz {
    print "quz calling bar\n";
    bar();
}
foo();
quz();
__END__
Subroutine main::bar redefined at - line 4.
foo calling quz
quz calling bar
other bar
quz calling bar
bar
```

Despite the warning about the subroutine being redefined, this can cause some spooky action-at-a-distance effects: How do you know which `bar()` you'll be calling (as demonstrated by `sub quz`)?
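For contrast, a sketch of the lexical alternative that AnomalousMonk preferred: the replacement routine travels as an explicit argument instead of being swapped in behind the caller's back. The optional `$callee` parameter and the choice to return strings rather than print them are assumptions made for this example, not part of the demonstration above:

```perl
use strict;
use warnings;

sub bar { return "bar" }

# quz() is told which routine to call; it falls back to \&bar.
sub quz {
    my ($callee) = @_;
    $callee ||= \&bar;
    return "quz calling " . $callee->();
}

sub foo {
    # Lexical code ref: visible only inside foo() and to whoever it is
    # explicitly handed to.
    my $other_bar = sub { return "other bar" };
    return quz($other_bar);
}

my $via_foo = foo();
my $direct  = quz();

print "$via_foo\n";    # quz calling other bar
print "$direct\n";     # quz calling bar
```

Here a reader of `quz` can see from its signature that the callee may vary, and nothing called outside `foo` is affected.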
Re^2: Memory efficiency, anonymous vs named's vs local subroutines by thanos1983 (Parson) on Jul 18, 2015 at 19:42 UTC
Hello Anonymous,

Thank you for the demonstration, it helped.
Re: Memory efficiency, anonymous vs named's vs local subroutines by Anonymous Monk on Jul 18, 2015 at 00:45 UTC
But why do you want nested subs in the first place? You're not making closures, right? One common way to get "private" helper subs in your module is:

```perl
sub something_useful {
    my $x = _helper(shift);
    return $x;
}

sub _helper {
    ...
}
```

One major advantage is in testing: the helper subs can be called directly.
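A runnable sketch of that convention and of the testing advantage. The package name `My::Text` and the subs `shout` and `_normalize` are hypothetical, chosen only to give the pattern a concrete body:

```perl
use strict;
use warnings;

package My::Text;    # hypothetical module name, for illustration only

# Public entry point: delegates the clean-up to the helper.
sub shout {
    my ($text) = @_;
    return _normalize($text) . '!';
}

# "Private" by convention only: the leading underscore is a hint to
# callers; Perl itself does not enforce it.
sub _normalize {
    my ($text) = @_;
    $text =~ s/^\s+|\s+$//g;    # strip leading and trailing whitespace
    return uc $text;
}

package main;

my $public = My::Text::shout('  hello ');        # goes through the helper
my $direct = My::Text::_normalize('  abc  ');    # helper exercised on its own

print "$public $direct\n";    # HELLO! ABC
```

Because `_normalize` is an ordinary package sub, a test script can call it by its full name without going through `shout`.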
Re^2: Memory efficiency, anonymous vs named's vs local subroutines by thanos1983 (Parson) on Jul 18, 2015 at 18:56 UTC
Hello Anonymous,

Thank you for your time and effort in reading and replying to my question. No, I am not planning to use nested subs. I guess a solution like the one you proposed is the most efficient and correct one to follow.