Re: Memory efficiency, anonymous vs named's vs local subroutines
by GrandFather (Saint) on Jul 18, 2015 at 00:42 UTC
The memory efficiency you should be concerned about is "how can I best remember or figure out how this code works". When computers, even the one in your pocket that you call a phone, have multiple gigabytes of memory, shaving a few bytes here and there is almost always a waste of time.
Consider instead how easy it is to understand the intent of the code, how robust the code is against programming errors, and how easy it will be to maintain the code in the future. In all of those cases a named function is a big win because the name can (and should) convey intent.
I'm unsure what you mean by a "local subroutine" as all named subroutines in Perl are scoped within the current package. Nesting subroutines makes no difference to their availability to other code.
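A minimal sketch of that point, with hypothetical sub names outer and inner that are not from the original question: a named sub declared inside another named sub is still installed in the package at compile time, so other code can call it directly.

```perl
use strict;
use warnings;

sub outer {
    # A "nested" named sub is still compiled into the current package
    # (main here); it is not private to outer().
    sub inner { return "inner called" }
    return inner();
}

# inner() is callable even though outer() has never been run.
print inner(), "\n";
print "main::inner is ", (defined &main::inner ? "visible" : "hidden"), "\n";
```

Nesting changes where the code is written, not where the name lives.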
Premature optimization is the root of all job security
Hello GrandFather,
First of all, thank you for your time and effort in reading and replying to my question. You are right, I am always getting into these stupid details. I guess I was more curious to find out which kind of function to use when. Who cares about a few bytes? The important part, as you said, is first of all to be able to understand what the code does; everything else comes after that.
Thanks, I guess I will go with named routines, simple as that.
I call it a local subroutine because the author of the tutorial Creating Nested Functions called it that.
Seeking for Perl wisdom...on the process of learning...not there...yet!
Re: Memory efficiency, anonymous vs named's vs local subroutines
by AnomalousMonk (Archbishop) on Jul 18, 2015 at 02:01 UTC
Further to GrandFather's reply: Not only is it difficult to figure out how such code works, even then it probably doesn't work the way you (thanos1983) think it does.
Consider the "Variable "$x" will not stay shared at ..." warning(s) you get when you run such code (you are running your code with warnings enabled, right?):
Variable "%s" will not stay shared
(W closure) An inner (nested) *named* subroutine is referencing a
lexical variable defined in an outer named subroutine.
When the inner subroutine is called, it will see the value of the
outer subroutine's variable as it was before and during the *first*
call to the outer subroutine; in this case, after the first call to
the outer subroutine is complete, the inner and outer subroutines
will no longer share a common value for the variable. In other
words, the variable will no longer be shared.
This problem can usually be solved by making the inner subroutine
anonymous, using the "sub {}" syntax. When inner anonymous subs that
reference variables in outer subroutines are created, they are
automatically rebound to the current values of such variables.
(See perldiag.) Anonymous subroutines allow proper closures to be formed and maintained.
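A minimal sketch of the behaviour perldiag describes, using hypothetical sub names: the named inner sub stays bound to the $x of the first call, while the anonymous inner sub is rebound on every call.

```perl
use strict;
use warnings;

sub outer_named {
    my $x = shift;
    # Named inner sub: perl warns 'Variable "$x" will not stay shared'.
    sub inner_named { return $x }
    return inner_named();
}

sub outer_anon {
    my $x = shift;
    # Anonymous inner sub: a fresh closure over the current $x on each call.
    my $inner = sub { return $x };
    return $inner->();
}

print "named: ", outer_named(1), " ", outer_named(2), "\n";  # named: 1 1
print "anon:  ", outer_anon(1),  " ", outer_anon(2),  "\n";  # anon:  1 2
```

The warning is printed at compile time, before either line of output appears.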
WRT local-ized subroutine names (or anything else, for that matter), remember that a localized thingy is visible within the scope of all subroutines subsequently invoked within the localizing scope. I.e., its localization is dynamic. A lexical variable (e.g., one holding a code reference) is visible only within the scope of the block in which it is defined, i.e., its scope is, well, lexical. To me, lexical scope is much to be preferred over dynamic scope unless you have a very clear (and well documented) reason for choosing the latter.
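A small sketch of the dynamic-versus-lexical distinction, with hypothetical sub names: the local-ized glob is seen by every sub called while it is in effect, whereas the lexical code reference never leaves its block.

```perl
use strict;
use warnings;

sub greet      { return "original" }
sub call_greet { return greet() }     # looks greet() up through the symbol table

sub with_local {
    no warnings 'redefine';
    # Dynamic scope: the replacement is seen by every sub called from here
    # until with_local() returns.
    local *greet = sub { return "localized" };
    return call_greet();
}

sub with_lexical {
    # Lexical scope: $greet exists only inside this block; callees never see it.
    my $greet = sub { return "lexical" };
    return join ", ", $greet->(), call_greet();
}

print with_local(),   "\n";   # localized
print with_lexical(), "\n";   # lexical, original
print call_greet(),   "\n";   # original (the local has been undone)
```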
As to the difference in memory usage between local-ized and lexical subroutines: I must admit I've done no research or experimentation, but I doubt there is any significant difference.
Give a man a fish: <%-(-(-(-<
Hello AnomalousMonk,
Thank you for your time and effort. You are right: no matter which tutorial or article I read, they all say the same thing. The chance of messing up your code when nesting functions is really high. You assume that your code does one thing, but in reality it does ten others in the background.
I will stick to simple named functions, called from the main script, for every simple implementation.
Seeking for Perl wisdom...on the process of learning...not there...yet!
I will stick to simple named functions...
Simplicity is always a good goal and a good yardstick for judging your code. But I don't want to discourage you from using anonymous lexical subroutines: they work, and they work the way you think they work! In fact, Dominus wrote a whole book (freely available online, and highly recommended!) that's essentially just a zillion ways to use anonymous lexical subroutines.
Give a man a fish: <%-(-(-(-<
I will stick to simple named functions
For usual plain-vanilla subroutines, e.g. just pieces of code that you want to use several times and/or call from different places in your program, and that will usually take zero, one or more input parameter(s) and return one or more return value(s), named subroutines are certainly simpler, easier to understand and to maintain. And that's what I would be using in such cases.
There are, however, some slightly more advanced techniques that use anonymous subs and give you a lot of power, just as references to anonymous arrays and hashes allow you to build nested data structures (arrays of arrays, arrays of hashes, etc.) that would be very tedious or in some cases next to impossible to build with named arrays or named hashes.
I would also recommend to every reader the Higher Order Perl book, by Mark Jason Dominus, mentioned above by AnomalousMonk; in my view it is the best CS book I've read in the last ten years. It is available online on the author's site. Just one word of warning, though: it is not for pure beginners. You need at least an intermediate level of Perl to really take advantage of it (although the first 2 or 3 chapters are probably accessible to "advanced beginners").
Among the things that anonymous functions make possible or, at least, usually make much easier or more generic:
- Callback functions (subs passed as a parameter to another sub)
- Dispatch tables (arrays or hashes of code refs to define program behavior depending on a certain parameter)
- Function generators and function factories (functions that return other functions depending on the input parameters)
- Iterators (functions returning values usually one at a time, on demand)
- Closures (functions that keep part of their run-time environment alive)
- More generally, higher-order functions and abstract generic functions, and quite a few other things that I can't explain in just one line here.
In very strict terms, probably none of the techniques above absolutely requires anonymous subs, but only using anonymous subs will really unleash their full power.
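To make a few of the items above concrete, here is a minimal sketch. All names (%op_for, make_multiplier, make_counter, apply_to_all) are hypothetical and chosen only for illustration.

```perl
use strict;
use warnings;

# Dispatch table: a hash of code refs selects behaviour by key.
my %op_for = (
    add => sub { $_[0] + $_[1] },
    mul => sub { $_[0] * $_[1] },
);

# Function factory: returns a closure that remembers $factor.
sub make_multiplier {
    my $factor = shift;
    return sub { return $_[0] * $factor };
}

# Iterator: each call returns the next value, or undef when exhausted.
sub make_counter {
    my ($next, $last) = @_;
    return sub { return $next <= $last ? $next++ : undef };
}

# Callback: apply_to_all() is told what to do by its first argument.
sub apply_to_all {
    my ($callback, @items) = @_;
    return map { $callback->($_) } @items;
}

print "add: ", $op_for{add}->(2, 3), "\n";                            # add: 5
print "x3:  ", join(",", apply_to_all(make_multiplier(3), 1 .. 3)), "\n";  # x3:  3,6,9

my $it = make_counter(1, 3);
while (defined(my $n = $it->())) {
    print "count: $n\n";                                              # 1, then 2, then 3
}
```

Each of these could be written with named subs, but the anonymous versions keep the behaviour next to the data that selects it.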
Re: Memory efficiency, anonymous vs named's vs local subroutines (anon < named )
by BrowserUk (Patriarch) on Jul 18, 2015 at 11:19 UTC
Anonymous subroutines use substantially less memory than named subroutines.
For the following simple subs (where nnnnnn is a number in the range 0 .. 1e6):
sub Fnnnnnn {
    my( $a, $b, $c ) = @_;
    my $x = nnnnnn;
    return $a * $b - $c;
}
$f[ nnnnnn ] = sub {
    my( $a, $b, $c ) = @_;
    my $x = nnnnnn;
    return $a * $b - $c;
};
The anonymous sub uses ~3k per sub whereas the named version uses ~4.5k per sub (64-bit perl 5.18).
Update: specious conclusion removed. (The eval was failing silently and assigning undef to the array.)
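For anyone repeating this kind of measurement, a minimal sketch of guarding against that failure mode: string eval returns undef and sets $@ when the code does not compile, so checking the result keeps undef out of the array. The helper name compile_sub is hypothetical.

```perl
use strict;
use warnings;

# Compile a sub from a string and fail loudly instead of storing undef.
sub compile_sub {
    my ($source) = @_;
    my $sub = eval $source;
    die "eval failed: $@" unless ref($sub) eq 'CODE';
    return $sub;
}

my @f;
$f[$_] = compile_sub("sub { my \$x = $_; return \$_[0] * \$x }") for 0 .. 3;
print "f3(2) = ", $f[3]->(2), "\n";   # f3(2) = 6

# A syntax error makes string eval return undef and set $@;
# compile_sub() turns that into an exception we can catch.
my $broken = eval { compile_sub("sub { my \$x = ; }") };
print "broken source: ", (defined $broken ? "compiled" : "rejected"), "\n";
```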
Hello BrowserUk,
Thank you for your time and effort in reading and replying to my question. Nice, thanks for the information I was looking for.
But I guess I will stay with simple named functions in my scripts from now on, unless it is necessary to work differently (a rare case).
Seeking for Perl wisdom...on the process of learning...not there...yet!
I’m sorry, BrowserUk, but I really don’t quite understand the last sentence in that post. There is no eval statement in what remains, and therefore I really don’t know if you meant to retract the assertion that you made in the first sentence of your post. I also don’t quite see the point of your code sample, since it does not appear (as it stands now) to be either complete or runnable.
Ergo, “huh?”
I implicitly assume, always, that “you know whereof you speak,” and especially that you ordinarily work in resource-intensive applications, but I do not grok what you are trying to say here. Maybe it’s the consequence of previous-versions and edits that I never saw?
Please clarify, preferably in a reply to, vs. a rewrite of, the above post. Thank you.
Re: Memory efficiency, anonymous vs named's vs local subroutines
by Laurent_R (Canon) on Jul 18, 2015 at 09:51 UTC
With today's computers, I do not see any reason to worry about the few bytes, or even few kilobytes, that you might save by creating your subs one way rather than another. Your main aim should be to make your code clear to read, easy to understand and easy to maintain; don't worry about a few kilobytes of memory footprint.
If it came to processing a huge file (say 10 GB, or perhaps even 1 GB), then it would really make sense to think, before choosing the algorithm, about whether it will store the whole file in memory or gently iterate over lines or chunks of the file. That's a totally different context, however.
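As a minimal sketch of the "iterate over lines" approach (the sub name count_matches and the temporary demo file are assumptions made for illustration), only one line is held in memory at a time:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Count lines matching a pattern while holding only one line in memory.
sub count_matches {
    my ($path, $pattern) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    my $count = 0;
    while (my $line = <$fh>) {
        $count++ if $line =~ $pattern;
    }
    close $fh;
    return $count;
}

# Small demo file; a real input could be many gigabytes.
my ($tmp, $tmp_path) = tempfile(UNLINK => 1);
print {$tmp} "alpha\n\nbeta\n";
close $tmp;

print count_matches($tmp_path, qr/\S/), " non-blank lines\n";   # 2 non-blank lines
```

By contrast, reading with `my @lines = <$fh>;` would pull the whole file into memory before any processing starts.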
But when choosing between the various ways of implementing your subs (named subs, code refs or anonymous subs, closures, class methods, etc.), don't worry about memory usage; it is really irrelevant in most cases.
One slight warning, though: watch out for possible memory leaks and similar problems in your implementation (circular references, perhaps deep recursion, etc.), especially if your program is going to do a lot of work or run for a fairly long time. And even then, a memory leak is not necessarily dramatic if you can ascertain that it will amount to only a few kilobytes and are sure that the program will always complete quickly enough that the leak never becomes a problem.
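A minimal sketch of the circular-reference case, assuming a hypothetical Node class and using weaken from the core Scalar::Util module (the package block syntax needs perl 5.14 or later). A strong cycle is never freed by reference counting; weakening the back-reference lets both objects go.

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken);

my $destroyed = 0;

package Node {
    sub new     { return bless {}, shift }
    sub DESTROY { $destroyed++ }
}

{
    my ($parent, $child) = (Node->new, Node->new);
    $parent->{child} = $child;
    $child->{parent} = $parent;      # cycle: both refcounts stay above zero
}
print "after strong cycle:   $destroyed destroyed\n";   # 0

{
    my ($parent, $child) = (Node->new, Node->new);
    $parent->{child} = $child;
    $child->{parent} = $parent;
    weaken($child->{parent});        # the back-reference no longer counts
}
print "after weakened cycle: $destroyed destroyed\n";   # 2
```

The two objects from the first block are only reclaimed when the program exits, which is exactly the kind of leak that matters in a long-running process.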
Hello Laurent_R,
Thank you for your time and effort in reading and replying to my question. You are right, maintenance comes at the top of the list. A few bytes or even kilobytes will not make any difference with the resources we have today. Memory leakage is something that I should be aware of and careful about.
My way forward will be to use named functions, which are easy to understand and to implement: small pieces of code that build up into something bigger.
Seeking for Perl wisdom...on the process of learning...not there...yet!
Re: Memory efficiency, anonymous vs named's vs local subroutines
by shmem (Chancellor) on Jul 18, 2015 at 17:22 UTC
Anonymous subs are cheaper in terms of usage (invoking a named sub includes a round-trip to the symbol table) and memory.
Named subs:
use strict; use warnings;
my $begin;
BEGIN { chop( $begin = `ps -o vsz= $$` )}
eval "sub F_$_ {
    my( \$a, \$b, \$c ) = \@_;
    my \$x = $_;
    return \$a * \$b - (\$x + \$c);
}" for 1..1e4;
END {
    chop( my $end = `ps -o vsz= $$` );
    print "$end - $begin = ",$end - $begin, "\n";
}
__END__
60924 - 22228 = 38696
Anonymous subs:
use strict; use warnings;
my $begin;
BEGIN { chop( $begin = `ps -o vsz= $$` )}
my @ary;
$ary[$_] = eval "sub {
    my( \$a, \$b, \$c ) = \@_;
    my \$x = $_;
    return \$a * \$b - (\$x + \$c);
}" for 1..1e4;
END {
    chop( my $end = `ps -o vsz= $$` );
    print "$end - $begin = ",$end - $begin, "\n";
}
__END__
53356 - 22228 = 31128
That makes ca. 3.1 kB per anonymous sub and 3.9 kB per named sub. These numbers differ slightly from BrowserUk's above, but so do the architecture and version of perl:
perl 5, version 14, subversion 2 (v5.14.2) built for x86_64-linux-gnu-thread-multi
update: The ca. 750 extra bytes for named subs may well be just the cost of allocating a GLOB for the subroutine in the symbol table, which comes with SCALAR, HASH, ARRAY, CODE and FILEHANDLE slots. Currently I don't recall whether they are autovivified as needed or allocated in one go.
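One way to peek at those slots from Perl itself is the *foo{THING} syntax, which perlref documents as returning undef for a slot that has not been used yet (SCALAR being the documented exception). A minimal sketch with a hypothetical sub foo; it shows which slots are populated, not how much memory the glob itself takes:

```perl
use strict;
use warnings;

sub foo { return 1 }

# *foo{THING} returns a reference to that slot, or undef if the slot
# has never been used (the SCALAR slot is the documented exception).
my %slot = (
    CODE  => *foo{CODE},
    ARRAY => *foo{ARRAY},
    HASH  => *foo{HASH},
    IO    => *foo{IO},
);

printf "%-5s %s\n", $_, defined $slot{$_} ? "populated" : "empty"
    for qw(CODE ARRAY HASH IO);
```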
perl -le'print map{pack c,($-++?1:13)+ord}split//,ESEL'
sub F123456{ my( $a, $b, $c )= @_; my $x = 123456; return $a * $b - $c; };;
*{"X$_"} = \&F123456 for 0 .. 1e6;;
results in an increase in the process size of almost exactly 300MB, thus 300 bytes per STASH entry.
Some of the extra space may be down to unreused (but reusable) space allocated and freed during the doubling of the STASH hash as it grows. I can't think of any way to isolate that.
use Devel::Peek;
sub foo {
    print "in foo\n";
}
*bar = \&foo;
print "----\nfoo:\n----\n";
Dump(*foo);
print "----\nbar\n----\n";
Dump(*bar);
__END__
----
foo:
----
SV = PVGV(0x1765790) at 0x1731550
REFCNT = 3
FLAGS = (RMG,MULTI,IN_PAD)
MAGIC = 0x1732e80
MG_VIRTUAL = &PL_vtbl_backref
MG_TYPE = PERL_MAGIC_backref(<)
MG_OBJ = 0x1725238
NAME = "foo"
NAMELEN = 3
GvSTASH = 0x1705800 "main"
GP = 0x1732870
SV = 0x0
REFCNT = 1
IO = 0x0
FORM = 0x0
AV = 0x0
HV = 0x0
CV = 0x1725238
CVGEN = 0x0
LINE = 4
FILE = "f.pl"
FLAGS = 0xa
EGV = 0x1731550 "foo"
----
bar
----
SV = PVGV(0x17657c0) at 0x1725208
REFCNT = 3
FLAGS = (MULTI,ASSUMECV,IN_PAD)
NAME = "bar"
NAMELEN = 3
GvSTASH = 0x1705800 "main"
GP = 0x1759a40
SV = 0x0
REFCNT = 1
IO = 0x0
FORM = 0x0
AV = 0x0
HV = 0x0
CV = 0x1725238
CVGEN = 0x0
LINE = 5
FILE = "f.pl"
FLAGS = 0xe
EGV = 0x1725208 "bar"
I don't think the sizes reported by Devel::Size are accurate since the stash entry created by coderef assignment is reported as being bigger. Go figure.
use Devel::Size qw(size total_size);
sub foo {
    print "in foo\n";
}
*bar = \&foo;   # coderef assignment, as in the Devel::Peek example above
print "size foo: ",size(*foo),"\n";
print "total_size foo: ",total_size(*foo),"\n";
print "size bar: ",size(*bar),"\n";
print "total_size bar: ",total_size(*bar),"\n";
print "difference: ",size(*foo) - size(*bar),"\n";
__END__
size foo: 7034
total_size foo: 7034
size bar: 7186
total_size bar: 7186
difference: -152
perl -le'print map{pack c,($-++?1:13)+ord}split//,ESEL'
Re: Memory efficiency, anonymous vs named's vs local subroutines
by Anonymous Monk on Jul 18, 2015 at 13:10 UTC
sub foo {
    local *bar = sub {
        print "other bar\n";
    };
    print "foo calling quz\n";
    quz();
}
sub bar {
    print "bar\n";
}
sub quz {
    print "quz calling bar\n";
    bar();
}
foo();
quz();
__END__
Subroutine main::bar redefined at - line 4.
foo calling quz
quz calling bar
other bar
quz calling bar
bar
Despite the warning about the subroutine being redefined, this can cause some spooky action-at-a-distance effects: How do you know which bar() you'll be calling (as demonstrated by sub quz)?
Re: Memory efficiency, anonymous vs named's vs local subroutines
by Anonymous Monk on Jul 18, 2015 at 00:45 UTC
But why do you want nested subs in the first place? You're not making closures, right? One common way to get "private" helper subs in your module is:
sub something_useful {
    my $x = _helper(shift);
    return $x;
}
sub _helper {
    ...
}
One major advantage is in testing: the helper subs can be called directly.
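A minimal sketch of what such a direct test might look like with the core Test::More module. The body of _helper is a hypothetical stand-in, since the original leaves it elided.

```perl
use strict;
use warnings;
use Test::More tests => 2;

sub something_useful {
    my $x = _helper(shift);
    return $x;
}

# Hypothetical helper body; the leading underscore marks it "private"
# by convention only, so a test can still call it directly.
sub _helper {
    my $n = shift;
    return $n * 2;
}

is _helper(21),          42, '_helper can be tested on its own';
is something_useful(21), 42, 'public sub goes through _helper';
```

With a nested or lexically scoped helper, the first of those two tests would not be possible from outside the enclosing sub.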
Hello Anonymous,
Thank you for your time and effort, reading and replying to my question.
No, I am not planning to use nested subs. I guess a solution like the one you proposed is the most efficient and correct one to follow.
Seeking for Perl wisdom...on the process of learning...not there...yet!