How is redefining a sub internally done?

LanX has asked for the wisdom of the Perl Monks concerning the following question:

That's a follow up to Refresh a Module

Consider the following code

 ~ $ perl -w  -e'sub g{1}; my $cr=\&g; eval q(sub g{2}); print g(); print $cr->();'
Subroutine g redefined at (eval 1) line 1.
21~ $

As you can see, the sub g is dynamically redefined and the calls work.
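The same experiment as a small script, as a sketch only: the variable names `$by_name` and `$by_ref` are mine, and the `redefine` warning is silenced so that only the result is printed. It shows that the call by name reaches the new body while the stored coderef still points at the old one.

```perl
use strict;
use warnings;

sub g { 1 }

# Take a reference to the original sub before it is redefined.
my $cr = \&g;

{
    # The string eval inherits this lexical scope, so the
    # "Subroutine g redefined" warning is suppressed here.
    no warnings 'redefine';
    eval q(sub g { 2 });
    die $@ if $@;
}

my $by_name = g();       # resolved through the symbol table at run time
my $by_ref  = $cr->();   # still the CV captured before the eval

print "by name: $by_name, via stored coderef: $by_ref\n";
# prints "by name: 2, via stored coderef: 1"
```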

Looking in the optree reveals that the compiler did some optimisation when compiling the code, and stored the reference \&main::g inside the call. (See OP "e" )

~ $ perl -MO=Concise,-exec -w  -e'sub g{1}; my $cr=\&g; eval q(sub g{2}); print g(); print $cr->();'
1  <0> enter v
2  <;> nextstate(main 3 -e:1) v:{
3  <#> gv[IV \&main::g] s
4  <1> rv2cv[t3] lKRM/AMPER,TARG
5  <1> srefgen sK/1
6  <0> padsv[$cr:3,4] sRM*/LVINTRO
7  <2> sassign vKS/2
8  <;> nextstate(main 4 -e:1) v:{
9  <$> const[PV "sub g{2}"] s
a  <1> entereval[t256] vK/1
b  <;> nextstate(main 4 -e:1) v:{
c  <0> pushmark s
d  <0> pushmark s
e  <#> gv[IV \&main::g] s
f  <1> entersub lKS
g  <@> print vK
h  <;> nextstate(main 4 -e:1) v:{
i  <0> pushmark s
j  <0> pushmark s
k  <0> padsv[$cr:3,4] s
l  <1> entersub[t4] lKS/TARG
m  <@> print vK
n  <@> leave[1 ref] vKP/REFC
-e syntax OK

This reference must obviously point to the first g(), because the second wasn't known at compile time. But calling the first g() can't be right.

Now, how was Perl able to fix this optimisation?

I did a Devel::Peek of both subs, and couldn't see any data hinting to "forwarding".

Looking up the symbol table at run time wouldn't make sense, because that would make the optimisation useless.

Am I misreading the op-tree and there is no optimisation?

I found an SO thread where Tom Christiansen explicitly states this optimisation is happening.

I'm curious, how is it implemented? What am I missing?

For completeness, here is the output from Devel::Peek

~ $ perl -w  -e'sub g{1}; my $cr=\&g; eval q(sub g{2}); print g(); print $cr->();use Devel::Peek; Dump($cr);Dump(\&g);'
Subroutine g redefined at (eval 2) line 1.
SV = IV(0xb40000715c6827e8) at 0xb40000715c6827f8
  REFCNT = 1
  FLAGS = (ROK)
  RV = 0xb40000715c682810
  SV = PVCV(0xb40000715c6812d8) at 0xb40000715c682810
    REFCNT = 1
    FLAGS = (DYNFILE)
    COMP_STASH = 0xb40000715c60c6c0     "main"
    START = 0xb40000715c699698 ===> 1
    ROOT = 0xb40000715c699620
    GVGV::GV = 0xb40000715c682858       "main" :: "g"
    FILE = "-e"
    DEPTH = 0
    FLAGS = 0x1000
    OUTSIDE_SEQ = 1
    PADLIST = 0xb40000715c63b480
    PADNAME = 0xb40000715c693bd0(0xb40000715c605510) PAD = 0xb40000715c682828(0xb40000715c63b4a0)
    OUTSIDE = 0xb40000715c60c9d8 (MAIN)
SV = IV(0xb40000715c682038) at 0xb40000715c682048
  REFCNT = 1
  FLAGS = (TEMP,ROK)
  RV = 0xb40000715c682078
  SV = PVCV(0xb40000715c6e2548) at 0xb40000715c682078
    REFCNT = 2
    FLAGS = (DYNFILE)
    COMP_STASH = 0xb40000715c60c6c0     "main"
    START = 0xb40000715c699b98 ===> 2
    ROOT = 0xb40000715c699b20
    GVGV::GV = 0xb40000715c682858       "main" :: "g"
    FILE = "(eval 2)"
    DEPTH = 0
    FLAGS = 0x1000
    OUTSIDE_SEQ = 213
    PADLIST = 0xb40000715c63b4e0
    PADNAME = 0xb40000715c693c90(0xb40000715c605670) PAD = 0xb40000715c682120(0xb40000715c720260)
    OUTSIDE = 0xb40000715c60c6a8 (UNIQUE)
21~ $

Cheers Rolf
_{(addicted to the Perl Programming Language :)

see Wikisyntax for the Monastery}

Update

FWIW: The redefine mechanism is related to the symbol table, because if I delete the symbol first, the redefinition no longer reaches the already compiled call: the old g() is called both by name and via the stored coderef.

~ $ perl -w  -e'sub g{1}; my $cr=\&g; delete $::{g}; eval q(sub g{2}); print g(); print $cr->();'
11~ $

Which makes sense, because a symbol can't be redefined if it doesn't exist.

Update

I just discovered that a subroutine which isn't redefined carries a flag NAMED , like

FLAGS = (DYNFILE,NAMED)

so probably this is checked, and if the flag is missing, the symbol table entry becomes the fall back.


Replies are listed 'Best First'.
Re: How is redefining a sub internally done? by ikegami (Patriarch) on Sep 27, 2024 at 14:05 UTC
It says "`gv[IV \&main::g] s`". "gv" means a glob; "\&" means a code ref. Odd. Let's look at `pp_gv`. According to a comment and an assert in `pp_gv`, the attached SV "might be a real GV or might be an RV to a CV". So it could be `*main::g` or `\&main::g`. Does B::Concise distinguish between these, or does it always print `\&main::g`? That's as far as I got.
Re^2: How is redefining a sub internally done? by dave_the_m (Monsignor) on Sep 30, 2024 at 21:57 UTC
Normally, sub calls are compiled into a typeglob retrieval followed by a retrieval of the CV slot within that typeglob. The GV OP has a pointer to the GV associated with the name hard-baked in at compile time. The OP_GV pushes the GV on the stack, then the OP_ENTERSUB pops the GV off the stack, accesses its CV slot, and calls the associated CV.

However, as an optimisation, GVs which only have their CV slot used are instead created as an RV to a CV. So for example, for `package FOO; sub f { ... } f()`, at compile time the value of the hash entry $FOO::{f} is created as an RV to the CV associated with f, rather than as a full typeglob. When the GV op is compiled, it points to that RV. When the GV op is called, it pushes that RV onto the stack. When the ENTERSUB is called, it pops that value, notices that it's an RV rather than a GV, and extracts the CV as the thing referenced.

When things get more complex, the 'RV to a CV' SV is upgraded to a full GV with the CV in its code slot.

An 'RV to CV' is smaller and quicker than a full GV (a GV points to a GP which has a CV slot - so two allocations, two dereferences).

Dave.
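A rough way to observe the 'RV to a CV' stash entry from Perl code, as a sketch only. The helper `stash_entry_kind` is a made-up name, not part of any module, and the result depends on the perl version: as far as I know the optimisation only exists in reasonably recent perls, and older ones always store a full typeglob. The stash is inspected as a plain hash, because mentioning `*FOO::f` directly in the source would itself force the upgrade.

```perl
use strict;
use warnings;

package FOO;
sub f { 42 }

package main;

# Hypothetical helper: report what kind of value sits in a stash slot.
sub stash_entry_kind {
    my ($pkg, $name) = @_;
    no strict 'refs';
    my $stash = \%{"${pkg}::"};                        # the package's symbol table hash
    return 'missing' unless exists $stash->{$name};
    return 'glob'    if ref(\$stash->{$name}) eq 'GLOB';   # full typeglob
    return 'coderef' if ref($stash->{$name})  eq 'CODE';   # RV to a CV
    return 'other';
}

my $before = stash_entry_kind('FOO', 'f');

# Asking for the glob by name at run time needs a real typeglob,
# so the entry is upgraded if it was only an RV to a CV.
{
    no strict 'refs';
    my $glob = \*{'FOO::f'};
}

my $after = stash_entry_kind('FOO', 'f');

print "before glob access: $before\n";
print "after glob access:  $after\n";
```

On a perl that has the optimisation I would expect `coderef` followed by `glob`; on an older perl both lines should say `glob`.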
Re^2: How is redefining a sub internally done? by LanX (Saint) on Sep 27, 2024 at 15:22 UTC
> Does B::Concise distinguish between these

Yes, for instance if a sub is not predefined in the STASH you'll see the `main::g` or rather `g` form. Also in (for me) random cases where it's predefined.

Cheers Rolf
_{(addicted to the Perl Programming Language :)

see Wikisyntax for the Monastery}

Update

~/perl $ perl -MO=Concise,-exec -e'g();'
1  <0> enter v
2  <;> nextstate(main 1 -e:1) v:{
3  <0> pushmark s
4  <#> gv[*g] s/EARLYCV
5  <1> entersub[t2] vKS/TARG
6  <@> leave[1 ref] vKP/REFC
-e syntax OK
~/perl $

`s/EARLYCV` means the sub was unknown at compile time.
Re^3: How is redefining a sub internally done? by ikegami (Patriarch) on Sep 29, 2024 at 16:31 UTC
Inherited subs are cached into the namespace that inherits them.

# Causes something akin to
# `*Foo::method = \&Base::method;`
$foo->method();

But it uses a counter system to invalidate the cache. The package's counter is incremented when the package is changed, making it so the counter in the cached entry no longer matches the package's, invalidating the cached entry. Perhaps that same mechanism is used here.
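The visible effect of that cache invalidation can be sketched like this. The packages `Base` and `Foo` follow the comment above, the rest of the names are mine. The script only shows that a redefined inherited method is picked up on the next call; it does not inspect the counters themselves.

```perl
use strict;
use warnings;

package Base;
sub method { 'old' }

package Foo;
our @ISA = ('Base');

package main;

my $foo = bless {}, 'Foo';

# First call: Foo has no method() of its own, so it is found in Base
# (and, per the description above, cached for Foo).
my $first = $foo->method;

{
    # Replace Base::method at run time via a glob assignment.
    no strict 'refs';
    no warnings 'redefine';
    *{'Base::method'} = sub { 'new' };
}

# Second call: a stale cache entry would still return 'old'.
my $second = $foo->method;

print "first call: $first, after redefinition: $second\n";
# prints "first call: old, after redefinition: new"
```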
Re^4: How is redefining a sub internally done? by LanX (Saint) on Sep 30, 2024 at 13:12 UTC
Re: How is redefining a sub internally done? by Danny (Chaplain) on Sep 26, 2024 at 23:15 UTC
> Which makes sense, because a symbol can't be redefined if it doesn't exist.

But if you throw in an `undef &g;` before the delete, it prints 22. Why is that?
Re^2: How is redefining a sub internally done? by LanX (Saint) on Sep 26, 2024 at 23:32 UTC
I can only speculate, like in my last update, that as soon as a sub has "weird" flags, the symbol table acts as fallback. But I couldn't find any information in the docs, so we need to wait for someone who knows the internals.

Obviously, deleting a symbol is, as already demonstrated, an efficient way to sabotage the whole redefine mechanism.

Update

Well, I should have added that `undef &g` changes the sub considerably. Use Devel::Peek with a coderef to see for yourself.

Cheers Rolf
_{(addicted to the Perl Programming Language :)

see Wikisyntax for the Monastery}