Ancient.Wizard has asked for the wisdom of the Perl Monks concerning the following question:

I am attempting to understand why Perl may be incapable of *always* printing the file and line number, as in the following example. In this simple example, and in every Perl .pm or .pl I write, it seems to work as expected, burping a warning when the condition is detected.

Use of each() on hash after insertion without resetting hash iterator results in undefined behavior, Perl interpreter: 0xed8cd8 at ./each-example.pl line 21.

However, an application I'm coding at work has no usage of each(), yet while it runs I randomly receive the message above *BUT* without a filename or line number. Figuring Perl won't burp the message unless it's real, I've combed through the source and found no usage of each(), which is as expected, as I'm not a fond user of it. What I get looks like this:

Use of each() on hash after insertion without resetting hash iterator results in undefined behavior, Perl interpreter: 0x<hex-blah>.

A little background: The application runs on Linux, specifically RH6.7, and uses the Perl RH provides: perl-5.10.1 with various patches RH added via RPM. This version of Perl does not complain about this issue. I've been forced to try a newer Perl because Perl 5.10.1 contains bugs that cause it to receive a SIGSEGV (11); given enough time it will corrupt its heap or something. I attempted to look under Perl's hood by using RH's debug info packages etc. and examining core files. It was not useful, and the location it pooped out was always random.

I won't bother making a full listing, but I've built the latest of every 5.<even>.<latest> release from 5.14.4 and run regression tests against the application. All of these versions, I believe, complain about the usage of each(). I wanted to use 5.24.1, thinking I could get support for any issues found; however, it has a serious memory leak in it. BTW 5.25.<latest> does not have the large memory leak. Well, I'm getting off topic. Versions prior to 5.24 seem to be memory-leak free.

I guess the real question is how to determine where this each() warning is coming from? Without a file and line number I'm suspecting this each() is embedded into some XS code that is also compiled Perl. Yes, that's rather a huge guess. I'm assuming that if the each() was in my code, which is all pure Perl, OR in any of the CORE/CPAN modules being used, the warning would include the filename and the line number, since that code is loaded and compiled at run time. Is this possible, and how can I discover where this is happening? It seems I might need to start digging under the hood again and have a Perl with debugging enabled. That sounds real ugly.

The following is not useful except for causing Perl to burp a lot:

    #!/usr/bin/env perl
    use strict;
    use warnings;

    EXAMPLE:
    {
        my $happy_camper = {
            Rambler => 'The Yellow Rose of Texas',
            Wagon   => 'There may be flies on them there guys but there are no flies on us',
            Caskit  => 'The Rain in Spain goes mainly down the drain',
            Skate   => 'The Quick Brown Fox Jumped Over The Lazy Dogs Back',
            Lemon   => 'Love the smell of napalm in the morning'
        };
        my $newkeys = 'KEYME000';

        foreach ( 1 .. 900000 )
        {
            while ( my ( $_key , $_val ) = each % $happy_camper )
            {
                if ( int(rand 300000) == 1 )
                {
                    # printf STDERR "# Adding new item\n";
                    $happy_camper->{ ++$newkeys } = 'Shame on you!'; #( keys % $happy_camper )[ int(rand scalar keys % $happy_camper ) ];
                }
            }
            # if (( int rand 2000 ) == 0 )
            # {
            #     foreach my $_key ( sort keys % $happy_camper )
            #     {
            #         printf "# %-12s ->> %s\n", $_key, $happy_camper->{ $_key };
            #     }
            # }
        }
    }
    exit 0;
    ## END

Replies are listed 'Best First'.
Re: To each() their own
by choroba (Cardinal) on Apr 30, 2017 at 15:49 UTC
    Your speculation that the warning comes from XS code sounds plausible.

    I'd use the common debugging technique: reduce the code until you don't get the warning anymore. The last part removed is probably responsible for it, so return it back, and try to reduce the code elsewhere. In the end, you should have just several lines of code with a small number of dependencies. Then come back here and ask again, or answer your own question.

    > however it has a serious memory leak in it

    Sounds interesting. Can you give more details? How do you detect memory leaks?

    ($q=q:Sq=~/;[c](.)(.)/;chr(-||-|5+lengthSq)`"S|oS2"`map{chr |+ord }map{substrSq`S_+|`|}3E|-|`7**2-3:)=~y+S|`+$1,++print+eval$q,q,a,
      The memory leak is already well known to the Perl developers. Apparently the RE compiler, and perhaps something in Encode (which may just be the same RE issue), eats memory like a hungry monster. The tickets on the subject all provide small code samples that trigger the leak. My application will crap out in an hour or so. I'm on Linux, and the OS kills the application once it stresses the system for memory. At this time the RSS should be less than 500MB. The sample memory leak example looks something like this. You won't be able to run it because a helper module I'm using here is unavailable to you.

          $ perl -MSimple::HostStat -MEncode -e 'my $m=Simple::HostStat->new; while (1) { encode("ascii", substr("test",1)); if ( int(rand(350000)) == 1 ){printf "# %s\n", $_ for ( $m->toConsole ) }}'

          # However I think the original (something like this) and a command like top will also do fine
          $ perl -MEncode -e 'while (1) { encode("ascii", substr("test",1)) }'

        Thanks. The link to the ticket would've been enough.

        ($q=q:Sq=~/;[c](.)(.)/;chr(-||-|5+lengthSq)`"S|oS2"`map{chr |+ord }map{substrSq`S_+|`|}3E|-|`7**2-3:)=~y+S|`+$1,++print+eval$q,q,a,
Re: To each() their own
by shmem (Chancellor) on Apr 30, 2017 at 15:46 UTC

    update: lousy reading by me, the following tells you what you already know. Skip to next "update:"

    each makes use of the iterator attached to a hash, so it knows which tuple to fetch next. The order of the hash keys is not really random, but randomized linear. While iterating over the hash with each, you change the hash by inserting new tuples. Where does this tuple end up? Farther up or further down from the current position? Should it be incorporated into the list you are iterating over, or excluded? Perl can't tell.
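
    To make the "iterator attached to a hash" concrete, here is a small sketch. The sub name first_key_twice is made up for illustration. It relies on the documented behaviour that calling keys on a hash resets that hash's iterator, so each starts over at the same entry as long as the hash has not been modified in between.

```perl
use strict;
use warnings;

# Illustrative helper (hypothetical name): returns the first key that each()
# yields, once from a freshly reset iterator and once more after the iterator
# has been advanced and then reset again with keys().
sub first_key_twice {
    my ($h) = @_;
    keys %$h;                    # reset the hash's single iterator
    my ($first) = each %$h;      # fetch the first entry, iterator advances
    my $second  = each %$h;      # advance once more (undef if only one entry)
    keys %$h;                    # reset again
    my ($again) = each %$h;      # starts over at the same entry
    keys %$h;                    # leave the iterator reset for the caller
    return ( $first, $again );
}

my %demo = map { $_ => 1 } 'a' .. 'j';
my ( $first, $again ) = first_key_twice( \%demo );
print "first=$first again=$again\n";
```

    Both values are the same key, whatever that key happens to be in a given run. There is only one iterator per hash, which is why anything else that touches the hash mid-loop matters.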

    So the rule is: don't change a set while iterating over it.
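
    One way to follow that rule while still using each is to collect the insertions and apply them only after the loop has finished. This is just a sketch; the helper name each_with_deferred_inserts and its callback convention (return a list of key/value pairs to add) are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: walk the hash with each(), let the callback request
# insertions, and apply them only once the iteration is complete.
sub each_with_deferred_inserts {
    my ( $hash, $callback ) = @_;
    my %pending;
    keys %$hash;    # start from a freshly reset iterator
    while ( my ( $key, $value ) = each %$hash ) {
        # The callback returns key/value pairs it wants added (or nothing).
        my %new = $callback->( $key, $value );
        @pending{ keys %new } = values %new;
    }
    # The loop is over, so modifying the hash is safe now.
    @{$hash}{ keys %pending } = values %pending;
    return scalar keys %pending;
}

my %h = ( a => 1, b => 2, c => 3 );
my $added = each_with_deferred_inserts( \%h,
    sub { my ( $k, $v ) = @_; return ( "new_$k" => $v * 10 ) } );
print "added $added keys\n";
```

    Because the hash is never modified while its iterator is active, this should not trip the warning.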

    The documentation states:

    If you add or delete a hash's elements while iterating over it, the effect on the iterator is unspecified; for example, entries may be skipped or duplicated--so don't do that. Exception: It is always safe to delete the item most recently returned by each, so the following code works properly:

    ...

    update: did you try adding use Carp; $SIG{__WARN__} = \&Carp::cluck to the code in question? That could give a clue.
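
    A minimal sketch of that suggestion, for warnings that do reach Perl's __WARN__ hook. The helper name install_warn_tracer and its optional sink argument are made up for illustration; Carp::longmess builds the same kind of stack trace that Carp::cluck prints.

```perl
use strict;
use warnings;
use Carp ();

# Hypothetical helper: route every warning through a handler that appends
# a stack trace. $sink is an optional code ref that receives the annotated
# text; by default it is printed to STDERR.
sub install_warn_tracer {
    my ($sink) = @_;
    $sink ||= sub { print STDERR @_ };
    $SIG{__WARN__} = sub {
        my ($message) = @_;
        chomp $message;
        # longmess() returns the message followed by the call stack.
        $sink->( Carp::longmess($message) );
    };
    return;
}
```

    Calling install_warn_tracer() early in the main program means every later warn, in any module, is reported with the chain of callers that led to it.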

    update: nope, Carp::cluck doesn't get triggered. - The warning is in hv.c in Perl_hv_iternext_flags and only gets triggered if perl was compiled with PERL_HASH_RANDOMIZE_KEYS, and it shows up in a statically compiled perl or in libperl.so. It would be very exotic for an XS to roll its own Perl_hv_iternext_flags; but to make sure, you could

    perl -pi.bak -e 's/without resetting hash iterator results/without resetting HASH iterator results/' libperl.so

    and see if the message sports uppercase HASH in both cases.

    If the each is called from an XS, and while being in XS context, there's no information available about the current file and line number, I'd consider that a bug in the internal error handler. It should report at least the shared library, and perhaps also go uplevel and determine the caller, and report that file and line number.

    perl -le'print map{pack c,($-++?1:13)+ord}split//,ESEL'
Re: To each() their own
by Marshall (Canon) on May 01, 2017 at 02:30 UTC
    Update: I guess I goofed on the problem. Ooops. Sorry. Since you don't have an each() loop in the Perl source code, I guess there must be something that is not in your Perl source code causing the error, which would, I guess, mean something in XS code? I am unable to imagine how this can happen otherwise.

    I took the pragmatic approach and re-coded the loop to use a foreach() loop over (keys %$happy_camper) instead of trying to get the combined key,val each() iterator to work. I haven't had trouble adding keys with a foreach loop like below. Tested on Perl 5.20 MSWin32.

    #!/usr/bin/env perl
    use strict;
    use warnings;
    use Data::Dump qw(pp);

    EXAMPLE:
    {
        my $happy_camper = {
            Rambler => 'The Yellow Rose of Texas',
            Wagon   => 'There may be flies on them there guys but there are no flies on us',
            Caskit  => 'The Rain in Spain goes mainly down the drain',
            Skate   => 'The Quick Brown Fox Jumped Over The Lazy Dogs Back',
            Lemon   => 'Love the smell of napalm in the morning'
        };
        my $newkeys = 'KEYME000';

        foreach ( 1 .. 900000 )
        {
            foreach my $_key (keys %$happy_camper)
            {
                if ( int(rand 300000) == 1 )
                {
                    $happy_camper->{ ++$newkeys } = 'Shame on you!';
                }
            }
        }
        pp $happy_camper;
    }
    exit 0;

      I came to the same conclusion. It seems reasonable that if the each() was inside any one of the many CORE/CPAN modules being used that are *.pm (Perl source), the message barked would have included the file and line number. Because this information is not included, it seems reasonable that the message comes from a context that cannot provide it. As XS lives in a space that could be viewed as the layer bridging Perl internals to the outside, it stands to reason that some parts of this code must deal with lists, arrays, hashes and other constructs, and so on occasion do something that smells like an each().

Re: To each() their own
by kcott (Archbishop) on May 01, 2017 at 06:53 UTC

    G'day Ancient.Wizard,

    Welcome to the Monastery.

    The error is easy enough to generate using the each function and a modification to the hash while iterating; doing so without each() is more difficult. I haven't encountered this issue previously; I don't know how to replicate it.

    "I guess the real question is how to determine where this each() warning is coming from?"

    You can turn that warning off, either in a lexically scoped block, or for a portion of sequential statements. perldiag shows that warning as "(S internal)". Keep in mind that turning off 'internal' warnings will get rid of not only your each() warnings but also all other 'internal' warnings.

    # each() (and other 'internal') warnings here
    {
        no warnings 'internal';
        # No each() (or other 'internal') warnings here
    }
    # each() (and other 'internal') warnings here

    no warnings 'internal';
    # No each() (or other 'internal') warnings here
    use warnings 'internal';
    # each() (and other 'internal') warnings here

    While I appreciate that your "application I'm coding at work" is likely to be confidential and not something you can post here; without seeing the code, and being able to replicate the issue, our responses will only be guesswork and conjecture. If you're able to reduce your problem to an SSCCE, the quality of answers may well improve.

    See also: warnings.

    — Ken

      Sharing it would be very difficult for other reasons as well, not least its sheer size: roughly 200k lines of source. None of the code I'm bringing to the table uses each(). The lame code example is only useful for showing that the message can be created in those Perls that detect and bark about the issue. It shows that when the condition is encountered in a file that is compiled at run time, the file name and line number are output in the bark. Sadly, the barking in my application shows no source for it. It's been suggested that I somehow cut back on code until I find the culprit. This does not sound fun. At this point it's only an annoyance; nothing is actually broken in my opinion.
        nothing is actually broken in my opinion.

        Famous last words. If all was sane, there would be no odd behaviour. I think that it is not necessary that the XS code itself uses each. It suffices that somewhere outside a hash reference is iterated, and XS code does something to this hashref passed into the XS which triggers hv_iternext_flags (defined in embed.h) or Perl_hv_iternext_flags perhaps via some macro.

        Let's see...

        qwurx [shmem] .../build/perl-5.25.10> grep hv_iternext_flags *.c *.h
        hv.c: while ((entry = hv_iternext_flags(ohv, 0))) {
        hv.c: while ((entry = hv_iternext_flags(ohv, 0))) {
        hv.c:=for apidoc hv_iternext_flags
        hv.c:Perl_hv_iternext_flags(pTHX_ HV *hv, I32 flags)
        hv.c: HE * const he = hv_iternext_flags(hv, 0);
        hv.c: while ((entry = hv_iternext_flags(hv, HV_ITERNEXT_WANTPLACEHOLDERS))) {
        mathoms.c: return hv_iternext_flags(hv, 0);
        mro_core.c: /* This is partly based on code in hv_iternext_flags. We are not call-
        perl.c: while ((entry = hv_iternext_flags(dups, 0))) {
        perlmini.c: while ((entry = hv_iternext_flags(dups, 0))) {
        regcomp.c: while ( (temphe = hv_iternext_flags(hv,0)) ) {
        regcomp.c: while ( (temphe = hv_iternext_flags(hv,0)) ) {
        embed.h:#define hv_iternext_flags(a,b) Perl_hv_iternext_flags(aTHX_ a,b)
        hv.h:/* Flags for hv_iternext_flags. */
        hv.h:#define hv_iternext(hv) hv_iternext_flags(hv, 0)
        proto.h:PERL_CALLCONV HE* Perl_hv_iternext_flags(pTHX_ HV *hv, I32 flags)

        So hv.c mathoms.c perl.c mro_core.c regcomp.c are possible candidates. Since you also speak of a memory leak loosely coupled with your issue caused by the RE compiler, I'd start looking for code that calls functions from regcomp.c - but that's all just guesswork.

        update:

        This basic XS (created with h2xs -A Foo)

        /* file Foo.xs */
        #define PERL_NO_GET_CONTEXT
        #include "EXTERN.h"
        #include "perl.h"
        #include "XSUB.h"

        #include "ppport.h"

        MODULE = Foo    PACKAGE = Foo

        SV *
        foo(hv)
            HV * hv
          CODE:
            hv_store(hv,"newkey", 6, newSVpv("foo", 3),0);

        which just adds/updates a key/value pair to a passed hash reference reports the place where Foo::foo($h) is called:

        use blib;
        use Foo;
        use Data::Dump qw(dd);
        $h = {foo => 1, bar => 2};
        while (($k,$v) = each %$h) {
            Foo::foo($h) if $k eq 'foo';
        }
        dd $h;
        __END__
        Use of each() on hash after insertion without resetting hash iterator results in undefined behavior, Perl interpreter: 0x224f010 at foo.pl line 6.
        { bar => 2, foo => 1, newkey => "foo" }

        Without knowing what perl XS modules you are using inside your application, there's no way to tell where the error would arise from.
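
        To get a first list of candidates, something like the following sketch might help. The sub name loaded_xs_modules is made up for illustration. It assumes a perl that loads extensions dynamically, where DynaLoader and XSLoader record each loaded XS module in @DynaLoader::dl_modules; statically linked extensions would not show up there.

```perl
use strict;
use warnings;
use DynaLoader ();

# Hypothetical helper: names of the XS modules that have been dynamically
# loaded so far, as recorded by DynaLoader/XSLoader.
sub loaded_xs_modules {
    my @modules = sort @DynaLoader::dl_modules;
    return @modules;
}

# Report at program exit, after everything the application needs is loaded.
END {
    print STDERR "XS module loaded: $_\n" for loaded_xs_modules();
}
```

        Dropped into the main script, this prints the XS modules that were actually pulled in during a run, which narrows down where to look.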

        perl -le'print map{pack c,($-++?1:13)+ord}split//,ESEL'
        "Sharing it would be very difficult for other reasons as well. Its sheer size, roughly 200k lines of source."

        That's perfectly valid. Few of us here would be interested in wading through even a few thousand lines of code; certainly not 200,000 lines. In any event, that would far exceed the amount the site would let you post: I think the node limit is 64k characters. However, if you did track it down to some manageable code fragment, and provided that as an SSCCE, you'd be likely to get some positive feedback.

        "It's been suggested that I somehow cut back on code until I find the culprit. This does not sound fun."

        My suggestion of turning off the warnings was intended as an alternative to actually removing chunks of code. The idea would be to turn off warnings in parts of the code: when the warnings disappeared, that would be code you could investigate further. Obviously, without any knowledge of your code, I've no idea how useful or practical that approach might be.

        — Ken

Re: To each() their own
by karlgoethebier (Abbot) on May 01, 2017 at 08:58 UTC

    Slightly recoded (omitting the hash ref) and with a dirty trick (copying the hash) it seems to work:

    #!/usr/bin/env perl

    use strict;
    use warnings;
    use Data::Dump;

    my %happy_camper = (
        Rambler => 'The Yellow Rose of Texas',
        Wagon   => 'There may be flies on them there guys but there are no flies on us',
        Caskit  => 'The Rain in Spain goes mainly down the drain',
        Skate   => 'The Quick Brown Fox Jumped Over The Lazy Dogs Back',
        Lemon   => 'Love the smell of napalm in the morning'
    );
    my $newkeys = 'KEYME000';

    my %copy = %happy_camper;

    foreach ( 1 .. 900000 ) {
        while ( my ( $k, $v ) = each %copy ) {
            if ( int( rand 300000 ) == 1 ) {
                $happy_camper{ ++$newkeys } = 'Shame on you!';
            }
        }
    }

    dd \%happy_camper;

    __END__

    karls-mac-mini:~ karl$ ./strange.pl
    {
      Caskit   => "The Rain in Spain goes mainly down the drain",
      KEYME001 => "Shame on you!",
      KEYME002 => "Shame on you!",
      KEYME003 => "Shame on you!",
      KEYME004 => "Shame on you!",
      KEYME005 => "Shame on you!",
      KEYME006 => "Shame on you!",
      KEYME007 => "Shame on you!",
      KEYME008 => "Shame on you!",
      KEYME009 => "Shame on you!",
      KEYME010 => "Shame on you!",
      Lemon    => "Love the smell of napalm in the morning",
      Rambler  => "The Yellow Rose of Texas",
      Skate    => "The Quick Brown Fox Jumped Over The Lazy Dogs Back",
      Wagon    => "There may be flies on them there guys but there are no flies on us",
    }

    Regards, Karl

    «The Crux of the Biscuit is the Apostrophe»

    Furthermore I consider that Donald Trump must be impeached as soon as possible

Re: To each() their own
by Ancient.Wizard (Novice) on May 02, 2017 at 19:49 UTC
    BTW: I tried using the $SIG{__WARN__} hook. It works perfectly for any warnings that come from real Perl code; however, this mysterious warning from an unknown source does not trigger the hook.