
I wanted to match strings containing many fixed-point, low-precision numerals with "text"/whitespace in between. Because of the nature of the input source, the numerals have to be compared in a numerically tolerant/fuzzy way. I ended up with a solution that involves programmatically generated regular expressions with one (?(?{...})(*F)) per number, i.e. there are many of these in a single regex.
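
For illustration, here is a minimal sketch of the kind of generated pattern I mean. It is not the real generator: the helper name fuzzy_num, the 0.05 tolerance, the \D+ separator and the sample numbers are all made up for this example.

```perl
use strict;
use warnings;
use re 'eval';    # code blocks arrive via interpolated strings

# Hypothetical helper: returns a pattern fragment (a plain string) that
# captures a fixed-point numeral and forces a failure, via (*F), when the
# captured value is further than $tol away from $want.
sub fuzzy_num {
    my ( $want, $tol ) = @_;
    return sprintf '(\d+(?:\.\d+)?)(?(?{ abs( $^N - %s ) > %s })(*F))',
        $want, $tol;
}

# One fragment per expected numeral, with non-digit text in between.
my @want = ( 1.50, 22.30, 7.00 );
my $re   = join '\D+', map { fuzzy_num( $_, 0.05 ) } @want;

# The first string is within tolerance everywhere; the second is not.
for my $s ( 'x 1.52 y 22.28 z 7.01', 'x 1.52 y 22.90 z 7.01' ) {
    printf "%-24s %s\n", $s, $s =~ /^\D*$re\D*$/ ? 'match' : 'no match';
}
```

Each numeral contributes one capture group and one conditional code block, so a sub-string with thousands of numerals yields a regex with thousands of (?{...}) blocks.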

With some input, Perl started to segfault or die with "Out of memory!" on me. Upon investigation, this input turned out to contain a degenerate case: a single sub-string with many thousands of numerals, for which a regex was created. This sub-string probably should have been excluded from processing in the first place, but I was curious about what was going on. Here is an SSCCE, reduced to a quite useless no-op:

use strict;
use warnings;
use re 'eval';

use IO::Handle;                 # for autoflush on perls older than 5.14
STDOUT-> autoflush( 1 );

use constant LEN => 10_000;

my $n = 0;
my $s = '1' x LEN;
my $r = '(1)(?{ 

#    $^N ? 1 : 1;        # (*) does use of $^N exacerbate?

    print "*" and select            # to better watch
        undef, undef, undef, 0.25   # with htop
            unless ++ $n % 1000     #
})' x LEN;

print "\nMatch\n" if $s =~ /^$r$/;
print "1) Hit Enter"; <>;

( $s = '' ) =~ //;              # attempt to reset everything
                                # about $s and the regex engine (?)
                                # note: an empty pattern reuses the last
                                # successfully matched regex, if any

$s = ( int rand 10 ) x 1e9;     # allocate another Gb
print "2) Hit Enter"; <>;

I'm testing with 64-bit Perl on Linux with 8 GB of RAM. With LEN => 10_000, Perl eats ~1 GB of memory and apparently sits on it, i.e. doesn't free it when it needs more. With 20_000, it's already ~4.5 GB, plus another 1 GB upon scalar creation (so memory is not freed even after the regex engine was reset?). With 20_000 and the (*) line uncommented, Perl segfaults after 13 stars, and it doesn't appear to have consumed all available RAM. With 30_000 and the (*) line commented out again, I get "panic: memory wrap at (eval 6) line 155489. Attempt to free unreferenced scalar: SV 0x56258a3af7b0, Perl interpreter: 0x562584d66260 at (eval 6) line 155489." after 22 stars.

Arguably, a regex with 10_000 (?{}) blocks is stupid, but I wonder whether this indicates a slow leak with a "normal" number of such blocks in a long-running process.

In reply to Memory use/leak with large number of (?{}) patterns in regex by vr
